Proposal to introduce pybind11 to move toward Python 3 compatibility

pybind11 doesn’t claim to be a replacement for swig. It’s really just a nice way to avoid calling Python’s native C API directly, and as such it’s in “competition” with boost::python. I honestly like it better than boost::python, which has to stay compatible with old C++03 code.

Its main purpose is to create Python extensions in C++, but it also provides ways to embed Python calls inside C++ code.

Hope this explains it.

Here is another explanation:

So do you imagine keeping swig or replacing it with pybind11?

I have to understand a little more about nupic.core. But classes like PyArray, WrappedVector, and all the utility classes in PyHelpers.cpp look like they could be replaced with pybind11 equivalents.

If this is going to take a considerable amount of code changes (sounds like it will), you might think about creating a community fork so you don’t have to wait on nupic.core committers to merge your PRs. We are only planning on working on nupic.core to support ongoing research. It doesn’t change much, so you might be able to keep a community fork updated without too much trouble.

Here is a small example of how to exchange data between C++ and Python. Imagine the Matrix class as nupic’s SparseBinaryMatrix or similar.

Please note the following:

  • using the pybind11 header only cpp lib
  • it doesn’t matter which version of Python is being used (2.7 or 3.6). The Python C API is included and linked against when the module is built
  • no need to define numpy interface classes, like nupic’s NumpyMatrixT, etc.
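
For readers who want something concrete, here is a minimal sketch of the kind of binding described above. It is not the code from the fork: the Matrix class, the module name matrix_example and the WITH_PYBIND11 build flag are all made up for illustration. The class itself is plain C++11, and the binding part is only compiled when the flag is defined and the pybind11 and Python headers are available.

```cpp
// matrix_example.cpp: hypothetical stand-in for a nupic matrix class.
// Compiles as plain C++11; define WITH_PYBIND11 (and build against the
// pybind11 and Python headers) to also produce the Python module.
#include <cassert>
#include <cstddef>
#include <vector>

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0f) {}

    // Element access into row-major storage.
    float& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    // Sum of all elements.
    float sum() const {
        float total = 0.0f;
        for (float v : data_) total += v;
        return total;
    }

    float* data() { return data_.data(); }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<float> data_;
};

#ifdef WITH_PYBIND11
#include <cstring>
#include <stdexcept>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

PYBIND11_MODULE(matrix_example, m) {
    py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
        .def(py::init<std::size_t, std::size_t>())
        // Python -> C++: copy a 2-D float32 numpy array into a new Matrix.
        .def(py::init([](py::array_t<float, py::array::c_style | py::array::forcecast> a) {
            if (a.ndim() != 2) throw std::runtime_error("expected a 2-D array");
            Matrix mat(static_cast<std::size_t>(a.shape(0)),
                       static_cast<std::size_t>(a.shape(1)));
            std::memcpy(mat.data(), a.data(),
                        sizeof(float) * static_cast<std::size_t>(a.size()));
            return mat;
        }))
        .def("sum", &Matrix::sum)
        // C++ -> Python: expose the storage through the buffer protocol,
        // so numpy can view it without a hand-written interface class.
        .def_buffer([](Matrix& mat) -> py::buffer_info {
            return py::buffer_info(
                mat.data(), sizeof(float),
                py::format_descriptor<float>::format(),
                2,
                {mat.rows(), mat.cols()},
                {sizeof(float) * mat.cols(), sizeof(float)});
        });
}
#endif
```

On the Python side, something like numpy.array(m, copy=False) should then give a view onto the matrix data, assuming the module was built as above. The same source builds against Python 2.7 or 3.x headers, which is the point of the second bullet.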

Good news, I have finished migrating the to python 3. For that I defined a new python module using pybind11 (calling it nupic_ext for now) and also made some minor code changes to the python 2.7 code.

Here are a few takeaways:

  • the module can be compiled with python 2.7 and python 3.
  • no need for swig interface files like sparse_matrix.i, or for any of Python’s C API functions
  • no need for any code changes in nupic’s core source code.
  • still no need for any code from nupic’s py_support folder, like NumpyVectorT, etc.
  • using pybind11 allows the module definition to be included in a native cpp project, like this
    • that means, the module definition, module loading and debugging/testing can all be in the same project! It’s just c++11
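
To illustrate the last point, here is a rough sketch of what "module definition, loading and testing in one project" could look like with pybind11's embedding support. This is an assumption about the setup, not the actual project: the overlap function, the module name nupic_ext_demo and the WITH_PYBIND11 flag are hypothetical. Without the flag, the file is just a plain C++ function.

```cpp
// embed_example.cpp: hypothetical sketch of an embedded pybind11 module.
// Define WITH_PYBIND11 and link against Python to build the executable.
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the positions where both binary vectors are non-zero.
std::size_t overlap(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != 0 && b[i] != 0) ++count;
    }
    return count;
}

#ifdef WITH_PYBIND11
#include <pybind11/embed.h>
#include <pybind11/stl.h>

namespace py = pybind11;

// The module is defined inside the executable itself, so no separate
// extension has to be built before it can be imported.
PYBIND11_EMBEDDED_MODULE(nupic_ext_demo, m) {
    m.def("overlap", &overlap);
}

int main() {
    py::scoped_interpreter guard{};  // starts and stops the interpreter

    // Load the module and call it through Python, all from C++.
    py::module mod = py::module::import("nupic_ext_demo");
    std::size_t result = mod.attr("overlap")(std::vector<int>{1, 0, 1},
                                             std::vector<int>{1, 1, 1})
                             .cast<std::size_t>();
    return result == 2 ? 0 : 1;
}
#endif
```

Because everything lives in one executable, a breakpoint set in overlap is hit when the call comes in through Python, which makes debugging the bindings much less painful than attaching to a separate interpreter.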

The next steps are to migrate more tests and swig files and of course, to clean up the code…

Any comments are welcome.

@chhenning I’ll grab the latest from your fork. I had it working the other day… curious to see the updates.

@David_Keeney is still working on updating his fork as well - he’s trying to get a VS 2017 solution and update to Python 3. There’s obviously some overlap between you two.

He’s trying to get core updated to work with VS 2017 and then hopefully we can add some language bindings for C#, Java, Ruby, etc. This work sounds like it could simplify the setup process.

@heilerm Please note that I have created a new branch “pybind11”.

In general, as an MSVC user, would you like to have the binaries be part of the repository so you don’t need to build anything yourself?

Using CMake, I think I would prefer to exclude binaries from the repository and instead configure the package registry as a solution for Windows. Other platforms could just use straight-up find_package. I would imagine core supports this… I haven’t played with it yet. I’m teaching myself with a sandbox at the moment, piecing everything together, bit-by-bit. This is all new to me.

@heilerm There are already instructions for getting a Windows build with cmake.

It’s not quite working anymore, I believe, but it should be a starting point, no?

A manifest would have to be defined. Also, nupic patches a few dependencies, which requires patch.exe - not available on Windows without GNU… so, that’ll have to be addressed. Would not a newer version of that dependency suffice? A separate distribution?

If you mean the build binaries… I guess the common approach is to use the CMake generator.

Yeah, it’s not working for me. @David_Keeney seems to be making progress, though. I’m working on a sandbox to see if an alternative solution can be offered that’s a little more Windows friendly, but the Python dependencies kind of put a damper on that. Trying to find the time to work on it. Not sure that I’m doing much more than playing at this point.

I’m a noob with cmake. But I assume you will be using vcpkg which needs to be installed during the cmake process. Why not just install patch as well?

By the way I have not patched anything when I created my Visual Studio solution.

CMake wouldn’t install vcpkg, I’d imagine. Not sure that’s the concern of that utility. That would be a solution for Windows package management, instead of apt-get. The repository CMake config could be updated to work with CMake’s find_package using the package registry. I think nupic already does this - or can do this - on unix systems. The manifest would update the CMake config to look for a package of a certain version and issue a warning if it’s a different version. That’s how I understand it… at the moment, at least. Still working on it, and looking at what nupic has.

Basically… CMake is configured to look for packages at a couple built-in or configured locations. CMake could be configured to point where vcpkg drops packages… or apt-get, etc.

Oh, yes, I mixed up cmake with the CI tool AppVeyor. Instead of using your own sandbox, have you considered just creating an AppVeyor script? The integration with GitHub is very nice, and it seems easy to set up a Windows machine with the right tools before you start with cmake.

nupic already relies on some CI tools for other platforms like Linux and Mac.

I haven’t looked at it yet, but certainly will, although it looks like that’s more for CI than strictly development, which is where I am at this moment. I thought I’d have the sandbox on GitHub last weekend, but… priorities. I’ll get it up soon. Very simplistic at the moment.

Also, I would love to see if there’s a pure C++ solution that could be offered. I’m sure it’d break current dependent packages, but if offered as a community fork and offered a migration path… I’m sure there’s a well-supported library that offers what numpy has, only in C++ without the interop. Not sure why pycapnp is still required yet and not just capnp.

@chhenning This sounds interesting. There may be some overlap with what the three of us are doing.

What I was trying to do was get it all to build using MSVC 2017 with all of the latest dependencies which would then allow us to support Python3. It turns out that it is not that simple. I am making progress but it is slow slogging through the compile errors in the Swig generated C++ code. And I have not even gotten to the bindings yet. My thinking is that it is best to have a baseline that we know works on MSVC 2017 before we start modifying it to add the other language bindings. I don’t think I am very far away from getting this working but it is slow.

As I understand it, @heilerm is starting with the raw C++ code (without all of the Python stuff) and getting that to build. I know that the Python is embedded deeply into the C++ code. I think once I have my base working I will start working with this approach and see if we can condense the nupic.core code down to just C++. I would like to get rid of most of the dependencies. It is just way too complicated.

Your approach, I think, is to go directly to the existing C++ code with pybind11, bypassing the swig files. Unfortunately pybind11 cannot be used to interface with other languages like C#, which is our objective. But it is good to know there is a way to support Python without swig if we decide to use something else for C#.

Yes, it’s just a foobar library right now. Some simple CMake and project structure stuff. I’m figuring out how best to set up swig to generate wrappers from the interface bindings. Eventually, I’ll add all the necessary infrastructure used in nupic today: AppVeyor, etc. Then, plop in the C++ source and header files, hopefully extracting all the Python stuff in that process.