Proper way to build and install NuPIC from source?


#1

I have been trying to streamline my build process for NuPIC Core and NuPIC, and to eliminate the need for elevated permissions where possible. Being a newbie to both Python and NuPIC, I figured I should get some input from more experienced folks.

My process so far is:

pip install --user --upgrade setuptools wheel
pip install --user pycapnp==0.6.3

git clone -b 1.0.6 https://github.com/numenta/nupic.core.git
cd nupic.core
NUPIC_CORE="$(pwd)"

pip install --user -r bindings/py/requirements.txt

mkdir -p $NUPIC_CORE/build/scripts
cd $NUPIC_CORE/build/scripts
cmake $NUPIC_CORE -DCMAKE_BUILD_TYPE=Release -DPY_EXTENSIONS_DIR=$NUPIC_CORE/bindings/py/src/nupic/bindings

make -j4
sudo make install

cd $NUPIC_CORE
sudo python setup.py develop

sudo -H pip install nupic==1.0.5

I’m hoping to use the --user flag instead of sudo on the pip install nupic step. This will require installing the bindings without sudo. However, without sudo, that step fails with the following error:

Setup SWIG Python module
running develop
error: can't create or remove files in install directory

The following error occurred while trying to add or remove files in the
installation directory:

    [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/test-easy-install-29777.write-test'

The installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

    /usr/local/lib/python2.7/dist-packages/

Perhaps your account does not have write access to this directory?  If the
installation directory is a system-owned directory, you may need to sign in
as the administrator or "root" account.  If you do not have administrative
access to this machine, you may wish to choose a different installation
directory, preferably one that is listed in your PYTHONPATH environment
variable.

For information on other options, you may wish to consult the
documentation at:

  https://setuptools.readthedocs.io/en/latest/easy_install.html

Please make the appropriate changes for your system and try again.

This gives me a lot of suggestions, but I’m not sure which is the correct route to take in this case. I figured I would see if anyone with more experience could direct me on how this is properly done.


#2

Hi @Paul_Lamb,

I was able to build nupic.core from source and thought I’d share this, as it might help. Basically, I just followed the instructions at https://github.com/numenta/nupic.core. I assume you were referring to the build from source with incremental updates. To be more specific, these are the steps:

cd myworkspace
virtualenv myenv2.7 -p python2.7
source myenv2.7/bin/activate

git clone https://github.com/numenta/nupic.core
cd nupic.core
export NUPIC_CORE=$PWD
pip install -r bindings/py/requirements.txt

mkdir -p $NUPIC_CORE/build/scripts
cd $NUPIC_CORE/build/scripts
cmake $NUPIC_CORE -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../release -DPY_EXTENSIONS_DIR=$NUPIC_CORE/bindings/py/src/nupic/bindings

make -j3
make install

cd $NUPIC_CORE/build/release/bin
./cpp_region_test
./unit_tests

cd $NUPIC_CORE
ARCHFLAGS="-arch x86_64" python setup.py develop

# To test the bindings I did an import

$ python
Python 2.7.10 (default, Jul 30 2016, 18:31:42)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nupic.bindings

These steps were run on OSX, which is why I had to set ARCHFLAGS. By default, unless it isn’t possible, I use a virtual environment to contain all Python packages in one place. You can also tell the virtual environment which Python interpreter to use (e.g. myenv2.7/bin/python). This makes things more predictable, as some environments have several Python binaries installed.

In my case I do not have to use the pip --user parameter, as I already know where my packages will be installed because I am using a virtual environment. I also do not need sudo when building/installing, because cmake is already set to an install directory that does not need elevated permissions: the -DCMAKE_INSTALL_PREFIX=../release directory.
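As a quick sanity check before building (a generic sketch, nothing NuPIC-specific), you can ask the interpreter itself whether it is running inside a virtual environment:

```python
import sys

def in_virtualenv():
    # virtualenv records the original interpreter prefix in
    # sys.real_prefix; the stdlib venv module uses sys.base_prefix.
    # If either differs from sys.prefix, we are inside an env.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix

print("inside virtualenv:", in_virtualenv())
```

If this prints False after you thought you activated the env, the build would fall back to the system site-packages, which is exactly when the permission errors above show up.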


#3

Is virtualenv a Python-specific thing? I assume when using this, all packages are specific to that virtual environment (meaning whatever Python program is being built, all of its dependencies would be installed into the environment). This sounds like the idea behind Docker. Is this strategy typically used by Python developers, or is the --user flag more common?


#4

Is virtualenv a Python-specific thing?

Yes. In fact, it is the default and most widely recommended virtual environment tool.

I assume when using this, all packages are specific to that virtual environment (meaning whatever Python program is being built, all of its dependencies would be installed into the environment).

Correct. However, the virtualenv tool is not limited to the packages/dependencies contained in its installation directory. You can configure it to use system packages when necessary (the --system-site-packages flag). For example, if you have a virtualenv A but need a Python system package in /usr/local/lib/python2 for some reason, you can tell the virtualenv to use those system-installed packages. There are cases where system-installed packages need to be included, but this is not common.

This sounds like the idea behind Docker. Is this strategy typically used by Python developers, or is the --user flag more common?

It is similar to the idea behind Docker in terms of isolation and the benefits that isolation pattern brings. In a nutshell, virtualenv is a tool that isolates or contains a set of Python packages for an application in a directory. Again, there are exceptions, for example using a system package outside the contained directory. One important parameter of virtualenv is the Python interpreter (-p), which lets you specify exactly which interpreter your application will use. For example, running virtualenv myenv -p python2.7 will create a directory myenv containing all the base packages and the python2.7 interpreter. Subsequent installs will use myenv, so everything stays contained in myenv. In most cases this directory is self-contained; you can often even copy the directory somewhere else and everything will still work.
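A small way to see the isolation in action (generic Python, the "myenv" name is only an example): the active environment determines which interpreter runs and where its packages live.

```python
import sys

# The interpreter actually running this script; inside "myenv"
# this would point at .../myenv/bin/python rather than the
# system interpreter.
print("interpreter:", sys.executable)

# The root directory the interpreter installs into; inside an
# env this is the env directory itself.
print("prefix:     ", sys.prefix)
```

Running this once from the system Python and once after `source myenv/bin/activate` shows both values switch to the env's own paths, which is what makes the setup predictable.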

I’ve been using Python in different fields of computing for ages now, and in all that time my observation is that developers rarely use the --user parameter.
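For comparison, here is a small sketch showing where a plain pip install and a pip install --user would land for the running interpreter (the exact paths will differ per system):

```python
import site
import sysconfig

# Where `pip install` (no flags) puts packages for this
# interpreter; inside a virtualenv this is the env's own
# site-packages directory, so no sudo is needed.
env_site = sysconfig.get_paths()["purelib"]

# Where `pip install --user` puts packages instead: a per-user
# directory (e.g. under ~/.local on Linux) that also needs no sudo.
user_site = site.getusersitepackages()

print("default install dir:", env_site)
print("--user install dir: ", user_site)
```

With a virtualenv active, the first path points into the env, which is why the --user flag becomes unnecessary.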

I would highly recommend using virtualenv and making it a habit. It is very simple to use, it helps contain code (which brings a lot of benefits in app development), and most importantly it is well supported in the Python community.


#5

Thanks for the advice.

One more question: do you have any experience with Python wheel packaging? (Any advice if you do?)
I’m planning to do some tutorials on it this week.


#6

No worries.

In my experience building Python apps and scripts, most of the time I did not need a wheel, egg, or any other distribution format, because the apps I was building did not need to be distributed that way. When I wear my devops hat, these distribution formats are usually unnecessary for internal apps deployed in the cloud. A simple requirements.txt that lists all the required Python packages is usually enough, even more so with Docker: everything is contained in an image, and app distribution happens at the container level. However, if you need to distribute your package on PyPI so people can install your code with fewer system dependencies, then I’d recommend using a wheel. Please note, though, that Python is not very restrictive about most things (indentation excluded).
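For reference, a requirements.txt is just a plain list of package specifiers, one per line. The pins below are illustrative, reusing versions mentioned earlier in this thread:

```text
pycapnp==0.6.3
nupic==1.0.5
```

Installing everything from it is then a single `pip install -r requirements.txt`, run inside a virtualenv or with --user as you prefer.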

Some reading that might be helpful:

https://packaging.python.org/discussions/wheel-vs-egg/
https://packaging.python.org/
https://pip.readthedocs.io/en/1.1/requirements.html

The Zen of Python is also a helpful guide for daily Python coding.

Hope this helps.


#7

It definitely seems (at least from my current biased perspective) like Docker would be the obvious way to distribute something like NuPIC, which has a number of specific version requirements and a native component. It seems worth exploring the Python way of package distribution, though.


#8

My two cents:

The greatness of Docker is that it avoids the “well, it works on my machine” disease. I ran into a problem just this evening where a wheel file had a few lines written in the wrong syntax style, which messed up an install halfway through.

So until Python really settles on a single format, Docker is the way to go.