There is definitely something wrong with your NuPIC installation. I know you are on Ubuntu 15, and you said you were using gcc 4.7. Are you on a VM? At this point, I would suggest that you start fresh using the latest NuPIC code that works with gcc 5.2.
You are ahead of me, so I don’t have a valid wiki document on how to install on Ubuntu 15. You could use this one but ignore the stuff about switching GCC versions. If you get any errors during the installation at all, stop and paste them into pastebin or gist (not in this thread) and link to them here. I’ll try to help.
I don’t think this is necessary, because the current location of the nupic-default.xml file is the default location NuPIC will look for it. I remember updating this a couple years ago. Hopefully this is still the case?
The directory ‘/home/ubuntu/.cache/pip/http’ or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo’s -H flag.
kaustavsaha@ubuntu-precision-server:~/nupic/examples/opf/clients/hotgym/simple$ gdb -ex r --args python hotgym.py -u
GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10
This GDB was configured as “x86_64-linux-gnu”.
Type “show configuration” for configuration details.
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at: http://www.gnu.org/software/gdb/documentation/.
For help, type “help”.
Type “apropos word” to search for commands related to “word”…
Reading symbols from python…(no debugging symbols found)…done.
Starting program: /usr/bin/python hotgym.py -u
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
[New Thread 0x7ffff25cb700 (LWP 25223)]
[New Thread 0x7ffff1dca700 (LWP 25224)]
[New Thread 0x7fffed5c9700 (LWP 25225)]
Program received signal SIGSEGV, Segmentation fault.
__memset_sse2 () at …/sysdeps/x86_64/multiarch/…/memset.S:78
78 …/sysdeps/x86_64/multiarch/…/memset.S: No such file or directory.
@csbond007, you said the problem went away after you switched to AWS Ubuntu 14 with 4 GB.
What about Ubuntu 15? Can you confirm that the tests run fine on Ubuntu 15 (no memset SEGFAULT, etc.)?
I am experiencing the exact same SEGFAULT, with the same backtrace and the same python run_nupic_tests.py -u hang behavior, on Ubuntu 16.04 running in VirtualBox on OS X El Capitan. I configured the VM with plenty of RAM, anywhere between 4 GB and 12 GB, and that didn’t help.
The real root cause, as I discovered during my work on the manylinux nupic.bindings wheel, was the unanticipated confluence of runtime symbol preemption and c++ ABI incompatibility between nupic.bindings and pycapnp. I am going to document it here for “posterity”:
When you encountered the SEGFAULT, you were running on Ubuntu 16.04, whose system headers and libraries were created using the updated c++11 ABI.
capnproto sources in pycapnp are compiled upon installation using the Ubuntu 16.04 toolchain with the new c++11 ABI.
nupic.bindings was obtained either from Numenta’s S3 or built elsewhere on either Ubuntu 12.04 or 14.04, using an older c++ ABI that is incompatible with the one on Ubuntu 16.04 (where pycapnp was built).
nupic.bindings includes its own copy of capnproto c++ sources that is very close to the one included in pycapnp’s python extension. However, nupic.bindings’s capnproto c++ code was compiled as part of nupic.bindings using the older toolchain.
Neither pycapnp nor nupic.bindings was hiding its symbols, so all the symbols in the pycapnp and nupic.bindings extensions were public, including the similar capnproto symbols, subjecting both libraries to symbol preemption during runtime linking.
Notice in the stack trace quoted below that control from the destructor capnp::SchemaLoader::Impl::~Impl in the pycapnp extension (capnp.so) is inadvertently transferred to methods compiled into the nupic.bindings extension _math.so. Recall that pycapnp’s capnp.so and nupic.bindings’ _math.so were compiled on different platforms using incompatible c++ ABIs. This explains the SEGFAULT on Ubuntu 16.04 (the pycapnp and nupic.bindings extensions were compiled using INCOMPATIBLE c++ ABIs) and the absence of a SEGFAULT on Ubuntu 14.04 (the pycapnp and nupic.bindings extensions were compiled using compatible c++ ABIs).
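A toy model may help make the preemption described above concrete. The sketch below is not how the dynamic linker is implemented; it only mimics the "first definition in lookup order wins" rule, and it assumes that both extensions' symbols end up in the same lookup scope. The load order and the symbol name capnproto_helper are hypothetical, chosen purely for illustration.

```python
# Toy model of symbol lookup when every library exports everything.
# Assumption: a symbol reference binds to the first library in lookup
# order that exports a definition, regardless of which library asked.

def resolve(symbol, load_order, exports):
    """Return the name of the first library in load_order exporting symbol."""
    for lib in load_order:
        if symbol in exports[lib]:
            return lib
    return None


# Hypothetical load order: the nupic.bindings extension is loaded first.
load_order = ["_math.so", "capnp.so"]

# Both extensions carry their own capnproto build and export its symbols.
exports = {
    "_math.so": {"capnproto_helper"},
    "capnp.so": {"capnproto_helper"},
}

# A call made from capnp.so binds to the copy inside _math.so, which in
# the scenario above was compiled with a different c++ ABI.
print(resolve("capnproto_helper", load_order, exports))  # _math.so

# If _math.so hides its symbols, the call stays within capnp.so.
hidden = {"_math.so": set(), "capnp.so": {"capnproto_helper"}}
print(resolve("capnproto_helper", load_order, hidden))   # capnp.so
```

In this model, the crash scenario corresponds to the first lookup: code in one extension ends up running a same-named definition from the other.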
As part of the manylinux wheel effort, I have taken several steps to alleviate this issue in nupic.bindings build:
1. Hide all symbols in the nupic.bindings extension DSOs (except the python extension initialization function, of course) on *nix builds. This prevents unintended preemption of nupic.bindings symbols by other extensions and vice versa.
2. Exclude capnproto sources from the nupic.bindings extensions build and force preload of pycapnp, so that a single capnproto build is used. This solves the “hang” problem that both this thread’s author and I ran into. That problem resulted from an object created by pycapnp’s capnproto code (compiled on Ubuntu 16.04 with the newer c++ ABI) being manipulated by nupic.bindings’ capnproto code (compiled on a system with an older, incompatible c++ ABI). It’s easy to see how this would lead to problems.
3. Link nupic.bindings extension DSOs with static libstdc++. In combination with the first item above, this ensures that nupic.bindings extensions can catch exceptions whose c++ ABI changed in newer toolchains (e.g., std::ios_base::failure). See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66145 .
NOTE Step number 2 above is only a short-term solution as it relies on “promiscuous” behavior by a 3rd party python extension (pycapnp) that exposes all its symbols against python extension best practices. We will of course need to find a more robust solution for the long-term.
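For anyone who wants to check their own installation for this kind of clash, here is a minimal sketch that compares the dynamic symbols exported by two shared objects. It assumes GNU nm (binutils) is on the PATH; the helper names and the sample nm output (addresses and the init function names) are hypothetical, and the "cxx11" substring test is only a heuristic for spotting symbols mangled with the newer libstdc++ ABI.

```python
import subprocess


def parse_nm(nm_output):
    """Collect symbol names from `nm -D --defined-only` text output."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Expected shape of each line: <address> <type> <name>
        if len(parts) == 3:
            names.add(parts[2])
    return names


def exported_symbols(path):
    """Run GNU nm on a shared object and return its defined dynamic symbols."""
    out = subprocess.check_output(["nm", "-D", "--defined-only", path])
    return parse_nm(out.decode("utf-8", "replace"))


def clashes(symbols_a, symbols_b):
    """Symbols exported by both libraries, i.e. candidates for preemption."""
    return sorted(symbols_a & symbols_b)


def uses_cxx11_abi(symbols):
    """Heuristic: newer-ABI mangled names contain '__cxx11' or the 'cxx11' tag."""
    return any("cxx11" in name for name in symbols)


# Hypothetical nm output for the two extensions, for illustration only.
SAMPLE_MATH = """\
0000000000001000 T init_math
0000000000002000 W _ZN5capnp12SchemaLoaderD1Ev
"""
SAMPLE_CAPNP = """\
0000000000003000 T initcapnp
0000000000004000 W _ZN5capnp12SchemaLoaderD1Ev
0000000000005000 W _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev
"""

math_syms = parse_nm(SAMPLE_MATH)
capnp_syms = parse_nm(SAMPLE_CAPNP)
print(clashes(math_syms, capnp_syms))
print(uses_cxx11_abi(math_syms), uses_cxx11_abi(capnp_syms))
```

On a real system you would call exported_symbols() on the actual .so files instead of the sample strings. If the symbols of an extension are hidden as described in the first step, one would expect its list to shrink to little more than the module initialization function.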