This meeting was called to discuss moving forward with an extraction of the “core” NuPIC API C++ codebase from the current
nupic repository, which was the #1 request of the NuPIC community when polled.
Goals of the Meeting
- Define technical tasks required for defining and extracting core components within NuPIC
- Concrete, tangible tasks for the advancement of the API and the extraction of nupic-core
- Subutai Ahmad
- Scott Purdy
- Austin Marshall
- Matt Taylor
nupic-core: refers to the C++ codebase within NuPIC, which will be extracted into its own repository and contain its own API
nupic: the current NuPIC codebase as it is today, containing all C++ and Python code.
- Extraction of C++ Core into
- Definition of
- Renaming terms
- How to handle encoders during extraction
- Standardized serialization
CLA as a technology is applicable to many industries and problem domains. The goal of nupic-core is to support as large a set of environments, languages, and platforms as possible. As such, nupic-core should contain only C++ code. It will contain reference implementations of the core algorithms within NuPIC. It should have as few external dependencies as possible (apr and boost are two that we rely on today). There will be a high bar for code changes within this repository, as it will be the central engine of NuPIC, and any change (no matter how insignificant) could have an impact on the performance of the CLA.
nupic-core will be designed so that bindings for other languages can be created easily. nupic-core itself, though, will not contain these bindings.
C++ Core Extraction
C++ Core currently contains:
- support routines (sparse matrix libs, OS-independent code (such as a timer class), etc.)
- reference spatial pooler implementation
- Network API (Link, Region, etc.). These are described here
Some things that currently exist in Python code should eventually be translated into nupic-core as pure C++:
- reference “temporal pooler” implementation
- core encoders: some of them are general enough to exist within nupic-core, and will likely be reused by many client projects.
- OPF and CLA Model (??), currently in python
Details of the Initial Extraction
- temporal pooler would still be in python
- build system would be more complex because of dependency issues
- how do we update CI and define a passing build?
- python codebase (nupic) will still need to compile nupic-core and run integration tests against it
- nupic-core needs its own set of tests around the official API
- Could be done in a few days
- nta dir has all C++ (the core)
- python bindings are in
- nupic-core needs its own set of tests
See NuPIC Core Extraction Plan for details.
Users can provide their own TP or SP in whatever language they wish.
There will be both a “high-level” and a “low-level” API for nupic-core. At the high level, users will be able to define their own Network with the Network API. But for users who don’t require hierarchy or don’t want to use the Network API, there will also be a lower-level API with direct access to CLA components like SP and TP.
This API should support the use cases of the entire community, including distributed computation and flow-based programming models. Anyone should be able to create their own nupic-core client in C++, or create language bindings for different runtimes.
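The relationship between the two API levels can be illustrated with a small, self-contained C++ sketch. Every name here (`ThresholdComponent`, `Region`, `ThresholdRegion`, `Network`, `run`) is hypothetical and chosen only for illustration; the component is a trivial stand-in, not an SP or TP, and none of this reflects the actual nupic-core API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Low-level style: a component that a client can call directly.
// (Hypothetical stand-in; it only thresholds its input.)
class ThresholdComponent {
public:
    explicit ThresholdComponent(double threshold) : threshold_(threshold) {}

    // Marks each input strictly above the threshold as active.
    std::vector<bool> compute(const std::vector<double>& input) const {
        std::vector<bool> active(input.size(), false);
        for (std::size_t i = 0; i < input.size(); ++i) {
            active[i] = input[i] > threshold_;
        }
        return active;
    }

private:
    double threshold_;
};

// High-level style: regions share one interface so a network can chain them.
class Region {
public:
    virtual ~Region() = default;
    virtual std::vector<double> compute(const std::vector<double>& input) = 0;
};

// A region that wraps the low-level component.
class ThresholdRegion : public Region {
public:
    explicit ThresholdRegion(double threshold) : component_(threshold) {}

    std::vector<double> compute(const std::vector<double>& input) override {
        const std::vector<bool> active = component_.compute(input);
        std::vector<double> out(active.size(), 0.0);
        for (std::size_t i = 0; i < active.size(); ++i) {
            out[i] = active[i] ? 1.0 : 0.0;
        }
        return out;
    }

private:
    ThresholdComponent component_;
};

// Runs its regions in insertion order, feeding each output to the next region.
class Network {
public:
    void addRegion(std::unique_ptr<Region> region) {
        assert(region);
        regions_.push_back(std::move(region));
    }

    std::vector<double> run(std::vector<double> data) {
        for (auto& region : regions_) {
            data = region->compute(data);
        }
        return data;
    }

private:
    std::vector<std::unique_ptr<Region>> regions_;
};
```

In this sketch a client that does not need hierarchy calls `ThresholdComponent::compute` directly, while a client that does builds a `Network` from regions wrapping the same component.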
- Want to enable users to write their own encoders in whatever language they want
- Regions are pluggable (can write in language of choice as long as bindings exist)
- Many encoders are very specialized and don’t belong in
- Regions work in a self-discover fashion
- Encoders must be hard-coded in an init file currently
- Scott wants to allow importing encoders to pass into nupic-core instead of a discovery system
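The idea of passing an encoder in explicitly, rather than relying on discovery or a hard-coded init file, might look roughly like the C++ sketch below. All names (`Encoder`, `SimpleScalarEncoder`, `EncoderClient`) are hypothetical, and the encoding scheme is a simplified stand-in rather than any encoder that exists in NuPIC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface; not an existing nupic-core type.
class Encoder {
public:
    virtual ~Encoder() = default;
    // Number of bits in the encoded output.
    virtual std::size_t width() const = 0;
    // Encodes a scalar as a binary vector of width() bits.
    virtual std::vector<bool> encode(double value) const = 0;
};

// Illustrative user-defined encoder: a contiguous run of active bits
// whose position tracks the (clipped) input value.
class SimpleScalarEncoder : public Encoder {
public:
    SimpleScalarEncoder(double minVal, double maxVal,
                        std::size_t width, std::size_t activeBits)
        : min_(minVal), max_(maxVal), width_(width), active_(activeBits) {
        assert(max_ > min_);
        assert(active_ > 0 && active_ <= width_);
    }

    std::size_t width() const override { return width_; }

    std::vector<bool> encode(double value) const override {
        const double clipped = std::min(std::max(value, min_), max_);
        const std::size_t lastStart = width_ - active_;
        const double fraction = (clipped - min_) / (max_ - min_);
        // Round to the nearest valid start position for the active run.
        const std::size_t start =
            static_cast<std::size_t>(fraction * lastStart + 0.5);
        std::vector<bool> bits(width_, false);
        for (std::size_t i = start; i < start + active_; ++i) {
            bits[i] = true;
        }
        return bits;
    }

private:
    double min_;
    double max_;
    std::size_t width_;
    std::size_t active_;
};

// Hypothetical client: the encoder is handed in by the caller,
// so no registry or discovery step is involved.
class EncoderClient {
public:
    explicit EncoderClient(std::shared_ptr<const Encoder> encoder)
        : encoder_(std::move(encoder)) {
        assert(encoder_);
    }

    std::vector<bool> encodeInput(double value) const {
        return encoder_->encode(value);
    }

private:
    std::shared_ptr<const Encoder> encoder_;
};
```

Because the client only depends on the abstract interface, a specialized encoder can live in the user's own project (or behind a language binding) without being added to nupic-core.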
Network can be saved (network.save()), which calls each Region’s save routine
- Each Region is responsible for its own serialization / deserialization
- Currently the PyRegion implementation pickles itself
Region implementation must define serialization
Longer term, we need a language-agnostic serialization format within nupic-core
As long as a Network is serialized using standard encoders included in nupic-core, it should be transportable. If a user provides custom encoders, it will not be transportable.
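The delegation described above, where saving a Network calls each Region's own save routine, can be sketched as follows. This is a standalone illustration: the class names, the `save(std::ostream&)` signature, and the line-oriented text format are all assumptions made for the example, not the existing Network API or a proposed serialization format.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical region interface: each region writes its own state.
class Region {
public:
    virtual ~Region() = default;
    virtual std::string name() const = 0;
    virtual void save(std::ostream& out) const = 0;
};

// Toy region whose entire state is a single counter.
class CounterRegion : public Region {
public:
    CounterRegion(std::string name, int count)
        : name_(std::move(name)), count_(count) {}

    std::string name() const override { return name_; }

    // Plain text keeps the output readable from any language.
    void save(std::ostream& out) const override {
        out << "counter " << count_ << "\n";
    }

private:
    std::string name_;
    int count_;
};

// Hypothetical network: saving writes a header, then delegates to each region.
class Network {
public:
    void addRegion(std::unique_ptr<Region> region) {
        assert(region);
        regions_.push_back(std::move(region));
    }

    void save(std::ostream& out) const {
        out << "regions " << regions_.size() << "\n";
        for (const auto& region : regions_) {
            out << "region " << region->name() << "\n";
            region->save(out);
        }
    }

private:
    std::vector<std::unique_ptr<Region>> regions_;
};
```

A format like this stays transportable only as long as every region's output can be read back by the receiving side, which mirrors the concern about custom encoders noted above.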
- Would make the build simpler
- Like a meta-make
- Creates different developer environments
- We want to consider switching the build system to be CMake-based
Before we solidify the nupic-core API, we should use this opportunity to make name changes.
- Temporal Pooler might be renamed
- Region in Network API might change