Guys, I would like your opinion.
Background: A few of us hackers are re-working the code in htm-community/nupic.cpp. The plan is to separate the C++ library from the Python code so that it can be the basis of C++-only implementations and provide a cleaner foundation for other language interfaces. The third-party utility libraries are being replaced with C++17 std:: facilities, leaving no (or almost no) external dependencies. The directive is to retain the documented API and not change any HTM algorithm logic.
The Python interface would be implemented in a separate parallel repository using pybind rather than SWIG, supporting Python 2.7 and Python 3.6+. Additional programming languages (such as CSharp) could be added as parallel repositories as needed without changing the core C++ library.
A few years ago capnproto was introduced into nupic as a way to do faster serialization. It worked, but it was very invasive and difficult to maintain. In an effort to simplify all of the modules, we are removing capnproto in favor of binary stream serialization (writing directly to a stream). This means the saved files will not be cross-platform compatible, but they should write and read as fast as capnproto without the complications.
My current plan is to have the Serializable class as a base class for all classes that use the streaming serialization. It contains the following virtual functions:
- void saveToFile(path) – contains default code to create an ostream and calls save()
- void loadFromFile(path) – contains default code to create an istream and calls load()
- void save(ostream) – subclass must implement
- void load(istream) – subclass must implement
- void write(ostream) – calls save(ostream) for backward compatibility with capnproto
- void read(istream) – calls load(istream) for backward compatibility with capnproto
The problem is with the Network Class. The API defines:
- void save(path) – creates the Bundle directory, writes the Network and Link classes to one file, and writes each RegionImpl to its own file in the Bundle directory.
- Network(path) – a constructor that loads from the Bundle directory and finds all of the parts.
It also has a write(ostream) and read(istream) for capnproto, which stream everything to one file (or memory stream). The problem is that this file can get very large.
What I was thinking was to keep all four functions and perhaps, to be consistent with Serializable, add save(ostream) and load(istream) that do the same thing as write(ostream) and read(istream). Or should we break the API in this case and provide only the streaming save(ostream) and load(istream) for the Network object, with overloads for the << and >> operators?
What are your thoughts?
P.S. Anyone interested in helping with this project is welcome to participate.