Requirements from Serialization

breznak · January 19, 2018, 2:05am

Currently:

nupic.core and nupic provide Cap’n Proto serialization for most classes/regions. There is also YAML (where is it used?).

@chhenning has (temporarily) removed the feature in his fork, because it complicated the code.

So, do we need serialization? What for? Which framework to use?

Yes, but for me the current serialization is too complicated.
My use cases:

User runs an HTM model, serializes it, and continues the next day.
User trains an HTM model, then reloads it and re-runs it on multiple (test) cases.

EDIT: Poll

cross platform (Linux, Win, OSX)
cross languages (C++, Py,…)
little intrusiveness (capnp introduces a lot of code and compiler warnings)
low complexity (e.g. not compiling *.capnp files)
header only (cpp only)
c++11 or later
text (zipped)
binary
fast dump (aka memory mapped file; speed)
I don’t need Serialization
any working (basic) serialization is OK
commonly used (high user base)


chhenning · January 19, 2018, 2:06am

So no cross platform?

breznak · January 19, 2018, 2:17am

So no cross platform?

Cross-platform would be nice to have, but it is probably not necessary. That is for me; I don’t know what others do.
In a future-proof vision it’s a solid requirement, as you could train models on a cloud/farm and then distribute them.

Why does it matter? Do you have a framework that is platform dependent?

chhenning · January 19, 2018, 2:22am

What do you mean by framework?
The easiest and fastest would be a shared memory file. Basically just dump the current process memory and reload when necessary. Obviously that’s not cross platform.
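
A minimal sketch of the "just dump the memory" idea, reduced to a single struct. The names `ModelState`, `dumpRaw` and `reloadRaw` are hypothetical and not part of nupic.core. The file holds the struct's in-memory bytes as they are, so it depends on endianness, padding and type sizes, which is why such a dump is not portable across platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical model state; any trivially copyable struct would do.
struct ModelState {
  std::uint32_t iteration;
  float permanence[4];
};
static_assert(std::is_trivially_copyable<ModelState>::value,
              "raw dumps only make sense for trivially copyable types");

// Write the in-memory bytes of the struct to a file, unchanged.
void dumpRaw(const ModelState &state, const std::string &path) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  out.write(reinterpret_cast<const char *>(&state), sizeof(state));
  if (!out) throw std::runtime_error("dump failed");
}

// Read the bytes back into a struct with the same layout.
ModelState reloadRaw(const std::string &path) {
  ModelState state{};
  std::ifstream in(path, std::ios::binary);
  in.read(reinterpret_cast<char *>(&state), sizeof(state));
  if (!in) throw std::runtime_error("reload failed");
  assert(in.gcount() == static_cast<std::streamsize>(sizeof(state)));
  return state;
}
```

Anything that owns heap memory (vectors, strings, pointers) cannot be dumped this way, since only the pointer value would be written.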

breznak · January 19, 2018, 2:25am

Moving this here.

Also,

What do you mean by framework?

Cereal, capnp, yaml… serialization libraries.

chhenning · January 19, 2018, 2:40am

Should we vote on the following features?

cross platform
cross languages
intrusiveness (capnp introduces a lot of code and compiler warnings)
complexity (e.g. compiling *.capnp files)
header only (cpp only)
c++11 or later
text (zipped) or binary
fast dump (aka memory mapped file)

David_Keeney · January 19, 2018, 3:01am

YAML and Cap’n Proto are both implemented everywhere in the core, side by side. YAML is text based (think of a superset of JSON); Cap’n Proto is a binary serialization. Both work as far as I know, with YAML being a little slower. The Zip packaging is there as well, I think; I know we link with it.

Both are cross-platform and cross-language. Neither is header-only, but I don’t think it is worth the trouble to rip out both of what we have and implement a new one. Besides, a header-only implementation would only work in C++.

The idea of a memory-mapped file is one that I have been thinking about as well. All data is automatically saved, even if the program crashes. No startup time and no shutdown time. But execution might be slower because of the disk I/O performed in the background.

This would require writing our classes quite differently. For example, we could not allocate from the heap for each class (no use of new). We would have to allocate a fixed block of space and map it to disk, then write our own memory allocation for classes. This is doable, and there are probably some tools out there for doing this, but it would take some rethinking of the algorithms.

However, I don’t think this level of ‘saving’ is needed at this point. Perhaps when the algorithms mature to the point where we can actually write a program that implements a true AGI, it would be important. We are a long way from that point.
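
The fixed-block idea described above can be sketched roughly as follows. This is a POSIX-only illustration under stated assumptions: `MappedArena`, its header layout and its bump allocator are hypothetical, nothing like this exists in nupic.core. Allocations are returned as offsets rather than pointers, because the mapping address may differ between runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdexcept>
#include <string>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // ftruncate, close

// Hypothetical sketch: a fixed-size block mapped to a file,
// with a trivial bump allocator on top of it.
class MappedArena {
  // Bookkeeping stored at the start of the file so it survives restarts.
  struct Header { std::uint64_t used; };

public:
  MappedArena(const std::string &path, std::size_t capacity)
      : capacity_(capacity) {
    assert(capacity > sizeof(Header));
    fd_ = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
    if (fd_ < 0) throw std::runtime_error("open failed");
    // A new file is zero-filled; an existing file of the same size keeps its data.
    if (::ftruncate(fd_, static_cast<off_t>(capacity)) != 0) {
      ::close(fd_);
      throw std::runtime_error("ftruncate failed");
    }
    void *p = ::mmap(nullptr, capacity, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
    if (p == MAP_FAILED) {
      ::close(fd_);
      throw std::runtime_error("mmap failed");
    }
    base_ = static_cast<unsigned char *>(p);
    if (header()->used == 0) header()->used = sizeof(Header);  // first run
  }

  ~MappedArena() {
    ::munmap(base_, capacity_);
    ::close(fd_);
  }

  MappedArena(const MappedArena &) = delete;
  MappedArena &operator=(const MappedArena &) = delete;

  // Bump allocation: reserve `bytes` and return the offset of the new block.
  std::size_t allocate(std::size_t bytes) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t used = static_cast<std::size_t>(header()->used);
    const std::size_t start = (used + align - 1) / align * align;
    if (start > capacity_ || bytes > capacity_ - start) throw std::bad_alloc();
    header()->used = start + bytes;
    return start;
  }

  // Translate an offset into an address that is valid for this mapping only.
  void *at(std::size_t offset) { return base_ + offset; }

private:
  Header *header() { return reinterpret_cast<Header *>(base_); }

  std::size_t capacity_;
  int fd_ = -1;
  unsigned char *base_ = nullptr;
};
```

Objects would then be constructed with placement new at `arena.at(offset)`. There is no free list here, so memory is never reclaimed, and only types without internal raw pointers can live in the block.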

breznak · January 19, 2018, 3:34am

Added a beautiful poll, go for it.

chhenning · January 19, 2018, 3:35am

@David_Keeney What is an AGI?

breznak · January 19, 2018, 3:36am

Artificial Generic Intel… human like and above.

David_Keeney · January 19, 2018, 2:53pm

Artificial General Intelligence (https://en.wikipedia.org/wiki/Artificial_general_intelligence)

chhenning · February 9, 2018, 12:05am

Came across Google’s FlatBuffers. I think it’s worth looking into.

http://google.github.io/flatbuffers/

breznak · January 31, 2019, 1:51pm

So we ended up implementing a simpler, C++-only, cross-platform serialization using bitstreams (and ditching capnp). It is quite easy to implement serialization for a new class with this framework. See https://github.com/htm-community/nupic.cpp/blob/master/src/nupic/types/Serializable.hpp for the Serializable interface that is required.
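
As a rough illustration of what stream-based serialization for a new class can look like, here is a minimal sketch. It is not the actual nupic.cpp Serializable interface (see the linked header for that); `SerializableSketch`, `PermanenceStore`, `roundTrip` and the whitespace-separated text format are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical interface: a class is serializable if it can write itself
// to a stream and restore itself from one.
class SerializableSketch {
public:
  virtual ~SerializableSketch() = default;
  virtual void save(std::ostream &out) const = 0;
  virtual void load(std::istream &in) = 0;
};

// Hypothetical example class holding a list of permanence values.
class PermanenceStore : public SerializableSketch {
public:
  std::vector<float> permanences;

  void save(std::ostream &out) const override {
    // max_digits10 gives enough decimal digits for a float to round-trip.
    out.precision(std::numeric_limits<float>::max_digits10);
    out << permanences.size();
    for (float p : permanences) out << ' ' << p;
    out << '\n';
  }

  void load(std::istream &in) override {
    std::size_t count = 0;
    if (!(in >> count)) throw std::runtime_error("missing element count");
    std::vector<float> values(count);
    for (float &v : values) {
      if (!(in >> v)) throw std::runtime_error("truncated data");
    }
    permanences.swap(values);  // only replace state after a full read
  }
};

// Usage sketch: copy an object by saving to and loading from a memory stream.
PermanenceStore roundTrip(const PermanenceStore &source) {
  std::stringstream buffer;
  source.save(buffer);
  PermanenceStore copy;
  copy.load(buffer);
  assert(copy.permanences.size() == source.permanences.size());
  return copy;
}
```

Writing text through the standard stream operators avoids endianness and padding issues, at the cost of larger files and slower parsing than a binary format.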
