I’ve been coding through htm.core’s examples and feel like I’m getting the hang of the encoder-SP-TM flow, but I’ve got some questions regarding future development.
- What is the htm.core fork currently “missing” that NuPIC has implemented?
I looked through Numenta’s docs; while the networkAPI and modelFactory seem like nice QoL features, it seems the same task can be achieved by putting the parts together manually in core.
NuPIC has a wider range of specialized encoders, of course - geo-coordinate, adaptive scalar, category_string, logarithmic (float on log scale?), among others. But many of these seem like fairly close extensions of the base ScalarEncoder, which has already been ported to core - though that makes it sound much easier than it surely was to create them.
For example, I wonder if you could ‘mimic’ the geo-coordinate encoder with a series of smaller scalars with fixed minimums/maximums, concatenated into one SDR, perhaps allotting more bits for the values of lat/long that represent larger real-world distances.
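To make the idea concrete, here is a rough pure-Python sketch of what I mean. It is not htm.core's ScalarEncoder and not a port of NuPIC's geo-coordinate encoder; `encode_scalar`, `encode_lat_long`, `overlap`, and all the sizes are hypothetical names and values I made up for illustration.

```python
def encode_scalar(value, minimum, maximum, size, active_bits):
    """Toy scalar encoder: one contiguous run of `active_bits` ones in `size` bits."""
    if not minimum <= value <= maximum:
        raise ValueError("value outside [minimum, maximum]")
    # Number of distinct start positions for the run of active bits.
    n_buckets = size - active_bits + 1
    fraction = (value - minimum) / (maximum - minimum)
    start = int(round(fraction * (n_buckets - 1)))
    bits = [0] * size
    for i in range(start, start + active_bits):
        bits[i] = 1
    return bits


def encode_lat_long(lat, lon, lat_size=200, lon_size=400, active_bits=21):
    """Concatenate two fixed-range scalar encodings into one bit list.

    Giving longitude more bits than latitude is an arbitrary choice here;
    it only shows where a per-axis resolution trade-off would go.
    """
    lat_bits = encode_scalar(lat, -90.0, 90.0, lat_size, active_bits)
    lon_bits = encode_scalar(lon, -180.0, 180.0, lon_size, active_bits)
    return lat_bits + lon_bits


def overlap(a, b):
    """Count the bits that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


san_francisco = encode_lat_long(37.77, -122.42)
oakland = encode_lat_long(37.80, -122.27)
tokyo = encode_lat_long(35.68, 139.69)
print(overlap(san_francisco, oakland), overlap(san_francisco, tokyo))
```

One caveat I can already see: two places at a similar latitude still share latitude bits even when they are far apart in longitude, and nothing here handles longitude wrapping around at ±180.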
Core’s MNIST example on GitHub is a fine example of image encoding as well; it’s only black & white so far, but I don’t see why we couldn’t just triple the encoded SDR length for the 3 RGB channels - laying the channels out end to end in one flat SDR instead of stacking them as separate planes like you would for a convolutional neural net. Or you could have a same-size array of tuples (R,G,B) instead of an array of scalars (B&W), but I feel like that might not work out so well.
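A minimal sketch of the "three channels end to end" version, assuming each channel is binarized with a simple fixed threshold (that threshold and the `encode_rgb` helper are my own assumptions, not something taken from core's example):

```python
def encode_rgb(image, threshold=128):
    """Flatten an RGB image into one bit list of length 3 * height * width.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    All red bits come first, then all green bits, then all blue bits.
    """
    bits = []
    for channel in range(3):
        for row in image:
            for pixel in row:
                bits.append(1 if pixel[channel] >= threshold else 0)
    return bits


# A 2x2 image: red, green, blue, and white pixels.
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(encode_rgb(tiny))
```

The output is three times the length of the single-channel encoding, so whatever consumes it (the SP, in this flow) would need its input size tripled to match.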
I don’t quite yet understand how category-strings are encoded - docs say it’s a scalar encoder with radius 1 - but Cortical.io had a really cool “retina” semantic fingerprint for a huge vocabulary. I can’t find the page on their website (might have taken it down?) but it allowed encoding of words while capturing semantic meaning and ‘distance’ for bitwise comparison.
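My current reading of "scalar encoder with radius 1" is that each category becomes an integer index, and the indices are spaced so that two different categories never share an active bit. Here is a toy sketch of that reading. The `make_category_encoder` helper is hypothetical, and reserving the first block for unknown values is my assumption about how such an encoder would typically behave, not something I've checked against the NuPIC source.

```python
def make_category_encoder(categories, active_bits=3):
    """Return an encode(name) function that gives each category its own block of bits."""
    # Slot 0 is reserved for anything not in `categories` (assumption).
    index = {name: i + 1 for i, name in enumerate(categories)}
    size = (len(categories) + 1) * active_bits

    def encode(name):
        slot = index.get(name, 0)
        bits = [0] * size
        for i in range(slot * active_bits, (slot + 1) * active_bits):
            bits[i] = 1
        return bits

    return encode


encode = make_category_encoder(["cat", "dog", "fish"])
shared = sum(a & b for a, b in zip(encode("cat"), encode("dog")))
print(shared)  # distinct categories share no active bits
```

That zero overlap is exactly what a semantic fingerprint avoids: there, related words are supposed to share bits, so a bitwise overlap count doubles as a similarity measure.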
The toughest thing I foresee encoding is graphs, since they’re fairly arbitrary in terms of size, connectivity, and what data is stored where. I also remember seeing an audio encoder somewhere for base NuPIC.
- I know NuPIC is still in maintenance mode for the foreseeable future, but support for Python 2 ended ~9 months ago. I’m quite biased towards wanting to develop anything new in Python 3; is the current state of affairs “we’re applying existing NuPIC and theorizing, writing, and sharing research for future developments”, or some other paradigm? Do you figure the ‘future’ of NuPIC is along these Python 3 fork rails?