Thoughts about topology

sebjwallace · December 4, 2016, 10:10am

It makes sense that the local algorithms are only useful for a certain set of problems, but they are important problems. However, I believe it would be a good idea to optimize, or at least look for different ways of implementing things, because only being able to work on a 64x64 image limits being able to experiment with hierarchies on detailed spatial data.

There’s a way to optimize an HTM implementation by using a ‘propagation algorithm’. There is very little high-level operation - it’s mostly all local interactions between Cell and Dendrite/Segment objects. All the cells and dendrites are connected by object instance references. When a cell gets a feedforward propagation from a dendrite (segment activation), the cell then propagates forward to all the segments that connect to it (via the ‘axon’). When the segments reach their threshold, they then forward propagate to their target cells.
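A minimal sketch of the propagation idea described above, assuming a simple count-based segment threshold. The `Cell`, `Segment`, `connect`, and `step` names are hypothetical and are not taken from NuPIC or from the poster's implementation; the point is only that each step visits the active cells and the segments they touch, nothing else.

```python
class Segment:
    """A dendrite segment that fires its target cell once enough inputs arrive."""
    def __init__(self, target, threshold):
        self.target = target        # the Cell this segment feeds
        self.threshold = threshold  # active inputs needed to fire (assumed rule)
        self.count = 0              # inputs received during the current step


class Cell:
    def __init__(self, name):
        self.name = name
        self.axon = []              # segments this cell projects to


def connect(sources, target, threshold):
    """Create one segment on `target` that listens to every cell in `sources`."""
    segment = Segment(target, threshold)
    for cell in sources:
        cell.axon.append(segment)
    return segment


def step(active_cells):
    """Propagate one feedforward step, visiting only the active cells."""
    touched = []
    for cell in active_cells:
        for segment in cell.axon:
            if segment.count == 0:
                touched.append(segment)   # remember it so it can be reset
            segment.count += 1
    next_active = set()
    for segment in touched:
        if segment.count >= segment.threshold:
            next_active.add(segment.target)
        segment.count = 0                 # reset for the next step
    return next_active


a, b, c, d = (Cell(n) for n in "abcd")
connect([a, b], c, threshold=2)   # c needs both a and b
connect([a], d, threshold=2)      # d can never reach 2 from a alone
fired = step([a, b])              # only c should fire
```

The cost of `step` depends on the number of active cells and their outgoing segments, not on the total number of objects held in memory.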

The same general idea is used for local inhibition. When a cell propagates to segments connected to its ‘axon’ it can also propagate negative feedforward values to neighboring cells - which causes sparsity and competition. The structure of columns emerges from the schematic of cell classes/layers and local connectivity.
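As a rough illustration of inhibition by negative propagation, here is a one-dimensional sketch. The `inhibit` function, the `radius` and `strength` parameters, and the rule that a cell survives when its net value stays above zero are all assumptions made for the example, not part of any HTM library.

```python
def inhibit(overlaps, radius, strength):
    """Return indices of cells whose net input stays positive after
    every active cell has subtracted from its neighbours."""
    net = list(overlaps)
    for i, value in enumerate(overlaps):
        if value <= 0:
            continue                          # inactive cells propagate nothing
        lo = max(0, i - radius)
        hi = min(len(overlaps), i + radius + 1)
        for j in range(lo, hi):
            if j != i:
                net[j] -= strength * value    # negative feedforward to a neighbour
    return [i for i, v in enumerate(net) if v > 0]


overlaps = [0, 5, 1, 0, 0, 4, 3]
winners = inhibit(overlaps, radius=1, strength=1.0)   # -> [1, 5]
```

Strongly driven cells suppress weaker neighbours, so the surviving set is sparse without any global sort over the region.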

The benefit of local computation is that the limit on the number of cells and segments you have is based on computer memory capacity, not CPU/GPU processing capacity. No matter how many object instances you have (i.e., gigabytes of them), the propagation algorithm will compute very fast as it only processes the sparse activated cells. Most cells are inactive (and therefore most segments too), so there is very little iteration within each feedforward step. If there were 4098 cells in a region then only ~164 cells (4% sparsity) will need to propagate (be processed) in each step.

I found this works quite well, except the initialization of the region takes some time to construct all the objects. But from there it’s light.

Anyway, Sunday morning blabber. Need to eat breakfast!

SimLeek · December 4, 2016, 3:37pm

True, but even then, with a million columns (enough for input from a typical computer monitor), 4% activation gives 40,000 active columns. A temporal pooler just beginning learning might activate all 8 cells in many of the activated columns, which could strengthen/weaken 4 connections per cell (guessing value via this post), performing operations on 1,280,000 segments independently. If I choose the method of adding up activation input values for each column and have 100 local input locations per column, that’s 100 million floating point additions.
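The operation counts in this post can be reproduced with a few lines of integer arithmetic. The variable names are mine; the values are the ones quoted in the post.

```python
columns = 1_000_000
sparsity_percent = 4          # 4% of columns active per step
cells_per_column = 8
segments_per_cell = 4         # the post's guessed value
inputs_per_column = 100       # local input locations per column

active_columns = columns * sparsity_percent // 100
segment_updates = active_columns * cells_per_column * segments_per_cell
additions_per_step = columns * inputs_per_column

print(active_columns, segment_updates, additions_per_step)
```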

For storage, the temporal memory would require connection strength to be stored for each connection, which means 8 million cells times about 4 segments per cell, giving 32 million strength values, or 128 MBytes of storage if 32 bit floats are used. Meanwhile, the spatial pooler has 1 million columns with, depending on input radius, around 100 connections per column, which requires about 400 MBytes of storage. If all that’s right, it comes up to about 500 MBytes of storage.
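The same storage estimate as a small calculation, assuming 4 bytes per strength value and counting a megabyte as 10^6 bytes. The names are illustrative only.

```python
BYTES_PER_FLOAT32 = 4
MB = 10**6                    # decimal megabytes, as in the estimate above

columns = 1_000_000
cells = columns * 8           # 8 cells per column
tm_values = cells * 4         # about 4 strength values per cell
sp_values = columns * 100     # about 100 connections per column

tm_mb = tm_values * BYTES_PER_FLOAT32 // MB
sp_mb = sp_values * BYTES_PER_FLOAT32 // MB
total_mb = tm_mb + sp_mb

print(tm_mb, sp_mb, total_mb)
```

The total comes to 528 MB, which is where the "about 500 MBytes" ballpark comes from.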

That means my laptop’s GeForce 940MX should be able to store about four of those million column networks on GPU memory. It should also be able to handle 100 million floating point additions per step, which is way under the tens of gigaflops most GPUs are capable of performing. (I think that means it should be able to run the HTM layer at 100 FPS, even using an un-optimized column activation method.)

Whew, that was fun! Now I need to eat breakfast too.

Edit: I think this means I could theoretically run a 4-layer, 1 million column per layer, 8 cell per column, HTM network on my laptop.

rhyolight · December 4, 2016, 4:27pm

That makes sense given the memory footprint of the HTM apps I run.

Bitking · December 5, 2016, 1:43pm

Assuming your figures are correct …

Going further - 10 fps is about what a human does in dealing with the world.

If you can run 100 fps with one area then you should be able to run 10 layers at 10 fps with a 5 GB footprint; well within the capabilities of modern hardware.

Assuming that not all of your areas need to be a full 1K on a side frame buffer you could run more but smaller areas (say 256 x 256) for a very fancy hierarchical system.

BTW: I don’t see any reason you need to do full floating point; 8-bit synapses and integer math should be enough. Depending on the processing hardware there may be a significant speed-up to be had.
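A small sketch of what 8-bit synapses could look like, using only Python's standard `array` module. The `quantize` helper, the 0..255 scaling, and the 0.2 connection threshold are assumptions chosen for illustration, not values from NuPIC.

```python
from array import array


def quantize(permanences):
    """Map float permanences in [0.0, 1.0] to unsigned 8-bit integers (0..255)."""
    return array("B", (min(255, max(0, round(p * 255))) for p in permanences))


floats = array("f", [0.0, 0.21, 0.5, 1.0])     # 32-bit float storage
bytes8 = quantize(floats)                      # 8-bit integer storage

connected_threshold = round(0.2 * 255)         # hypothetical threshold, 0.2 -> 51
connected = [q >= connected_threshold for q in bytes8]   # pure integer compare

print(floats.itemsize, bytes8.itemsize, connected)
```

Each value drops from 4 bytes to 1, and the connected/unconnected test becomes an integer comparison.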

SimLeek · December 6, 2016, 3:46am

Don’t worry, they’re probably off by at least an order of magnitude. It was a ballpark estimate. I can hope though.

However, it’ll be a little while before we see how far off I am. Right now, I’m still writing a visualizer to see what I’m doing with any HTMs I try to make:

https://gfycat.com/FoolishYellowAffenpinscher

Working with VTK was harder than I thought it would be, even in Python. I just got animation working though. Now I need a few functions for setting up planes of points, which shouldn’t be too hard, and then I can work on connecting that with TensorFlow and/or NuPIC… which could be very hard depending on how successfully I can shove the running code from those libraries into a single VTK callback function…

I used floats for simplicity, but yeah, I could imagine 4x or 8x less data usage and 4x or 8x speed up if the right types are used. If that could be applied to everything, then the number of simulated neurons would probably be pretty impressive.

medhat · April 26, 2017, 1:58pm

This is very interesting… would you say topology represents something like receptive fields? It would be very useful in visual applications to have parts of the spatial pooler (I’m new here… still getting used to the HTM/NuPIC lingo) detecting local features like orientation or intensity and propagating that higher up the hierarchy to more abstract concepts like shapes.

Do you think the boosting concepts can be used to implement on-center off-center receptive fields similar to this?

I can’t wrap my head around how this would work computationally. That’s probably the biggest bottleneck at the moment. How can we use the distributed processing technologies that are maturing these days? Or could we offload some of the fixed, expensive processing to an FPGA?

kaikun · February 16, 2018, 10:54am

Hey,

interesting discussion. I am wondering if there are new thoughts on it in light of the new theories and discoveries?

When thinking about a network hierarchy which mainly looks at the input with different scales of receptive fields, and with an allocentric location integrated such that every column builds feature and object representations (together with motor movements…), topology seems very important to integrate well into this framework.

I will have to read up on the newer papers again, focusing on that aspect, but maybe someone can align the theory more clearly.

Kind regards

PS:
Additionally, I saw in the code and when running the visualizations that the SP does not account for columns on the sides/edges having a smaller potential pool due to the end of the dimension.
We can use the wrap option to avoid unfair competition (due to a smaller active pool -> fewer connections) for these columns, but that merely seems like a quick fix and not really a solution for maintaining topology.
Jeff mentioned in a talk on the tangential theories that, e.g., sensors for touch are not uniformly distributed. Could a topology account for that? How does it play together with allocentric location on an object?
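To make the edge effect from the PS concrete, here is a one-dimensional sketch of how a column's potential pool shrinks at the boundary and how wrapping restores it. The `potential_pool` function and its parameters are hypothetical and do not reproduce NuPIC's actual spatial pooler code.

```python
def potential_pool(center, radius, input_size, wrap):
    """Input indices a column centred on `center` may connect to."""
    indices = range(center - radius, center + radius + 1)
    if wrap:
        # Indices past either end wrap around to the other side.
        return sorted({i % input_size for i in indices})
    # Without wrapping, indices outside the input are simply dropped.
    return [i for i in indices if 0 <= i < input_size]


edge_clipped = potential_pool(0, radius=2, input_size=10, wrap=False)   # [0, 1, 2]
edge_wrapped = potential_pool(0, radius=2, input_size=10, wrap=True)    # [0, 1, 2, 8, 9]
middle = potential_pool(5, radius=2, input_size=10, wrap=False)         # [3, 4, 5, 6, 7]
```

Wrapping equalizes pool sizes, but it does so by connecting the edge column to inputs on the opposite side, which is why it reads more as a workaround than as a way of preserving topology.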
