I have 128 features as float32 values. I want to encode them as a 2D input grid of 128 x 2000, where 2000 is the length of each feature's bit array. Is this the best way of doing this, or is there a better way to encode many features as input to the spatial pooler?
This is way too many for one normal-sized SP (of 2048 columns). Usually it’s recommended to limit the number of fields going into one SP to just 5 or so, at least for the usual size of 2048. I suppose you could up that number, but I think 128 is too many regardless.
I’d say either try dimensionality reduction to bring that 128 down, or break it up into multiple HTMs which can run in parallel.
Thank you for your reply, that's what I ended up doing: I used t-SNE to reduce them to 3. Could we have a hierarchy of SPs, as in 5 features per SP, which gives 26 SDRs in total, and then pipe the outputs of those into another SP?
FYI, I am using htm.core as opposed to NuPIC. I think that is the one actively developed for Python 3.
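For the "break it up into multiple HTMs" option, here is a minimal sketch of how the 128 features could be split into groups of about 5 fields each. The helper name `split_into_groups` and the placeholder feature values are made up for illustration; each group would then feed its own encoder and SP.

```python
def split_into_groups(values, group_size=5):
    """Split a flat feature list into consecutive groups of at most group_size."""
    return [values[i:i + group_size] for i in range(0, len(values), group_size)]


# Placeholder for one 128-feature sample (real values would be the float32 features).
features = [float(i) for i in range(128)]

groups = split_into_groups(features)

# 25 full groups of 5 plus one group holding the remaining 3 features.
print(len(groups), len(groups[-1]))
```

With 128 features and 5 per group this gives 26 groups, the last one holding only 3 features.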
If all the info from those 26 SDRs is then encoded into one big SP, I’m not sure that mitigates the problem of too much information packed into 1 SP. Each SP column has to represent some spatial trait(s) of whatever encoding comes in, so if that encoding is too big compared to the size of the SP, each column gets stretched thin with too many spatial traits to represent.
The only ways around this issue I know of are to either compress the encoding space or split it up into multiple HTM regions, as you’re doing with the 26. I’d create some logic on the outputs of all 26 at each time step, maybe tracking which ones breach a certain anomaly score/likelihood at each timestep.
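The kind of logic I mean could look roughly like this. It is only a sketch: the threshold of 0.9, the `min_models` vote count, and both function names are assumptions, and the scores would come from whatever anomaly score or likelihood each of the 26 models reports.

```python
def breaching_models(scores, threshold=0.9):
    """Return indices of the models whose anomaly score exceeds the threshold."""
    return [i for i, score in enumerate(scores) if score > threshold]


def is_system_anomaly(scores, threshold=0.9, min_models=3):
    """Flag the timestep if at least min_models models breach the threshold."""
    return len(breaching_models(scores, threshold)) >= min_models


# One timestep of (made-up) anomaly scores from 26 parallel models.
scores = [0.1] * 26
scores[4] = 0.95
scores[17] = 0.99

print(breaching_models(scores))   # which models fired at this timestep
print(is_system_anomaly(scores))  # whether enough of them fired together
```

Requiring several models to breach at once is just one way to combine them; tracking which specific models fire may be more informative, since it points at the feature group responsible.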
Thanks again for your reply. I shall try to follow what you suggested and keep you posted. The other question I had: what is keeping the SP from scaling up to many features? Is this an algorithmic issue, or is it the way HTM is implemented in htm.core or NuPIC? Real-world data can have hundreds of features, and it is a shame that such an elegant framework cannot scale beyond toy problems.
You misunderstand: the 26 SDRs pack into, let’s say, 12 SDRs, which pack into 6, which pack into 3, and then finally into one SDR. So it is a hierarchy of SDRs, each passing through its own layer until we get the final outcome, with each stage reducing sparsity until we reach 2%.
I think this 26 --> 12 --> 6 --> 3 --> 1 means making a pooling of a pooling of a pooling of a pooling. This would seem to be cramming a much higher dimensional feature space into a smaller one, still giving those SP columns in the final ‘1’ SP a LOT to cover.
I don’t mean to discourage you here! I just wonder if that last SP will be able to hold on to enough of the relevant features present in the original 26. I’d be very curious to see the performance.
Me too. However, I feel that if HTM is to go beyond toy projects, both SP and TP need to scale to thousands of features, and that needs to be incorporated in the framework itself. I understand that this might require GPUs, or a cluster of servers with GPUs, as with TensorFlow or PyTorch (which are not cheap either when it comes to solving real-world problems). One way or the other, this framework needs to scale. We are trying to replicate high-order thinking (as in the neocortex), so it is not going to be cheap; I totally get it.
Let me put this another way: one high-order thought in the human brain needs to translate to 1 > 4 > 8 > 16 > 32 > 64 and so on (ultimately translating to low-order commands). If we cannot make the trip going one way, we certainly cannot make the trip the other way.
I’m definitely interested to see where this might go. I personally have a use case where I have a time series with each sample consisting of a 253x253 directional interaction matrix, where each point (a, b) is a decimal in the range [-10.0, 10.0] with a resolution of 0.1 that encodes the relative intensity of an action a takes on b. As a result, (a, b) does not have the same semantics as (b, a), which precludes reducing along the diagonal of the matrix. The matrix needs to be evaluated as the state of an environment where, for example, interactions between (a, b) could be affected by interactions between (a, c) and (d, e). This amounts to 64,009 values I need to encode in each input and receive as output for TM prediction of the next interaction matrix.
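To put rough numbers on that, here is a small back-of-the-envelope sketch. The matrix size, range, and resolution are the ones described above; `bits_per_value = 100` is purely a hypothetical encoder width chosen for illustration, not a recommendation.

```python
n_agents = 253
cells = n_agents * n_agents  # values per sample: 64009

low, high, resolution = -10.0, 10.0, 0.1
levels = round((high - low) / resolution) + 1  # distinct values per cell: 201

# Hypothetical: give every cell its own small scalar encoding.
bits_per_value = 100
total_input_bits = cells * bits_per_value  # 6400900

print(cells, levels, total_input_bits)
```

Even with a fairly narrow per-value encoding, the concatenated input would be in the millions of bits, far beyond the 16,384-bit image examples.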
I see examples of image encoders that return SDRs of 16,384 data points and am wondering where the upper limit is on the input SDR size (and output for prediction). Is it a computational issue or algorithmic limitation?
It does work: you can take the output of one SP and feed it into another SP. In my case the sparsity level in SP 1 is 4% and in the second SP it is around 2%. Logically, the output of one SP can be thought of as a randomly encoded scalar encoding. I do not know about 16K features. In my case I have 128 dimensions, which I pipe through t-SNE to reduce to 3, then encode as a 3 x 2000 bit matrix (with 2% sparsity) as input to the first SP. So far I am getting good results. I tried both the scalar encoder and the random scalar encoder, and the second one seems to work better. My suggestion would be to reduce the dimensions from 16K to something more manageable.
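To make the 3 x 2000 layout concrete, here is a plain-Python sketch of a simple contiguous-block scalar encoding. It is not the htm.core encoder API, just an illustration of the shape and sparsity; the function names and the [-1.0, 1.0] value range are assumptions, and the t-SNE step is left out (the three reduced values are passed in directly).

```python
def encode_scalar(value, minimum, maximum, size=2000, sparsity=0.02):
    """Encode one value as a list of bits with a contiguous block of active bits."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    active = round(size * sparsity)              # 40 active bits at 2% of 2000
    clipped = min(max(value, minimum), maximum)  # clamp out-of-range values
    fraction = (clipped - minimum) / (maximum - minimum)
    start = round(fraction * (size - active))    # first active bit position
    bits = [0] * size
    for i in range(start, start + active):
        bits[i] = 1
    return bits


def encode_sample(values, minimum, maximum, size=2000, sparsity=0.02):
    """Encode each reduced feature as one row, giving a len(values) x size grid."""
    return [encode_scalar(v, minimum, maximum, size, sparsity) for v in values]


# Three reduced dimensions (made-up values) -> a 3 x 2000 bit grid.
grid = encode_sample([-1.0, 0.0, 1.0], minimum=-1.0, maximum=1.0)
print(len(grid), len(grid[0]), sum(grid[0]))
```

Nearby values share active bits in this scheme, which is the property the SP relies on; a random scalar encoder spreads the active bits instead of keeping them contiguous.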
Let us know if you folks find anything interesting. I think that to handle realistic amounts of input fields, we will need a revolution in computer chips, but it is coming.
Do we necessarily? What about a revolution in applying multi-region & multi-column HTMs? This seems to be what’s missing in order to scale to this many spatial features w/out splitting into separate models.
I think this is a big part of what is needed as well. We need a good algorithm for an object layer which can be used to enable long-distance connections and voting across multiple cortical columns (each of which can specialize in a more focused area of the overall semantic space). This is also the area I am currently working on, but it is proving to be a lot more complicated than it sounds at first.
Is there a concept of an epoch in HTM (as in deep learning)? Would it make sense to pipe the same set of data through the SP twice to promote better learning?
I am getting some very good results. The final input to the SP is 3 dimensions reduced via t-SNE from 128, plus a 4th dimension holding the magnitude of the 128-dimensional vector, so the final input to the SP is 4 x 2000. Another technique for reducing dimensions is UMAP, but it comes with many tunable hyperparameters. If anyone has experience with UMAP, please let me know.
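The magnitude feature can be sketched as below, assuming "magnitude" means the Euclidean (L2) norm of the original vector. The helper names are made up, and the reduced coordinates are passed in as given rather than computed here.

```python
import math


def magnitude(vector):
    """Euclidean (L2) norm of the original feature vector."""
    return math.sqrt(sum(x * x for x in vector))


def with_magnitude(reduced, original):
    """Append the original vector's magnitude to the reduced coordinates."""
    return list(reduced) + [magnitude(original)]


# Tiny stand-in: a 2-value "original" vector instead of 128 values.
sample = with_magnitude([0.1, 0.2, 0.3], [3.0, 4.0])
print(sample)
```

Each of the 4 resulting values would then get its own 2000-bit row in the encoder input.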
I am running the SP through 3 epochs and then collecting the final results. In an unsupervised way, I get the overlap score, i.e. the similarity.
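For reference, the overlap score between two SDRs is just the number of bits active in both. A minimal sketch, assuming each SDR is given as a collection of active bit indices (the example index ranges are made up):

```python
def overlap(sdr_a, sdr_b):
    """Count the bit positions that are active in both SDRs."""
    return len(set(sdr_a) & set(sdr_b))


# Active-index lists for three made-up SDRs with 40 active bits each.
a = list(range(0, 40))
b = list(range(30, 70))     # shares bits 30..39 with a
c = list(range(100, 140))   # shares nothing with a

print(overlap(a, b), overlap(a, c))
```

A higher overlap means the SP considers the two inputs more similar; dividing by the number of active bits gives a score between 0 and 1.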