Some details murky on temporal implementation


#1

I’m trying to implement the temporal pooler in vb.net, and I have some questions:

  1. Is the permanence of a synapse to a segment initialized to a small range of values centered around an initial value, or are all permanences initialized to the same fixed value?

  2. The pseudo code does not give values for:
    ACTIVATION_THRESHOLD
    INITIAL_PERMANENCE
    CONNECTED_PERMANENCE
    LEARNING_THRESHOLD
    PREDICTED_DECREMENT
    SYNAPSE_SAMPLE_SIZE
    Can you suggest some good values?

  3. When learning, is permanence-increment/decrement for lateral (distal) connections the same as for proximal connections? (proximal connections are used in the spatial pooling). What values should I use for PERMANENCE_INCREMENT and PERMANENCE_DECREMENT when learning both types of synapse?

  4. In the temporal pooler, I assume learning goes on in both proximal and distal synapses. Is there any reason to separate these processes, for instance, should I teach a batch of patterns spatially (proximal) before I turn on temporal (distal) learning?

  5. Does ‘potential synapse’ mean "having a permanence greater than zero"? If so, is a dendrite connected by zero-permanence synapses to all cells in most columns?

Thank you.


#2

A range near the threshold, so that around 50% of them are connected, I believe.

See: http://nupic.docs.numenta.org/1.0.2/quick-start/example-model-params.html
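
For reference, here is a rough sketch of how the pseudocode constants appear to map onto the parameter names of NuPIC's `TemporalMemory` class. Both the mapping and the values are assumptions: they are what I recall as that class's constructor defaults, not something stated in this thread, so check them against the model params page before relying on them.

```python
# Assumed mapping: pseudocode constant -> (NuPIC parameter name, recalled default).
# None of these values come from this thread; verify before use.
TM_PARAMS = {
    "ACTIVATION_THRESHOLD": ("activationThreshold", 13),
    "INITIAL_PERMANENCE": ("initialPermanence", 0.21),
    "CONNECTED_PERMANENCE": ("connectedPermanence", 0.50),
    "LEARNING_THRESHOLD": ("minThreshold", 10),
    "PREDICTED_DECREMENT": ("predictedSegmentDecrement", 0.0),
    "SYNAPSE_SAMPLE_SIZE": ("maxNewSynapseCount", 20),
    "PERMANENCE_INCREMENT": ("permanenceIncrement", 0.10),
    "PERMANENCE_DECREMENT": ("permanenceDecrement", 0.10),
}


def pseudocode_value(name):
    """Return the assumed starting value for a pseudocode constant."""
    return TM_PARAMS[name][1]
```

These are only starting points; the right values depend on the number of columns and the sparsity of the input.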

No, I think there are different params for each, see the doc linked above on model params.

I’m going to assume you mean temporal memory because we have no TP implementation documented anywhere yet.

The spatial pooler learns spatial patterns on the proximal input. The TM learns distal patterns.

Both processes are actually occurring simultaneously, but it is easier to explain them separately.

No, a potential synapse is one where a cell has the potential to connect. The permanence could be 0.0 and it is still a potential synapse. If there is no potential synapse, that means there is no permanence at all, not that the permanence is 0.


#3

A couple of additions if I may,

For the spatial pooler (proximal synapses) it is 50% (potentialPct in Nupic) as @rhyolight said, but for temporal memory (distal synapses) there is a specific initial value (initialPermanence in Nupic) for all created synapses.

They are different and there is a reason. The columns are in competition in the spatial pooler, so you want columns (proximal dendrites) to participate in more input patterns rather than specializing on one. So generally, increment > decrement for proximal connections. In temporal memory, there is no competition among distal segments and a single segment is expected to specialize on a single input pattern. So generally, decrement >= increment for distal connections.
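
The two initialization schemes discussed so far can be sketched roughly like this. It is a minimal illustration, not NuPIC's code: the constants, the `SPREAD` half-width, and the function names are all made up for the example. A synapse is stored as a dictionary entry, so a potential synapse with permanence 0.0 is still present, while a missing key means there is no potential synapse at all.

```python
import random

CONNECTED_PERMANENCE = 0.5  # assumed connection threshold
INITIAL_PERMANENCE = 0.21   # assumed fixed value for new distal synapses
SPREAD = 0.1                # assumed half-width of the proximal init range


def init_proximal(potential_inputs, rng):
    """Draw permanences uniformly around the threshold (about half connected)."""
    low = CONNECTED_PERMANENCE - SPREAD
    high = CONNECTED_PERMANENCE + SPREAD
    return {i: rng.uniform(low, high) for i in potential_inputs}


def init_distal(presynaptic_cells):
    """Give every new distal synapse the same fixed initial permanence."""
    return {cell: INITIAL_PERMANENCE for cell in presynaptic_cells}


def connected(synapses):
    """Return the presynaptic ids whose permanence reaches the threshold."""
    return {k for k, p in synapses.items() if p >= CONNECTED_PERMANENCE}
```

With these assumed values, newly grown distal synapses start below the threshold and only become connected after repeated reinforcement.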


#4

Thanks Matt and Sunguralikaan for your replies. Here are a few more issues I’m encountering:

  1. Can I assume that every cell starts off with, say, 20 dendrite segments?
  2. It seems that a huge amount of memory is required to implement temporal memory. This is why:
    If you have 2048 columns * 32 cells / column * 20 segments per cell * 20 synapses per segment, then that means you have over 26 million distal synapses, all of which you have to keep in memory. In my implementation, a synapse stores 3 pieces of information - a pointer to the presynaptic cell, a pointer to the segment it belongs to, and a permanence value.
    I seem to remember reading somewhere that numenta found a more efficient way of doing things. If so, let me know what it was.
  3. The BAMI chapter on temporal memory should incorporate some of the answers to my questions on this thread.
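
The arithmetic above can be checked with a short script. The byte sizes are assumptions (8-byte pointers, 4-byte floats), and the compact layout is just one possible alternative for comparison, not a description of what Numenta does.

```python
def distal_synapse_count(columns, cells_per_column, segments_per_cell,
                         synapses_per_segment):
    """Total synapses if every cell has a full set of segments."""
    return columns * cells_per_column * segments_per_cell * synapses_per_segment


count = distal_synapse_count(2048, 32, 20, 20)

# Layout described above: two pointers plus a permanence per synapse.
# Assumes 8-byte pointers and a 4-byte float.
bytes_pointer_layout = count * (8 + 8 + 4)

# Hypothetical compact layout: a 4-byte presynaptic cell index plus a
# 4-byte float, with segment membership implied by array position.
bytes_compact_layout = count * (4 + 4)

MIB = 1024 * 1024
print(count)                        # 26214400
print(bytes_pointer_layout // MIB)  # 500
print(bytes_compact_layout // MIB)  # 200
```

So the fully populated case is on the order of hundreds of megabytes under these assumptions.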

#5

I don’t think the TM starts off with any distal segments. They grow as learning occurs. How many segments per cell seems to depend on the complexity of the input data.
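
A minimal sketch of what growing segments on demand could look like, so that memory is only spent on segments that learning actually creates. The class and method names are made up for illustration, and the default permanence is an assumed value.

```python
from collections import defaultdict


class LazySegments:
    """Stores distal segments only for cells that have grown one."""

    def __init__(self):
        # cell id -> list of segments; each segment maps a presynaptic
        # cell id to its permanence
        self.segments = defaultdict(list)

    def grow_segment(self, cell, presynaptic_cells, initial_permanence=0.21):
        """Create a new segment on `cell` with synapses to the given cells."""
        segment = {c: initial_permanence for c in presynaptic_cells}
        self.segments[cell].append(segment)
        return segment

    def synapse_count(self):
        """Count the synapses that currently exist."""
        return sum(len(s) for segs in self.segments.values() for s in segs)
```

A freshly created instance holds no synapses at all, and the count grows only as segments are added during learning.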

That seems right. I’m not certain exactly how we keep this info in memory, but there’s been some conversation about the best way to group these associations recently:


#7

Here are a few more questions:

  1. To reduce the amount of memory used by my vb.net temporal memory, maybe I can restrict my code to running simple sequences.

    For example: A simple logical operation is XOR.
    The truth table for that is:

    0,0 --> 0
    0,1 --> 1
    1,0 --> 1
    1,1 --> 0
    

    You could think of these lines as sequences, with the answer being the third number in the sequence. It seems to me that you don’t need many columns to represent just a zero and a one. Perhaps 5 columns would be enough.
    Plus, if you had 4 cells per column, you could represent 4^5 different sequences (I’m not sure on the math here).
    So that might be enough for these small problems.
    If I wanted to extend to somewhat longer sequences (for instance, instead of 3 numbers, I might use 8, and the 8th could be 1 only if the count of 1s in the first 7 bits of the input sequence is even), I would use more cells (but not more columns).
    Is that correct?

  2. I would think I need a marker for “start of a sequence”. Perhaps I would start every new sequence with the number ‘2’. Otherwise, if I present two sequences in a row, the second sequence starts in a different state than if it had been presented by itself.
    Is that correct?

  3. If I were to try sequence generation like this, then I am doing the following: I’m presenting inputs to the proximal dendrites (in this case, a sequence of 2 presentations), and then I’m letting the memory make one more transition, which is its ‘prediction’. Then I would have to read out that prediction. And this last transition does not have any presentation of a proximal input. Would it still work?

  4. The above, if it worked, would not be a logical problem solver, it would just have memorized some patterns. If I could teach it axioms (such as ‘if A then B’) I might be able to make it into a problem solver, as follows.
    I could teach it the truth table for ‘if A then B’ (which is equivalent to NOT-A OR B) and some other axioms such as ‘if (C and B) then D’ (which is also a truth table), but I’m not sure that the system would chain these rules together, since the system is context-dependent. The system depends on what happened earlier. Also, you would have to apply the rules in order: the first rule to get B, and then combine that with C to get D. I don’t think this is a promising avenue, but if anyone has an idea, let me know.


#8

1- Correct, you would want more cells.
2- Correct, this is what Nupic does too.
3- I am kind of lost. Generating sequences? One more step of temporal memory? Maybe you can word it differently.
4- Sounds interesting. As you said it would be a sequence learner, not problem solver but you could see how much you can push HTM in terms of what you want through trying it. However you are expecting the HTM to be a parser basically.


#9

Thanks Sunguralikaan for your answers. In question #3, I meant that I can train on a sequence, but if I later want the memory to play back the sequence, I assume that I have to provide a few inputs to get it going.
Suppose it learns a sequence of 5 prime numbers.
I then test it out by supplying 4 primes, and I want it to produce the fifth.
I can turn learning off, and supply one prime after another until I’ve supplied 4 primes, but what happens next? I can’t supply the fifth prime (because I’m testing whether it can be predicted). So should I just let the system make its own internal transition, supplying no inputs at all?


#10

Every HTM iteration consists of a Spatial Pooler phase and a Temporal Memory phase. Let’s assume you supply 4 numbers one after another: 1, 3, 5, 7. If the HTM learned the sequence before, at the end of each iteration (each number) it would predict the next number, represented by the depolarized/predicted cells of the Temporal Memory phase. So when you supply 1, the corresponding columns are activated by the SP, the appropriate cells among these columns are selected by the TM, and based on these resulting cells, some cells become predictive. If the HTM knows the sequence, these predictive cells should contain the representation of 3 that comes after 1. The columns belonging to the predictive cells would represent the input 3, assuming the HTM knows only this sequence. At the end of the iteration for the 4th number (which is 7), if the HTM knows a fifth number (let’s say 11), the cells that became predictive in the Temporal Memory phase would represent 11 -which comes after 7 which came after 5 which came after 3 which came after 1-. So the predictive cells represent a specific 11 that strictly comes after the said numbers in that exact order. The columns belonging to those predictive cells would represent the input 11 without any context, again, if the HTM knows only this sequence.

So, yes you look at the predictive cells and their belonging columns which would represent the next input.
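
Reading out the prediction could look roughly like this. The sketch assumes cells are numbered consecutively column by column (cell index = column * cells per column + offset), and the overlap-based decoder is a hypothetical stand-in for whatever decoding method you use.

```python
def columns_of(cells, cells_per_column):
    """Map cell indices to the sorted list of columns that contain them."""
    return sorted({c // cells_per_column for c in cells})


def best_match(columns, known_representations):
    """Pick the known input whose column set overlaps the prediction most.

    `known_representations` maps an input value to the set of columns the
    spatial pooler activates for it (hypothetical lookup table).
    """
    predicted = set(columns)
    return max(known_representations,
               key=lambda value: len(predicted & known_representations[value]))


# Example: 4 cells per column, predictive cells 5, 6 and 13
predicted_columns = columns_of([5, 6, 13], cells_per_column=4)
print(predicted_columns)  # [1, 3]
print(best_match(predicted_columns, {3: {1, 3}, 5: {0, 2}}))  # 3
```

Note that several predictive cells in the same column collapse to a single predicted column, which is where the sequence context is lost.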


#11

Suppose I want to predict not only one number, but two in sequence.
For instance, in the example “1,3,5,7”, I might supply just 1 and 3, and then I would want the temporal memory to predict the other two.
I think the algorithm would be this.
a) After 1 and 3 have been presented, I look at the predicted cells for the next iteration. Every column they belong to is now considered active, and no other columns are considered active. So I could use this set of columns to figure out what number is being represented. (I came up with a method that works fairly well that I posted about earlier that uses both knowledge of which columns are active, and where their proximal synapses come from, to figure out what number they represent).
b) Then I just find the next set of predicted cells, for the next iteration, and that should give me the columns that when decoded produce ‘7’.

If that is wrong, let me know.
Thanks


#12

What you describe is widely used to “reconstruct” the rest of the sequence from a partial sequence among the community and it works for some people. However, it has its own pitfalls when you consider that HTM knows multiple sequences and you cannot really know what it knows and what it doesn’t at any time. Assume the HTM knows 2 sequences; 1-3-5-7 and 1-3-4-6. When you supply 1 and 3, the TM predictive cells at the end would represent a union of 4 and 5 representations. What do you do now? Do you activate all the columns for both 4 and 5 which would break the fixed active column sparsity? Do you pick one group of predictive cells which you are sure that represents 5? If so, how? Do you even know all the sequences that the HTM knows to isolate what you want?

If you have answers to these questions or the answers to these questions do not interfere with what you want, then you should be good.
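
The ambiguity is easy to reproduce with a toy model. This is not HTM: it is a plain lookup table keyed by the full prefix, used here only as a stand-in for high-order context, and the column sets are made-up 40-column representations.

```python
from collections import defaultdict


def learn(sequences):
    """Map each prefix (the context) to the set of elements that followed it."""
    transitions = defaultdict(set)
    for seq in sequences:
        for i in range(1, len(seq)):
            transitions[tuple(seq[:i])].add(seq[i])
    return transitions


def predict(transitions, prefix):
    """Return every element that has been seen directly after this prefix."""
    return transitions.get(tuple(prefix), set())


memory = learn([[1, 3, 5, 7], [1, 3, 4, 6]])
print(sorted(predict(memory, [1, 3])))  # [4, 5]
print(sorted(predict(memory, [1, 3, 5])))  # [7]

# Hypothetical column sets: 40 active columns per input
COLUMNS = {4: set(range(0, 40)), 5: set(range(40, 80))}
union = set().union(*(COLUMNS[v] for v in predict(memory, [1, 3])))
print(len(union))  # 80, twice the usual number of active columns
```

Feeding that union back in as the next input would activate twice the normal number of columns, which is the sparsity problem described above.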

To provide a different perspective, Nupic handles multiple step predictions a bit differently. Rather than feeding the predictions as the actual input on the following step, Nupic learns to predict n steps ahead. If you introduce a delay to the distal input, the current active cells would make connections to not the previous active cells but the active cells before that. The HTM can learn to predict 7 instead of 5 when the input is 3. This is the recommended way of doing multiple step future predictions if you want that.
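
The idea of learning to predict n steps ahead directly, rather than feeding predictions back in, can be illustrated with the same kind of toy lookup table. Again, this is only a sketch of the concept under simplifying assumptions, not how Nupic implements it; `learn_n_ahead` is a made-up name.

```python
from collections import defaultdict


def learn_n_ahead(sequences, n):
    """Map each prefix to the elements seen n steps after its last item."""
    table = defaultdict(set)
    for seq in sequences:
        # Stop early enough that an element exists n steps ahead.
        for i in range(1, len(seq) - n + 1):
            table[tuple(seq[:i])].add(seq[i + n - 1])
    return table


one_step = learn_n_ahead([[1, 3, 5, 7]], n=1)
two_step = learn_n_ahead([[1, 3, 5, 7]], n=2)

print(one_step[(1, 3)])  # {5}
print(two_step[(1, 3)])  # {7}
```

With n=2, the context ending in 3 maps straight to 7, so no intermediate prediction has to be treated as real input.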

Happy to help.