Suggestion: Predictive Firing for Pooling Concepts


There are neurons in the hippocampus which respond to concepts. For example, some neurons respond to a particular person’s face and their written name [1]. These two sensory inputs probably cannot be pooled by object recognition or temporal pooling, since they mostly occur separately and share no features.

Pooling such an abstract concept is probably quite complex, but at least some of that pooling can utilize predictive firing. A person’s face and written name have some predictions in common, such as the sound of someone saying that person’s name or other typical context for that person. A better example might be the concept of music. A region could recognize music based on the fact that musical tones predict more musical tones, music predicts seeing something which plays music, and so forth.

If cells which represent these attributes predictively fire, some bits in the SDR will remain on as long as the music plays, thereby representing the concept of music. Simply activating the predicted neurons in temporal memory is not sufficient, however. There are some finicky details and possible solutions to consider. TM neurons are sensitive to sequence context, so the neuron which becomes predictive in a given column changes with the preceding sequence. At least some of this sensitivity needs to be removed. For example, layer 5 cells could respond to predictive firing in L2/3 in such a way that removes sequence context. In addition, this method of pooling requires predictions of what could happen a second or two in the future, not just the immediate future. After predictively firing, layer 5 cells could learn to predictively fire in response to L2/3 predictive firing and normal firing which came immediately prior. In the hippocampus, there are cells which meet both requirements. Place cells learn to fire predictively [2] and appear to be insensitive to sequence context in most situations.
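
To make the idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not an HTM implementation: the `(column, cell)` tuple encoding, the function names, and the toy data are all hypothetical. It collapses predicted cells to their columns (dropping sequence context) and keeps the columns that stay predicted at every step.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical encoding: (column index, cell index within that column).
Cell = Tuple[int, int]


def strip_sequence_context(predicted_cells: Iterable[Cell]) -> Set[int]:
    """Collapse predicted cells to their columns, discarding which cell
    (i.e. which sequence context) carried the prediction."""
    return {column for column, _cell in predicted_cells}


def pooled_concept_bits(predictions_over_time: List[Set[Cell]]) -> Set[int]:
    """Return the columns that are predicted at every time step."""
    if not predictions_over_time:
        return set()
    column_sets = [strip_sequence_context(step) for step in predictions_over_time]
    stable = set(column_sets[0])
    for columns in column_sets[1:]:
        stable &= columns
    return stable


# Toy data (assumed): three steps of a melody. Columns 7 and 9 stand for
# "another tone" and "an instrument in view". The cell index changes with
# sequence context, but the column does not.
melody = [
    {(7, 0), (9, 2), (3, 1)},
    {(7, 3), (9, 0), (4, 1)},
    {(7, 1), (9, 1), (5, 0)},
]
print(sorted(pooled_concept_bits(melody)))  # [7, 9]
```

A strict intersection is the simplest choice here. A real pooler would presumably tolerate a prediction dropping out for a step or two.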

Pooling using predictive firing allows some flexibility. Since it represents concepts based on what they predict, it retains the advantages of sparse-distributed representation, with predictions substituting for attributes. As long as it can predict whatever the concept predicts, it can pool something into the concept’s representation, so it can deal with novelty if it knows what to predict.
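
As a rough sketch of that flexibility, the snippet below matches an input to a concept purely by what it predicts. The prediction labels, the `pool_into_concept` name, and the 0.5 threshold are assumptions made up for illustration; real predictions would be SDR bits rather than strings.

```python
from typing import Dict, Optional, Set


def overlap_score(predicted: Set[str], concept_predictions: Set[str]) -> float:
    """Fraction of a concept's typical predictions that the input also makes."""
    if not concept_predictions:
        return 0.0
    return len(predicted & concept_predictions) / len(concept_predictions)


def pool_into_concept(
    predicted: Set[str],
    concepts: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the best-matching concept, or None if no concept matches
    at least `threshold` of its typical predictions."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, concept_predictions in concepts.items():
        score = overlap_score(predicted, concept_predictions)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical concepts, each described only by what it predicts.
concepts = {
    "music": {"more_tones", "instrument_in_view", "rhythm"},
    "cloud": {"blue_sky_nearby", "may_dissipate", "drifts"},
}

# A novel instrument still predicts most of what music predicts.
novel_instrument = {"more_tones", "rhythm", "unfamiliar_timbre"}
print(pool_into_concept(novel_instrument, concepts))  # music
print(pool_into_concept({"drifts"}, concepts))        # None
```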


  1. From Neuron to Brain Fifth Edition
  2. Predictive Information Processing in the Septal-Hippocampal Region and its Involvement in Learning and Memory by M J Denham and S L Denham


I once read that place cells in the hippocampus are actually tied to input combinations which are very subtle and at the same time very complex, both spatial and temporal. From that point of view, maybe the predictive firing which captures concepts is not independent of the sequence, but is instead the most abstract sequence of sequences collapsed into a single spatial thingy, which is an SDR.

Smell for example is very important among those cells that underline a theme for an environment. It is somewhat constant or at least changing slowly and can actually identify a place. Some may say this is similar to how a “concept” is represented. Something that is stable among the sequences of sequences at the higher (highest) level of abstraction. So my point is maybe this information should be strictly coming to these layers from other places instead of having a representational equivalent in layers 2/3/5 etc. In other words, maybe concepts are very abstract information that should stay out of the scope of the layers. The “conceptual” information is still obtained from the temporal and spatial representations on the layers but handled on higher level structures. These structures constantly influence the layers in return through layer 1 but there may not be any explicit conceptual information among the layers.

Hope this makes sense somewhat :)


Let me know if I misunderstood something.

I’m not sure that place cells do temporal pooling. Since the hippocampus likely transfers novel information to neocortical regions, it doesn’t make sense to represent the world as stable high-level sequences. If it were to represent a path with a stable representation and then reach a new portion of that path where it hasn’t gone before, it wouldn’t be able to tell lower regions where that new portion of the path comes in the sequence of locations.

Temporal pooling isn’t sufficient to learn some concepts. It adds abstraction by learning longer sequences, but some abstract things are brief and come in the form of various sequences. You can look at a cloud for a split second and recognize it as a cloud, even though clouds come in completely different shapes and you saw a very brief sequence.

I agree that stable sensory context like smell is very important for identifying concepts. Sometimes there is no stable sensory context. I’m not sure sensory context could allow recognition of things which occur in lots of different contexts in many different forms. It might also take a while to learn abstract things based on context alone.

In terms of biology, the temporal properties of place cells seem to have purposes besides generating place fields. For example, sequence replay requires tracking sequences. Similarly, phase precession (where each cell fires at a phase of the theta cycle to reflect how far the animal has traveled through the place field, causing place cells to fire each cycle in the same order as the place fields were entered) likely requires complex learning because the phase of firing is more dependent on position than time. Delay period sequences are also learned and probably require complex temporal response properties because they are similar after learning a similar delay period.


Keep in mind that I am just speculating based on what I think I know, so some of this may sound very wrong. It feels like we give different meanings to temporal pooling.

You are assuming that the change resulting in that new path is not lost on lower levels / treated as noise / not important on higher levels of abstraction. So even if you experience something new, it may not necessarily change the representations on the higher level. For example, if you ate a different pizza today that you had never eaten before, you are still eating something at the higher level. If you have a negative attitude towards sports and you rode a bike for the first time in your life, it may or may not impact the higher region representing that attitude when you are in that mindset. Sequences have levels of abstraction, as you also know.

A split second is enough for temporal processing. I think temporal pooling happens in a lot of different places, just as spatial pooling happens on different scales. If it takes time for a cell to fire and to stop, you end up with a natural temporal pooling on the micro scale. The scale that we talk about may happen between layers, between regions, maybe even inside a single layer.
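
A tiny sketch of what I mean by micro-scale pooling, assuming the simplest possible rule: each bit keeps firing for a fixed number of extra steps after its last input. The `decaying_union` name and the `persistence` parameter are made up for illustration and are not taken from any HTM code.

```python
from typing import Dict, Iterable, List, Set


def decaying_union(inputs: Iterable[Set[int]], persistence: int = 2) -> List[Set[int]]:
    """For each step, return the bits still active when every bit keeps
    firing for `persistence` extra steps after its last input."""
    remaining: Dict[int, int] = {}  # bit -> extra steps it will stay active
    pooled: List[Set[int]] = []
    for active in inputs:
        # Age the bits carried over from earlier steps and drop expired ones.
        remaining = {bit: steps - 1 for bit, steps in remaining.items() if steps > 0}
        # Bits driven by the current input restart their countdown.
        for bit in active:
            remaining[bit] = persistence
        pooled.append(set(remaining))
    return pooled


# One new bit per step: the output overlaps across neighbouring steps,
# which is a crude form of pooling over time.
print(decaying_union([{1}, {2}, {3}, {4}], persistence=2))
# [{1}, {1, 2}, {1, 2, 3}, {2, 3, 4}]
```

With `persistence=0` the output is just the input, so all of the pooling in this sketch comes from how slowly the cells switch off.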

For your example, looking at a cloud should have a relatively stable representation in our mind. If that wasn’t the case, how would we communicate with this phrase in a stable fashion? Even if we were looking at a static image of clouds, a similar stable activation as if we were observing a sequence would have to occur, or else I would not realize (for lack of a better word) that I was looking at clouds.

To me, if there weren’t a stable context, we wouldn’t have concepts. After all, concepts may be much more complex versions of sensory contexts. I get your point; all I am saying is that you can obtain stability if you look deep/complex enough and if you collapse time into space (temporal pooling).

I was talking about higher-level regions generally. The example of smell was just about the hippocampus. On the other hand, @Bitking had some interest in the hippocampus specifically. Maybe he could inject some knowledge here.

I’ll end with a question that has bugged me for a long time.

Is it possible that we have a stable or slowly changing background “noise” (faint activations) in our brain? If so, would that be interpreted as the highest level of abstraction?


Thank you so much for all your feedback. It’s very helpful. My speculation is no more valid than yours, so if I disagree about something, I’m sorry if it sounds like I think your idea is stupid. I’m just trying to explain my speculation and possibly show what I’m not understanding. Sorry about this long post. The bottom third isn’t too important, so you might want to skip it.

Maybe our definitions of temporal pooling differ in the length of sequences which it needs to pool. I guess I’m uncomfortable with the idea that sequences of sequences totaling a second or less can be abstract enough. Maybe it’s because I don’t understand the new sensorimotor work very well. I’m also uncomfortable with applying the current ideas to abstract things because abstract things are often very disconnected from sensory input and some abstract things can manifest as various sequences. To be clear, I don’t think predictive firing can replace temporal pooling.

Equating duration with abstraction is a big error in my thinking. Thanks.
I’m still not sure how temporal pooling could recognize different clouds as the same type of thing. They come in various forms, so they come in various sequences. Those sequences can have some similarities because of context, but that context isn’t reliable. The sky isn’t always blue, for example.

[quote=“sunguralikaan, post:4, topic:1964”]
For your example, looking at a cloud should have a relatively stable representation in our mind. If that wasn’t the case how would we communicate with this phrase in a stable fashion.
[/quote]

I agree that it needs a stable representation. The issue is making sure that some bits of the stable SDR are similar for all types of clouds (while the other bits can represent specifics, like the cloud’s shape). Predictive firing can create a stable representation if its details are right. For example, if you look at a cloud, you can predict seeing the blue sky if you move your eyes and/or attention. It’s not that you will definitely see the blue sky, it’s just that you could see it in the near future. By predictively firing, there is a stable representation since it’s always a possibility as long as you look at the cloud. Similarly, you can predict that the cloud might dissipate. There are various predictions which continue as long as you see the cloud.

It’s just that I don’t trust context to be reliable. It can be unstable, nonexistent, inconsistent, subtle, out of the scope of attention, or the same for multiple concepts.
Predictions and context are pretty similar. Predictive firing just adds more context, including context which isn’t always present, because the possibility of seeing it is good enough to make a prediction.

Predictive firing can also do some other functions which are part of the theory, although probably not as well. For example, it can create stability between actions by firing both before and after the action. It can create allocentric representation by firing before you see the object from a different relative location. It’s also already part of the (possibly tentative) theory for motor control. As I understand it, layer 5 predictively fires to generate motor output. Layer 5 has an apical dendrite in L2/3, so it would be easy to remove sequence context.

I’m not sure. When I’ve looked for persistent firing, I couldn’t find any mention of cells firing for more than maybe a minute, probably more like 20 seconds. Maybe that counts, but I’m not sure if that resulted from stable input, such as staying in the same place field. It’s also possible that cells at the highest level of abstraction choose subsets of that abstraction or otherwise do something which prevents persistent firing. For example, grid cells in the entorhinal cortex fire at place fields arranged in a triangular grid. They do this everywhere and have the same relative spacing between each cell’s grid everywhere. I guess that’s pretty abstract, and it’s definitely not very specific, but they don’t fire for huge durations because of their discontinuous fields. High abstraction might not mean persistent firing, either. It might be more about degree of separation from the input. A cell could fire for a very long time in response to lines if you look at the same thing for a while, whereas a cell which responds to dogs could fire very briefly if you just glance at a dog.

I’m not exactly sure what you mean. I think we might think about the hippocampus differently or I didn’t explain my thinking very well. The rest of what I write here might not be too important to read, since all I’m arguing is that place cells don’t do temporal pooling.

Here’s the model I have in my head: each level of the hierarchy does temporal pooling to learn sequences of sequences. When a sequence is novel, the next level of the hierarchy can’t pool it, so the novel sequence gets passed up the hierarchy until it reaches a region which recognizes it. The hippocampus is at the top of the hierarchy, where it stores completely novel sequences (or completely novel continuations of known sequences) even if it sees it just once, if it’s an important sequence to remember. During sleep and waking states of inactivity, it replays sequences to transfer them to the neocortex, so the neocortex can learn sequences after the brain sees that sequence just once. It’s important to do this rapid learning for paths when you want to take the exact same path again, for example.
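
Here is that model as a toy sketch, in case it helps show what I mean. Everything in it is an assumption for illustration: sequences are plain tuples, each cortical level is just a set of sequences it already knows, and replay simply hands stored sequences to the top cortical level.

```python
from typing import List, Set, Tuple

Sequence = Tuple[str, ...]


def route_sequence(
    sequence: Sequence,
    cortical_levels: List[Set[Sequence]],
    hippocampus: List[Sequence],
) -> int:
    """Pass a sequence up the hierarchy until some level recognizes it.
    Returns the index of that level, or len(cortical_levels) if the
    sequence was novel everywhere and ended up in the hippocampus."""
    for index, known in enumerate(cortical_levels):
        if sequence in known:
            return index
    if sequence not in hippocampus:
        hippocampus.append(sequence)  # stored after a single exposure
    return len(cortical_levels)


def replay_to_cortex(
    cortical_levels: List[Set[Sequence]],
    hippocampus: List[Sequence],
) -> None:
    """Offline replay: teach stored sequences to the top cortical level
    (which level receives them is an assumption of this sketch)."""
    if not cortical_levels:
        return
    cortical_levels[-1].update(hippocampus)
    hippocampus.clear()


levels: List[Set[Sequence]] = [{("a", "b")}, {("a", "b", "c")}]
store: List[Sequence] = []

print(route_sequence(("a", "b", "c"), levels, store))  # 1
print(route_sequence(("x", "y"), levels, store))       # 2
replay_to_cortex(levels, store)
print(route_sequence(("x", "y"), levels, store))       # 1
```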

The problem I see with this is that novelty exists on a lower time scale than the sequence of sequences high in the hierarchy. If the hippocampus were to receive a stable representation of a known sequence and then a representation of a novel sequence, it would store this sequence: a stable representation of the known sequence followed by each SDR of the novel sequence. To teach the neocortex that the known sequence is actually followed by each SDR of the novel sequence, it would have to replay the known portion of the sequence and then play the novel portion. So the brain would have to unfold the known portion of the sequence, since the hippocampus stored it as a single SDR. The hippocampus would also have to wait a little while before it moves on to the novel portion of the sequence, to give the unfolded sequence time to play. It’s much easier to store the whole sequence by learning synapses between each cell of the SDR, so it can replay the whole sequence later without unfolding anything. As far as I know, there aren’t any cells in the hippocampus which are active in a whole path (unless that path is pretty small so it fits in a place field). It seems like the hippocampus learns paths as sequences of SDRs rather than pooled sequences. There are some theories about how it does this, such as CRISP [1] and something a bit like a modified temporal pooler [2]. CRISP seems to fit biology better, but it ignores learning in CA3 recurrent collaterals, so it can’t explain some things which the other source can explain. I don’t have the expertise to say that confidently, though.
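
To show why storing every SDR makes replay easy, here is a minimal sketch, assuming a first-order store where each SDR simply points to the one that followed it. The `SequenceStore` class is hypothetical and much simpler than CRISP or a temporal pooler. In particular, it would overwrite a link if the same SDR appeared in two different contexts.

```python
from typing import Dict, FrozenSet, List

SDR = FrozenSet[int]


class SequenceStore:
    """Stores a path as links from each SDR to the SDR that followed it."""

    def __init__(self) -> None:
        self.next_sdr: Dict[SDR, SDR] = {}

    def learn(self, sequence: List[SDR]) -> None:
        # One exposure is enough: link each SDR to its successor.
        for current, following in zip(sequence, sequence[1:]):
            self.next_sdr[current] = following

    def replay(self, start: SDR, max_steps: int = 100) -> List[SDR]:
        # Follow the links from `start`. Nothing has to be unfolded,
        # because the known portion was never collapsed into one SDR.
        path = [start]
        current = start
        while current in self.next_sdr and len(path) < max_steps:
            current = self.next_sdr[current]
            path.append(current)
        return path


a, b, c, d, e = (frozenset({i, i + 10}) for i in range(5))

store = SequenceStore()
store.learn([a, b, c])  # the known portion of the path
store.learn([c, d, e])  # the novel continuation, seen once
print(store.replay(a) == [a, b, c, d, e])  # True
```

If the known portion had instead been stored as one pooled SDR, `replay` would have nothing to step through until something else unfolded it, which is the extra work I was describing.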