Still very far from the distributed representation that is coding in the cortex. You are essentially describing "grandmother cells."
The "subset of cells in columns representing a letter, to another subset of cells in the columns of that letter" bit? You have much finer parsing between regions of processing. I call it micro-parsing, but that is not a real, generally recognized term. (It should be!) At the region/map level you have a population code that represents features at that level of representation. The idea that there is letter and word within a given map/region does not match up to anything that I am used to seeing in the biology.
The problem in how this corresponds to your proposals is that the multi-region model with the required interconnection paths would be so large as to be intractable with current technology, so you are making toy models that do not do a very good job explaining the processing in the brain.
And you are not doing anything with the zoo of different types of cells working as an ensemble. The dynamics of different inhibitory cells are pretty hard for most spiking models to represent accurately. I don't recall seeing a spiking model that does a good job with chandelier cells, which are key to HTM temporal processing.
Likewise, the interplay between the cortex and thalamus with bursting is not something that I have ever seen with spiking simulators.
The example that springs to mind is using transistors to explain a digital computer: making a full-up digital computer out of discrete transistors may be out of reach for the casual experimenter, so you rig a handful up to make an analog computer. You may find that it does have interesting behaviors, such as calculating certain results, but it offers no real understanding of how a spreadsheet calculates an answer.
prompt_col_density=0.5, # fraction of each letter's columns to spike
prompt_cel_density=0.5, # fraction of each column's cells to spike
If you're spiking a fixed number of columns and cells, that's probably going to squash any sequence code. I think it should probably only spike the cell synapsed to.
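Roughly what I mean, as a toy sketch. The function names, the connectivity structure, and the sizes (10 columns of 100 cells per letter) are my own assumptions for illustration, not the actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

N_COLS, N_CELLS = 10, 100  # columns per letter, cells per column (assumed)

def prompt_by_density(col_density=0.5, cell_density=0.5):
    """Density prompting: spike a fixed random fraction of columns/cells.
    Ignores sequence context, so any path-specific code is washed out."""
    cols = rng.choice(N_COLS, int(N_COLS * col_density), replace=False)
    active = np.zeros((N_COLS, N_CELLS), dtype=bool)
    for c in cols:
        cells = rng.choice(N_CELLS, int(N_CELLS * cell_density), replace=False)
        active[c, cells] = True
    return active

def prompt_by_synapse(prev_active, synapses):
    """Context prompting: spike only cells receiving a synapse from the
    previously active cells, preserving the sequence (path) code.
    synapses: boolean (N_COLS*N_CELLS, N_COLS*N_CELLS) connectivity matrix."""
    drive = prev_active.ravel().astype(int) @ synapses.astype(int)
    return (drive > 0).reshape(N_COLS, N_CELLS)
```

The first version always lights up the same-sized blob for a letter regardless of context; the second only lights cells that the prior activity actually synapses to.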
I think you might be misinterpreting that. The columns represent the letter. They're an SDR, not a grandmother cell. If the cells represent a path through letters, that is exactly the same as HTM, to my understanding. Only less specific (trained) to a single cell in what I'm proposing.
Zoos of different types of cells working as an ensemble, I grant you. I'm thinking entirely in terms of sequence representation elements, whatever they could be. The equivalent of the CLA in HTM. So perhaps one layer at this point. In particular I'm interested to hear about different inhibitory cells. The ones I'm thinking of at this stage inhibit fairly globally. I think you said there's a kind which has that property. What other kinds are there?
But the mechanism I'm thinking about might be on the level of the sequence representation mechanism in that paper you cited earlier. Here:
I think they conjectured it as a mechanism for where… part of the parietal cortex?
Perhaps I'm misinterpreting your code. To me it looks like you spike half the columns and cells for the whole letter if any cell gets enough input. And I'm thinking only the cell which has the inbound synapses should spike.
Am I understanding that wrong?
I can't distinguish if you spike only the subset of cells specific to the sequence so far with this. I'm suspecting not.
Gotta shut down now. Late. Will look more tomorrow.
It is a deep look at multiple regions of the brain working together.
Since it attempts to do a realistic description of the basic task of tracking a ball moving through the visual field, it is about as simple a model as you could construct that still engages the key mechanisms of the cortex and thalamus.
No letters or words, just a big fat ball.
Even with that, this is some tough sledding and beyond the comfort zone for many on this forum. Lots of TLDR action here.
If you do have the background and do the work, there is a wealth of understanding of how the brain does things. The answer to your "part of the parietal cortex??" is here, if you want to do the work to understand your answer. I really don't want to come off as condescending, and I could see someone taking it that way, but a reasonably complete and correct answer to your question is complicated.
I really don't want to offer a simple answer that is a lie by omission. If that is an OK answer for you, then: all of the brain is involved.
If I may add one other possible cause of confusion: the training set (Brown corpus) is used only to form synapses - it does not run the network (no steps taken).
Prompting uses an implicit lookup/mapping to find the column group (aka letter) and chooses what to do next at runtime (as in 1 above).
# global scaling factor, to accommodate a unit synaptic efficacy value of 1.0
# roughly, this specifies how many presynaptic spikes are enough to trigger
# a postsynaptic spike, when each synapse has a unit efficacy value of 1.0
SYNAP_FACTOR=300,
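One way to read that comment, as a sketch under my own assumptions rather than the actual simulator code: with unit efficacy, SYNAP_FACTOR is effectively the presynaptic spike count needed to push a cell from rest to threshold.

```python
# Hypothetical reading of SYNAP_FACTOR (assuming unit synaptic efficacy 1.0,
# and the VOLT_REST=0.0 / SPIKE_THRES=1.0 values hardcoded later in the thread).
SYNAP_FACTOR = 300
VOLT_REST, SPIKE_THRES = 0.0, 1.0

def integrate(n_presyn_spikes, efficacy=1.0):
    """Voltage after summing presynaptic input, scaled down by SYNAP_FACTOR."""
    return VOLT_REST + n_presyn_spikes * efficacy / SYNAP_FACTOR

def fires(n_presyn_spikes):
    """True once enough presynaptic spikes have accumulated to cross threshold."""
    return integrate(n_presyn_spikes) >= SPIKE_THRES
```

Under this reading, 299 simultaneous unit-efficacy spikes fall just short and 300 cross threshold, which is what "how many presynaptic spikes are enough" suggests.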
I'm totally on board with "all of the brain working together". I just don't think the way the whole brain works together to combine old elements to construct new meaning has been addressed as effectively as other aspects. That's why I liked the paper you referenced, because it did address that problem, and particularly the paper referenced within that, specifically looking at combinations of elements to construct new meaning.
So I think this combinations of elements to construct new meaning is just a less addressed area, and basically the missing piece.
It need not invalidate other perspectives. I think embodiment, down to the detail, will be important to substantiate qualia, for instance. That's a big thing.
A lot depends on the detail you want to model. To place my ideas in the context of current AI, using the common "mastery of flight" analogy: I would compare deep learning to balloons, actually working, but in a completely different way to birds; transformers maybe to airships (someone described our current time as the "exciting Zeppelin phase of AI" recently?); and a truly biological model to the whole bird. And myself in that, I hope, in the same relation as the Wright Brothers: abstracting as much of the bird as is needed, but without feeling the need to be constrained by feathers and blood vessels, useful perspectives as they too may be.
In that, I actually don't think my level of abstraction differs too much from Jeff Hawkins' original conception of HTM.
And in detail I'm working at a level very similar to what I remember of the CLA (Cortical Learning Algorithm), and the representation of sequences as paths between cells in columns in particular.
"2" seems fine. To combine "2" and "1", you might just drive only the prompt cells appropriate to the prompt context.
But for driving prompts you're right, that does raise other issues.
Firstly it raises the issue of what maximum path length we want to encode.
The maximum path length coded might come down to the number of cells and columns we have. Those will impose a maximum coding depth for sequences. There will be a maximum length of sequence it will be possible to encode by selecting subsets of cells. Especially if the subset is 50% of column cells, as at present(?)
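To put rough numbers on that (cell counts assumed from the 10-columns-of-100-cells defaults mentioned elsewhere in the thread): the raw number of distinct 50% subsets per column is astronomical, so the real limit on coding depth is probably interference between subsets, not running out of codes.

```python
import math

N_CELLS = 100   # cells per column (assumed default)
K = 50          # active cells per subset (50%, as at present)

# Distinct K-of-N subsets available in one column:
n_codes = math.comb(N_CELLS, K)   # ~1e29, so raw code count is not the limit

# Expected overlap between two independent random K-of-N subsets is K*K/N.
# At 50% density that is half of each code, i.e. heavy interference:
expected_overlap = K * K / N_CELLS   # 25 cells out of 50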
And then we come to the problem of how to drive partially matching sequences. It's easy if the sequence is an exact match: you just drive the next cell that sequence synapses with. But if the sequence is only a partial match, at the extreme, when you're at the first letter of an isolated word prompt, what cells do we choose to drive…
Ideally we might drive with a strength related to the fit between a full path and the partial path.
So, ideally, an isolated sequence like "quick" might be driven on the cells of each letter, with a strength proportional to the length of match of the preceding sequence. So maybe "q" gets a very weak drive on all its cells (because no match with any preceding sequence), "u" a slightly stronger drive on the cells of all sequences ending in "q", "i" a slightly stronger drive again on cells of all recorded (Brown Corpus) sequences ending in "q-u", etc.
The same for feedforward synapse activations.
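The proportionality idea might be sketched like this (prefix_drive, base, and max_depth are made-up names and numbers, just to make "strength grows with matched context" concrete):

```python
def prefix_drive(prompt, max_depth=5, base=0.2):
    """For each position in the prompt, drive strength grows with the
    length of the preceding matched context (capped at max_depth).
    Purely illustrative: real strengths would come from actual path matches."""
    strengths = []
    for i, letter in enumerate(prompt):
        context_len = min(i, max_depth)  # letters of matched preceding sequence
        strengths.append((letter, base * (1 + context_len)))
    return strengths
```

So for "quick", "q" gets the weakest drive (no context) and "k" the strongest (four letters of matched context).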
It also raises the issue of what to do when you have a long sequence with just one character different.
It also raises the issue of how to identify the cells for all recorded sequences ending in a shorter sub-sequence.
I guess it's done as a property of an SDR representing the path. Very specific paths, paths reaching back to the maximum distance capable of being encoded with an SDR of a given size, might have larger numbers of cells active (to give the depth of encoding necessary), while shorter paths might have smaller SDRs.
Perhaps the way to do it is to start with very small SDRs when coding sequence, and gradually expand them. Then when we do the feedforward, the longer paths will be driven on a larger SDR, and so give a stronger, but more specific, signal.
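A toy version of that expansion schedule (path_sdr_sizes, the growth rate, and the cap are all invented for illustration):

```python
import random

def path_sdr_sizes(seq_len, start=2, grow=2, cap=50):
    """SDR size per sequence step: small for short paths, expanding with
    depth, up to a cap (here 50% of an assumed 100-cell column)."""
    return [min(start + grow * t, cap) for t in range(seq_len)]

def sample_path_sdr(seq, n_cells=100, seed=0):
    """Pick an expanding random cell subset for each letter of the sequence."""
    rng = random.Random(seed)
    sizes = path_sdr_sizes(len(seq))
    return [(letter, sorted(rng.sample(range(n_cells), k)))
            for letter, k in zip(seq, sizes)]
```

With this schedule a deep path activates many cells (strong, specific feedforward signal) while a just-started path activates only a few.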
That's just an initial idea about how we might address this. I'll put it out there for comment, and do some more thinking about it myself.
Yeah. So the way I've sketched a more nuanced "lookup" above is as an expanding path SDR, so that longer paths have more cells active, giving both a stronger and a more specific signal.
Re. the corpus being used "only to form synapses" and not to "run the network": I don't follow the objection there. Maybe you're saying the same thing about how to match a driving/feedforward signal from a shorter sequence. My solution, sketched above, is to start with sequences represented as smaller SDRs (instead of 50% of column cells, as at the moment?), and then expand them as the recorded sequence grows.
What's the difference here? It looks like you stopped learning sequences?
I'm not sufficiently familiar with what SYNAP_FACTOR and fire_dots_alpha are doing at the moment to comment. My impression is that this is just to vary the strength of synapses?
Nice to get some oscillation though. Playing around with it to see what "tuning" gives us the kind of signal we want is a good way to get an idea what we need.
It's a raster plot: a plot of cell firing, with time on the x-axis and cells (grouped into columns, roughly) on the y-axis.
In particular this is a raster plot for the cell firings of a network that @complyue has coded up, which captures all the sequences of letters in a corpus of text (the Brown Corpus in this case).
He then drives the network by spiking cells in a prompt sequence, and sees how the activation spreads across synapses representing observed sequences of letters in the Brown Corpus.
To start interpreting it, look first at the extreme left. The very first cells to fire will be exactly those of the driving "prompt".
If you're talking about the last chart he posted in this thread, you can see on the extreme top left some spikes for "x". Then, lower down, a bit later, spikes for "d", "e", "f", "c", "a", "f", and "e", in sequence. That's what he's chosen to use to drive, or "prompt", the network, as a test. The other cell firings are the way those initial prompt spikes spread, and then repeated, or oscillated, as activation circulated around the sequences of letters observed in the Brown Corpus.
We want to tune the network so that sequences of letters which tend to share beginning and end points in the observed sequences of the Brown Corpus will synchronize, and give us nice vertical lines we can use to identify words, and later phrases.
The questions are about how many of the letter-mapped columns/cells to fire, e.g. "q" has 10 columns, each with 100 cells by default. There is no analysis of what those cells link to, if anything. This also currently does not make any synapses, so you are assuming an existing network built by the separate corpus mapping functions (run before the network started).
The reason you know a "q" group exists is because the whole alphabet (a-z lowercase) is hard-coded into the data structures in advance.
This could be different but that is what exists now.
If you look at the code you can see that the difference is that the new one does not run words together when making links (but the other one did). Both make sequences, but logically these are now shorter (word length not sentence/corpus length).
Thought some more about this. Is the path code currently done by:
Actually I didn't recall it was done as one-cell connections only. I thought there was a place where you coded it as 50% of column cells connected to cells in the next letter.
Looking at it now, I don't think one cell is enough. I think the path encoding needs to be an SDR.
So when we encode the sequences from the corpus, let's try synapsing a random subset of cells from each letter to the next (or synapses from the cells synapsed in the last sequence step, to a random set in the next…). It needs to be a set. I was wrong to say just code the path as a connection from one cell to another. (That works for HTM because it codes a pattern on a dendrite of the cell?? We need a pattern too, but it can be among the cells.)
Anyway, if we code the sequence path using a subset of cells, or SDR across the cells, the limited size of the subset will mean the code for long sequences will eventually wash out. Because automatically the random choice at each step will start to overwrite remnant code from longer sequences. So the SDR will code (= limit the activation of?) a sequence of some length, but not infinite length.
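One way to see that washout numerically (the function and its parameters are invented for illustration; the real mechanism would be synapse-driven, not explicit replacement): keep a fixed-size path SDR and, at each sequence step, swap out a random fraction of its cells for fresh ones, then measure how much of the step-0 code survives.

```python
import random

def surviving_fraction(steps, replace_frac=0.2, k=50, n=100, seed=1):
    """Fixed-size path SDR: each step swaps a random fraction of the active
    cells for fresh ones, mimicking how remnant code from earlier in a long
    sequence gets overwritten by the random choices at each later step."""
    rng = random.Random(seed)
    sdr = set(rng.sample(range(n), k))
    original = set(sdr)                       # the step-0 code
    for _ in range(steps):
        drop = set(rng.sample(sorted(sdr), int(k * replace_frac)))
        pool = [c for c in range(n) if c not in sdr]
        fresh = set(rng.sample(pool, len(drop)))
        sdr = (sdr - drop) | fresh
    return len(sdr & original) / k
```

The surviving fraction decays toward chance overlap, so the SDR codes sequences of some length but not infinite length, as argued above.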
Then, how to represent shorter strings as subsets of that SDR…
As an algorithm, I would say when driving prompt states:
"Burst" a prompt letter with no context. So, spike "all" its cells.
Then, for the next letter of the prompt, spike only the cells of the next prompt letter which synapse from the first. Those will automatically be a superset of what the spiking cells would have been for longer paths (because the spiking cells of longer sequences would have been selected from that superset by the paths.)
Spike only those cells from the third prompt letter which synapse from the second (also automatically filtered as those which synapsed from the first.)
I think that automatically implements coding partial matches of shorter sequences with longer sequences in terms of their path cells (though actually the reverse of what I initially suggested, because it codes shorter paths as larger SDRs)
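Those steps might be sketched as follows (drive_prompt, letter_cells, and synapse_targets are hypothetical stand-ins for the real network structures):

```python
def drive_prompt(prompt, letter_cells, synapse_targets):
    """Burst the first prompt letter, then for each later letter spike only
    the cells that receive a synapse from the currently active cells.
    letter_cells: dict letter -> set of cell ids
    synapse_targets: dict cell id -> set of downstream cell ids."""
    active = set(letter_cells[prompt[0]])      # burst: all of the letter's cells
    trace = [active]
    for letter in prompt[1:]:
        reachable = set()
        for c in active:
            reachable |= synapse_targets.get(c, set())
        active = letter_cells[letter] & reachable  # keep only synapsed-to cells
        trace.append(active)
    return trace
```

Each step's active set is automatically a superset of what any longer matching path would have selected, since longer paths pick their cells from within these synapse-filtered sets.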
I thought it was driving 50% of each randomly. Note by "driving" I'm talking about prompts, not the way that propagates further into the network.
I can't figure out what problem you are pointing to here. I'm assuming a network built by corpus mapping functions run before the network started, yes. But when that's done there are synapses. And those can limit what fires…
I guess I am missing the problem you are pointing to.
With this are you objecting to an arbitrary assignment of SDR code to each letter? Those could be built from lower-level data; previously we discussed building them from some kind of neuron response to sound. But since the process of interest is that of building the next level of structure above at each stage, it doesn't matter much where you start, and taking it from a hard-coded letter representation is just a place to start.
If that's what you mean.
Oh, thanks for clarifying that for me. That's not such a big deal then.
@robf I realize that I missed one important point in the synapse/connection making algorithm, that the selected cells to connect should be in a single thread of spike-train (am I using this term right?). I think I was wrong in my previous implementation about that point, thus this change to fix it:
With this realization, I now get how the "path" info is indeed encoded. And based on the new understanding, I drafted a new "blur"-based prompting method, such that at any time step when there is a letter to prompt:
the voltage of all cells except those belonging to the prompt-letter gets scaled down by a prompt_blur=0.8
when any cell belonging to the prompt-letter is going to fire, leave all prompted cells as they are, i.e. as driven by the network's dynamics
when none of the cells belonging to the prompt-letter is going to fire, force all of them to fire
Code location:
For "blur" to be safely performed by simply scaling, VOLT_REST is now hardcoded to 0.0 and SPIKE_THRES hardcoded to 1.0.
I've also updated the NBs and get sorta game-changing results; please pull the latest commits and run the new NBs to see and tinker.
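A minimal sketch of one prompting step as described (blur_prompt_step is illustrative, not the actual NB code; it assumes the hardcoded VOLT_REST/SPIKE_THRES above):

```python
import numpy as np

VOLT_REST, SPIKE_THRES = 0.0, 1.0  # hardcoded so scaling voltages is safe
PROMPT_BLUR = 0.8

def blur_prompt_step(volts, prompt_cells):
    """One time step of "blur" prompting.
    volts: 1-D voltage array; prompt_cells: boolean mask of the letter's cells.
    Damps all off-prompt voltages, then forces the whole prompt letter to
    fire only if none of its cells would fire on their own."""
    volts = volts.copy()
    volts[~prompt_cells] *= PROMPT_BLUR        # scale down everything off-prompt
    fired = volts >= SPIKE_THRES
    if not (fired & prompt_cells).any():       # no prompt cell about to fire
        fired = fired | prompt_cells           # force the whole letter to fire
    return volts, fired
```

Note this only works as a simple multiplication because VOLT_REST is 0.0; with a nonzero rest voltage the scaling would have to be done relative to rest.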
Yes. "Single thread of spike-train" makes sense to me. And similar to the way it is done now in HTM, except the "threads of spike-train" are not trained.
As I was sketching above, I now think that such a single-cell train of activation likely needs to be made an SDR "train". Perhaps randomly selected for the first element in a sequence, but from then on synapsing only from that initially selected SDR group, and to a randomly selected group in the next element.
Using an SDR "train" rather than a single-cell "train" should automatically limit the length of sequence which is coded (though I suppose that would happen with a single-cell train too, which would eventually run out of unique cells to synapse to).
But I can imagine anything which helps move the code to be one of paths rather than isolated states should indeed make a "game-changing" difference.
I'm confused that it still seems to be states other than the driven states which have continuing oscillations, though. I would have thought the main difference of isolating paths would have been to restrict activation to only those states on the driven path, and other paths with similar beginning and end points.
Don't see the significance of doing this. It further emphasizes the prompt state activity?