“Prediction” from first principles


#21

Sorry about that. I didn’t mean to imply anything about your approach, and sorry if I was rude.

This might not be the right place for this discussion so I’ll reply to your points in a spoiler.

Rest of this response

Regions have specific roles but the same circuitry (or at least most of it) is used for everything the neocortex does. The connections between layers are consistent. As far as I know, there’s not a single region with an extra layer completely different from the others. There are definitely specializations, but if at least the same components of the cortical circuitry can be used for every function the cortex serves, then either the cortical circuitry is intelligent or it’s a short step away from intelligence.

Predictions and hierarchy could easily be central to intelligence, but that’s not enough to claim more than an educated guess in my opinion. How does it make novel predictions? Hierarchy might help with that, but it can’t solve everything. How does the brain understand sentences or produce thoughts (in word form, image form, or whatever else, I’m not arguing language is important or unimportant for intelligence)? I don’t see how prediction could do that.

The thinking about hierarchy has changed since then. Check out this podcast: https://www.buzzsprout.com/188368/753219-episode-1-research-update-with-jeff-hawkins-part-1
I don’t know if hierarchical perception leading to general AI is still the goal, because there are still aspects of hierarchy, just not as much emphasis on the physical cortical hierarchy.

I’m not arguing that action is important. I just don’t see how perception could lead to intelligence on its own.

From my perspective, theories about AI are very opinionated. I have strong, weakly supported opinions too. It’s hard not to, because so many ideas about AI have failed. When I have a new idea I’m excited about, I try to remind myself that there’s a 90% chance it will fail. Maybe I shouldn’t try to force that on others.

No, you wouldn’t know what to copy at the start, but you can figure that out. There are ways to get around distracting details which aren’t involved in intelligence. For example, there are unessential neuron classes. We know that because some neuron classes are unique to regions you can be intelligent without. If you look for a given neuron class across regions but only find it in a small fraction of them, you can be fairly confident you can ignore that cell type. Then you can solidify that with theory: either show that the cell type isn’t required for intelligence, or, in the cases where you got it wrong, show that it actually plays some essential role in the rest of the circuit and just hasn’t been discovered yet in other parts of the cortex.
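The region-coverage heuristic in the paragraph above can be caricatured in a few lines. All cell-type names, region names, and the 50% cutoff below are invented for illustration; real cell-type atlases are far messier:

```python
# Toy sketch of the heuristic: a cell type observed in only a small
# fraction of cortical regions is provisionally ignorable for a theory
# of the common cortical circuit. All data here is hypothetical.

def ignorable_cell_types(presence, min_fraction=0.5):
    """presence maps cell type -> set of regions where it was observed."""
    all_regions = set()
    for regions in presence.values():
        all_regions |= regions
    total = len(all_regions)
    # Flag types seen in fewer than min_fraction of all known regions.
    return {ct for ct, regions in presence.items()
            if len(regions) / total < min_fraction}

presence = {
    "pyramidal":  {"V1", "S1", "A1", "M1"},
    "vonEconomo": {"ACC"},                  # region-specific: candidate to ignore
    "basket":     {"V1", "S1", "A1", "M1"},
}
print(ignorable_cell_types(presence))  # {'vonEconomo'}
```

The cutoff would then be revisited with theory, as described above: a flagged type either really is inessential, or it plays a role not yet observed elsewhere.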

Another way around the distracting details is to not require anything to be included in the theory until there’s a need to do so. That’s an approach Numenta uses, I’ve read.

There are loads of other approaches to get around the messiness, and other people probably know of more. Grid cells have been big in HTM theory recently, and those were discovered by recording neurons. Some things in neuroscience aren’t messy and ambiguous, which can really help get a framework going. Another approach is to figure out the role of a connection, neuron class, layer, or whatnot, without worrying too much about how it carries out that role. For example, say a type of neuron activates a little while after all the others, and doesn’t fire much unless the animal is behaving. It isn’t involved in generating behavior, though, because it starts firing a while after the animal starts behaving and it has restricted receptive fields. Based on that, it might be involved in processing movement: perhaps transforming sensory input, or dealing with the impact of behavior on the sensory input.
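That style of inference can be caricatured as a rule table. The thresholds, property names, and role labels below are purely invented to illustrate the reasoning pattern, not real data:

```python
# Caricature of inferring a neuron type's role from coarse observed
# properties, as in the example above. Everything here is illustrative.

def guess_role(onset_lag_ms, fires_during_behavior, receptive_field):
    """Crude functional guess from three observed properties."""
    if fires_during_behavior and onset_lag_ms > 100 and receptive_field == "restricted":
        # Fires too late to drive behavior, but is behavior-locked and
        # spatially tuned: plausibly processing consequences of movement.
        return "movement-related sensory processing"
    if fires_during_behavior and onset_lag_ms <= 0:
        # Activity precedes movement onset: could be driving it.
        return "possible motor drive"
    return "unknown"

print(guess_role(150, True, "restricted"))  # movement-related sensory processing
```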

Approaches like these can be used together to build up a better and better sense of what’s going on over time. That’s also just the neuroscience side. I don’t know much about how to test things in code in ways which go alongside neuroscience-oriented approaches.

I don’t really see the difference. I agree that it’s not really right to call it science, because there aren’t any measurements, but science also involves analyzing and philosophizing, especially for really hard problems.

Subcortical structures like the thalamus and basal ganglia are still on the table. Regardless, the neocortex is still pretty dang complex. I don’t think it’s super complex in terms of core operations, but things which can be described simply can still be complex without that description. The neocortex is also pretty messy, especially when most of our information comes from tiny isolated slices, or from recordings under anesthetics which massively alter its activity.

Introspection and neuroscience complement each other. Jeff Hawkins has argued that we can’t get to general AI in the near future except by copying the brain’s core principles of intelligence.

I agree we shouldn’t copy most of what the cortex does, although I’m not sure if you’re talking about general operations or more specific things.

Let me try to illustrate why I think we should copy the cortex with an exaggerated story which might not be relevant to your reasoning. I wouldn’t blame you for skipping my rambling.

I find some coffee poured on the road, and I’ve never seen coffee before but decide I want to make some because that coffee was pretty good but, you know, it was on the road so it could be better. So, I go buy some coffee beans because they smell similar but have no idea what to do with them. Do I just leave the coffee there on the road, or do I take it home to help guide coffee making? There’s all kinds of dirt in it so that would distract from the flavor I’m trying to make, so does that mean I throw out the coffee? No, I should keep it around to see if the recipe is heading in the right direction. If I do that, when I add pepper I know the taste is definitely wrong. If I don’t, I’ll just keep going with the pepper because it seems like the right starting point. Maybe I’ll actually start with putting them in water, a good start. But then I can’t check if the color is right based on the roadside coffee, so I end up with beans in water. It tastes like coffee, but not quite right, so I keep adding all kinds of spices, because that’s how you get subtle flavors, right? I never end up making the right coffee because I never realize you can grind coffee beans. Instead, decades in the future when I’m retired from the coffee development business, I’ve made something pretty tasty, with all the right combinations of spices (pepper plays the central role), but it’s definitely not coffee. Still good though.

That’s not what I was trying to say. By “being produced,” I meant any AI not based on the brain. Intelligence is the goal here. My point was that if it’s not based on the brain, it’s not going to develop easily towards intelligence, so it’s going to be something else.


#22

No problem, I am not Miss Manners either :).

This might not be the right place for this discussion so I’ll reply to your points in a spoiler.

These things are hard to argue about, too much intuition involved. I trust mine, but communicating it takes forever. So, people check the wavelength first :). I will try to reply, but it may take time.



#23

I was thinking of cortical area V1:

In my opinion it is the system that starts off only able to detect two angles of motion that becomes “pretty useless for deep analysis”. Having to make adjustments adds another step to the process. It makes more sense to me to start off with the brain’s center-surround receptive field organization.
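Center-surround receptive fields are commonly modeled as a difference of Gaussians (excitatory center minus broader inhibitory surround); a minimal sketch, with illustrative sizes and sigmas rather than biological values:

```python
import numpy as np

# Difference-of-Gaussians sketch of a center-surround receptive field.
# Kernel size and sigmas are illustrative, not biological values.

def dog_kernel(size=9, sigma_c=1.0, sigma_s=2.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_c**2))
    surround = np.exp(-r2 / (2 * sigma_s**2))
    # Normalize each lobe so a uniform input produces zero net response.
    return center / center.sum() - surround / surround.sum()

k = dog_kernel()
print(abs(k.sum()) < 1e-9)   # True: no response to uniform illumination
print(k[4, 4] > 0)           # True: positive response to a centered bright spot
```

The balanced normalization is the point: the unit responds to local contrast, not to overall brightness.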

Are you sure they are similar? Because I haven’t seen any. Do you have a reference?

I’m going by the results of hundreds of great-sounding ideas I tried, including ones that parallel what you proposed, which did not work as well as expected. In my experience, successfully reproducing what is found in the neuroscientific literature comes down to discovering a trick that can be modeled with a small amount of code. For example, two-frame place-avoidance behavior, and getting from place to place without bumping into anything unless still learning to walk/run/fly or startled:

Neuroscience requires answering questions like: where would the brainwaves be represented?

How would you even model this kind of behavior without first modeling waves?


#24

I am familiar with orientation columns, but they only represent that one parameter: orientation.
My comparison derives multiple parameters at once and encapsulates them into patterns. That’s far more complex, general, and informative.

In my opinion it is the system that starts off only able to detect two angles of motion that becomes “pretty useless for deep analysis”. Having to make adjustments adds another step to the process. It makes more sense to me to start off with the brain’s center-surround receptive field organization.

Adding another step only to inputs that actually deserve the cost is more intelligent than adding the same step to every bit of noise that comes along.

Are you sure they are similar? Because I haven’t seen any. Do you have a reference?

I’m going by the results of hundreds of great-sounding ideas I tried, including ones that parallel what you proposed, which did not work as well as expected. In my experience, successfully reproducing what is found in the neuroscientific literature comes down to discovering a trick that can be modeled with a small amount of code. For example, two-frame place-avoidance behavior, and getting from place to place without bumping into anything unless still learning to walk/run/fly or startled:

Neuroscience requires answering questions like: where would the brainwaves be represented?
How would you even model this kind of behavior without first modeling waves?

It’s fine that you didn’t get very deep into my intro, but then you are not in a position to judge what it is similar to. I think I made it pretty clear that I am not doing neuroscience, and that’s not because I don’t have a clue.

I don’t need global synchronization, via waves or otherwise, because my parameters are encapsulated into patterns rather than distributed across the whole system. This encapsulation means they can be processed locally and asynchronously, in parallel with patterns of other levels. The brain can’t do that because there is no RAM within a neuron, so it must use dedicated physical connections for memory. That’s a huge handicap, and we don’t need to replicate it.


#25

That’s true if you are talking about instincts, but they are a small part of human motivation.
It’s not true of conditioning, because there is value drift, driven by cortical learning.
Which means that the cortex can and does swap its bosses all the time.
And it’s definitely untrue of value-free curiosity: that boss is an empty chair.


#26

Why did you bring your non-neuroscientific model to a neuroscience forum?


#27

Because my model offers a conceptually better way to achieve the same purpose.
I actually asked Matt if this is a good place for it, and he approved.


#28

I am not attacking here so please don’t take this the wrong way.

You seem very hung up on curiosity as if it is somehow special. I take this to mean that you place this as some sort of different behavior from - oh say - seeking shelter or a food source.

In my own case I have a very solid “big picture” idea of how it all works as a system, and I am trying to learn how the parts I don’t understand operate to fill in this picture; mine goes from a helpless infant to a functioning adult. I have been working on this since the early 1980s and there are still many loose ends.

Do you have a complete working model/framework that you think can be elaborated into a working AGI, even if it is not documented in your writings? Said differently: does your partial model fit into a bigger picture, or is it just an interesting sub-problem to be solved in any way you can work up?
Does it account for saccades and assembling those snapshots into a mental model?
Does this model include being able to drive a body and generate speech?
Does it account for the sub-cortical coloring of perceptions from experience to form judgments?
Does it account for the known observations of various defects of the human brain and the effects they have on expressed behavior? This is important, as these defects form the fence of what a “broken” AGI would look like.

I see that these things are not random questions but instead - paths to understanding how an AGI will have to function to be compatible with human culture. I have said this before but I will raise it again: As a researcher in the AGI field, I spend considerable time thinking about the various mental defects and wonder if I would consider it a win to create a fully functional profoundly autistic AGI. Or a fully functional psychotic one.


#29

What Jeff said in On Intelligence about hierarchy is not exactly how we think about hierarchy anymore. We think objects are being recognized at all levels, not just being composed at higher levels. This is different from the standard hierarchy model. Lots of discussion about this at Follow-up question to Podcast 1 with Jeff (location, orientation, and attention).


#30

@bkaz sorry, I don’t really want to read all your posts. Do you have an MNIST benchmark or something like that?


#31

I just moved this from #htm-theory:tangential-theories (it really isn’t) into #other-topics to avoid further confusion.

Then don’t read them.


#32

Since my spatial navigation model predicts the loss of the ability to generate cognitive maps, I had to search Google Scholar for more information, and found this:

Hippocampal lesions disrupt navigation based on cognitive maps but not heading vectors.

Pearce JM, Roberts AD, Good M.

Abstract

Animals can find a hidden goal in several ways. They might use a cognitive map that encodes information about the geometric relationship between the goal and two or more landmarks. Alternatively, they might use a heading vector that specifies the direction and distance of the goal from a single landmark. Rats with damage to the hippocampus have difficulty in finding a hidden goal. Here we determine which of the above strategies is affected by such damage. Rats were required to swim in a water maze to a submerged platform, which was always at the same distance and direction from a landmark. The platform and landmark remained in the same place for the four trials of each session, but they were moved to a new position at the start of a session. Rats with damage to the hippocampus found the platform more efficiently than did normal rats in the first trial of a session but, in contrast to normal rats, their performance did not improve during a session. Our results indicate that hippocampally damaged rats are able to navigate by means of heading vectors but not cognitive maps.

https://www.nature.com/articles/297681a0

There would still be the very useful heading vector that existed before feedback from the spatial mapping network was added. But, as in the paper, there would be no improvement in its navigational success rate.
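The two strategies contrasted in the abstract can be sketched in toy form. The coordinates are invented, and a brute-force grid search stands in for whatever geometric inference a spatial mapping network actually performs:

```python
import math

# Toy contrast of the two navigation strategies from the abstract:
# a heading vector stores one offset from a single landmark, while a
# cognitive map relates the goal to two or more landmarks.
# Coordinates are invented; this is illustration, not a brain model.

def heading_vector_goal(landmark, offset):
    """One landmark plus a stored (dx, dy) offset."""
    lx, ly = landmark
    dx, dy = offset
    return (lx + dx, ly + dy)

def cognitive_map_goal(landmarks, distances):
    """Locate the goal from distances to 2+ landmarks (toy grid solver)."""
    best, best_err = None, float("inf")
    for x in range(0, 101):
        for y in range(0, 101):
            err = sum(abs(math.dist((x, y), lm) - d)
                      for lm, d in zip(landmarks, distances))
            if err < best_err:
                best, best_err = (x, y), err
    return best

print(heading_vector_goal(landmark=(10, 10), offset=(3, 4)))  # (13, 14)
print(cognitive_map_goal([(0, 0), (20, 0)],
                         [math.dist((13, 14), (0, 0)),
                          math.dist((13, 14), (20, 0))]))     # (13, 14)
```

The heading-vector path needs no relational learning at all, which is consistent with it surviving hippocampal damage in the paper.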

I was beginning to worry about patient HM having possibly squashed my model. But it’s still OK!


#33

Sorry, these are not my concerns. I am designing an algorithm to do science, not a social media chatbot.


#34

You seem to be implying your work is more important than others’ work on this forum. That is not how we operate here. Please be nice.


#35

Now that I have some idea what it is you are trying to do, I can address your work more closely.

You have made it clear that you are not trying to make a part of a larger system but instead – to develop a particular method of analysis of a stream of information and through this method – predict trends.

In your writing you devote a section (Comparison to Artificial and Biological Neural Networks) comparing your work to several technologies that you have identified as somehow similar to what you are trying to do.

This cherry picked list has biological and artificial ANN and CNN networks and then some casual and arbitrary dismissals based on your personal criteria. These criteria include personal taste and invocation of comparison of costs in either resources or algorithmic complexity.

I cannot speak to the personal taste portion as that is entirely up to you as to what is acceptable.

I can speak to the computational costs – it is now commonplace to have access to a GPU on every computing platform – even the $5 Raspberry Pi has one. These platforms are all capable of running common graphics kernels at video speed.

Your choice of strawmen against which to judge your technique’s relative merits is odd – most are dimensional mapping algorithms – they are not meant to do temporal predictions. I am not surprised that they do not compare very well.

It is odd that you don’t include RNN and HTM technology, as these ARE meant to do temporal prediction. If you do get around to doing this comparison, I would avoid invoking computational complexity until you finish fleshing out your level 3 & 4 methods, as these are likely to end up with much the same complexity to do anything useful.

As to your level 1 & 2 – you should be looking more at other technologies that are a better match.

The basic operations you are doing are a weird mélange of basic arithmetic and logic that ends up retracing the steps of operations usually used in edge detection in graphics:

Some edge detection kernels: Sobel, Prewitt, Laplacian, Canny, Gaussian blur, …
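For reference, a Sobel pass is just a small convolution; a minimal self-contained sketch (naive loop implementation, no image library assumed):

```python
import numpy as np

# Minimal Sobel edge detection: one of the classic kernels listed above.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def convolve2d(img, k):
    """Naive 'valid' sliding-window filter (cross-correlation), no padding."""
    h, w = img.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

# A vertical step edge: left half dark, right half bright.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
gx = convolve2d(img, SOBEL_X)
print(gx)  # nonzero only in the columns whose windows straddle the edge
```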

Your insight that these can be used on a temporal stream is not particularly novel:

Evaluation of Kernel Methods for Change Detection and Segmentation

https://www.researchgate.net/publication/222102464_Unsupervised_Change_Detection_by_Kernel_Clustering

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.1469&rep=rep1&type=pdf
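The temporal version of the same idea is equally small: a difference kernel applied along the time axis instead of a spatial one. A toy sketch (frame contents and threshold invented for illustration):

```python
import numpy as np

# Sketch of kernel-style change detection on a temporal stream:
# a [-1, +1] difference across two frames flags change, just as a
# spatial difference kernel flags edges. Data here is invented.

def change_mask(prev_frame, next_frame, threshold=0.5):
    """Per-pixel change detection as a temporal difference plus threshold."""
    diff = next_frame.astype(float) - prev_frame.astype(float)
    return np.abs(diff) > threshold

f0 = np.zeros((4, 4))
f1 = np.zeros((4, 4))
f1[1, 2] = 1.0          # one pixel "moved in" between frames
mask = change_mask(f0, f1)
print(mask.sum())  # 1
```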

I wish you luck in your continuing efforts but I don’t see anything of use to me at this time.

If you manage to come up with something in the higher levels, I will be happy to review your work.


#36

ECCV 2018 - Occlusions, Motion and Depth Boundaries:


#37

Bitking

This cherry picked list has biological and artificial ANN and CNN networks and then some casual and arbitrary dismissals based on your personal criteria. These criteria include personal taste and invocation of comparison of costs in either resources or algorithmic complexity.

I think by “taste” you mean my complaints about the lack of theoretical grounding. This is widely acknowledged in ML, and there is nothing personal about it.

It’s the main reason I am designing my algorithm from scratch, even if that means repeating some work that’s already been done.

Because the hard part here is theoretical justification, which no one seems to care about.

I can speak to the computational costs – it is now commonplace to have access to a GPU on every computing platform – even the $5 Raspberry Pi has one. These platforms are all capable of running common graphics kernels at video speed.

As I mentioned in the intro, the costs I am concerned about are on higher levels / layers, because they increase exponentially with elevation, and I want to design a system that can add new levels with experience, indefinitely. On lower levels, my design is significantly more complex and expensive than anything I know of, not mathematically but logically, especially because the basic algorithm is strictly sequential, to keep it tractable.

I consider parallelization a separate problem.

And I tried to explain why it is so complex: I need to derive a lot of fine-grained parameters to predictively prune higher-level search, because that’s where the costs get out of control. It’s an upfront investment.

Your choice of strawmen against which to judge your technique’s relative merits is odd – most are mapping algorithms – they are not meant to do temporal predictions. I am not surprised that they do not compare very well.

It is odd that you don’t include RNN and HTM technology, as these ARE meant to do temporal prediction.

I addressed basic operations: weight matrix multiplication, integrate-and-fire, etc.

My main problems are the statistical nature of it all, the lack of one-to-one integer-level comparison, and distributed vs. localized parameters.

These apply to all NNs, regardless of architecture. And none of that is my personal taste; it’s the very logic of strictly incremental search.

I wish you luck in your continuing efforts but I don’t see anything of use to me at this time.

If you manage to come up with something in the higher levels, I will be happy to review your work.

Thanks for taking the time.


#38

The HTM model is binary and is not statistically oriented. Perhaps you should take the time to learn how it works before grouping it with all other neural networks. You may find a lot to like about this predictive model.

Many of us share your concerns; that is why we are working with the HTM model. One of the oddities is that it is very good at predictions and so-so on the things that “ordinary” neural networks tend to do very well.

If you work through either the HTM School videos or the collection of foundation papers, you will see that they are not at all like the classic ANNs that you were comparing in your paper. The theoretical foundations are both mathematically secure and well elucidated.

They have evolved since the original introduction but the essential features remain the same. What has been added is the underpinning of your level 3-4 structure.

You may recognize that the Numenta group is working much the same way as you are: starting from a basic provable premise and extending it slowly. As they have worked with the basic model, they have been open to learning that their first ideas were not correct, and to adjusting the way the model is used. I respect this openness to the possibility that the first intuitions were not correct.


#39

I know it’s binary / spiking, and the inputs are OR-ed within a dendrite, but aren’t they summed in the soma?
Being a “cortical” algorithm, how can it not be statistical, given the amount of noise in the brain?
And parameters are still widely distributed across the network; doesn’t a column only represent one type of parameter?
I have a problem with binary inputs too: logically, it should be integer inputs with binary coordinates, where the first step is digitization within a coordinate…


#40

I would be wasting both our time repeating and explaining what is very well documented in the papers.

I will say that it is NOT a spiking model. The summing part is a collection of local features, and this is much the same thing you do with your local-area manipulation in your model. The breakthrough, that sparse representation is sufficient to reduce the search space, is informed by the brain but is solidly backed by mathematical theory in the “thousands of synapses” paper.
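The sparse-representation point can be illustrated generically: matching two sparse binary vectors is an overlap count of active bits, which is robust to noise and very unlikely to happen by chance. This is a toy overlap demo, not Numenta’s actual math; sizes and sparsity are arbitrary:

```python
import random

# Toy illustration of sparse distributed representations (SDRs):
# matching is an overlap count of active bits, and high overlap
# between two random sparse vectors is very unlikely by chance.
random.seed(0)

N, W = 2048, 40            # vector size, active bits (~2% sparsity)

def random_sdr():
    return set(random.sample(range(N), W))

def overlap(a, b):
    return len(a & b)

a = random_sdr()
noisy_a = set(list(a)[:30]) | set(random.sample(range(N), 10))  # 25% bit noise
b = random_sdr()

print(overlap(a, noisy_a) >= 30)  # True: degraded copy still matches strongly
print(overlap(a, b) < 10)         # True: unrelated SDRs barely overlap
```

The expected chance overlap of two such vectors is W²/N, under one bit here, which is why a sparse match carries so much information.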

As far as the representation issue goes - you yourself have stated that some data types must be reduced to be handled efficiently by your model (reduction to part 0). In the HTM system they follow the same path, with front-end “encoders” that bring everything into a compatible format.

I really do encourage you to take the time to learn the HTM system. At a bare minimum, your insight could provide valuable feedback on what parts you see as lacking, either in function or explanation. You started with the “On Intelligence” book, as did many of us - you could at least see where this work has led, if for no other reason than to see where we have all gone wrong. :)