Artificial intelligence solutions improve capital efficiency. Intelligent machines improve capital efficiency by decreasing plant downtime and improving output. Robots are ideal for doing repetitive or hazardous activities. The amount of data generated by humans and machines today greatly outpaces humans’ ability to consume, comprehend, and make complicated decisions based on that data. Artificial intelligence underpins all computer learning and represents the future of all sophisticated decision making.
Before you flag this as the obvious spam that it is, I am allowing this to stay as a trigger for discussion.
Please consider this in light of where HTM might fit among the end applications in this brave new AI landscape.
What does HTM/TBT bring to the table?
Do we do anything other than anomaly detection?
I will point out that in Jeff’s latest book he also touches on the future of the technology Numenta is working on.
In my opinion, HTM deserves the wave of innovation and research that ANNs have been going through; even half of it would be worth a try. Why? Because HTM is simple and highly scalable from a software engineering perspective. It is scalable because it forces one to stop overparameterizing it; it has a more obvious limitation with respect to learning, so innovation can go into how to denominate and scale it. I’m thinking of thousands of trained SP/TBT units that run like normal software in a distributed manner, talking to each other to perform a specific task. Is this practical for DL? I don’t think so.
IMO HTM brings its simple architecture to the table and it deserves more innovation and research.
Sure, but when you have thousands of processes you have to make decisions:
- are they all running the same software and executing the same algorithm
- do they share memory, or does each have its own
- how do they ‘talk to each other’?
These questions are nothing new in the microservices world, especially for systems that are massively scalable in the cloud.
I know, that’s why I asked them. So can you answer them?
I came here because of the temporal aspect of HTM. Frankly, way too much effort (or thinking) seems to go into the anomaly detection aspect, and I do realize that it is the cash cow for Numenta. It is similar to neuroscience papers looking at critical pathways and NCC components that near the end say, “Oh, and our fMRI results suggest that the Cantulla Oblagota influences cognitive functionality in Autistic and Alzheimer’s patients” — that being the reason they got the money for the study.
How complex is one of the so-called “Thousand Brains”? Is it just one cortical column? If so, why don’t you make a thousand of them using the latest neural hardware and let’s see what it can do.
Several of these “sights” look a lot like anomaly detection.
Providing a simple SDR regressor (the same as the SDR classifier, but decoding scalar values) could have been much more attractive: HTM could be shown capable not only of detecting anomalies but also of making predictions on the same values.
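To make the proposal concrete, here is a minimal sketch of what such an SDR regressor could look like. This is my own assumed design, not Numenta's SDR classifier code: each input bit keeps a running mean of the scalar values seen while that bit was active, and decoding averages those estimates over the currently active bits. The class name and interface are hypothetical.

```python
class SDRRegressor:
    """Sketch: decode a scalar from an SDR, analogous to the SDR classifier."""

    def __init__(self, n_bits):
        self.means = [0.0] * n_bits   # per-bit running mean of associated values
        self.counts = [0] * n_bits    # how often each bit has been active

    def learn(self, active_bits, value):
        # Update the incremental mean of `value` for every active bit.
        for b in active_bits:
            self.counts[b] += 1
            self.means[b] += (value - self.means[b]) / self.counts[b]

    def infer(self, active_bits):
        # Average the estimates of all active bits that have been seen before.
        seen = [b for b in active_bits if self.counts[b] > 0]
        if not seen:
            return 0.0
        return sum(self.means[b] for b in seen) / len(seen)
```

Training it on SDRs paired with scalars and calling `infer` on a later SDR would give a scalar prediction alongside the usual anomaly score, which is the capability the paragraph above argues for.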
Entering the vast playground of RNNs, LSTMs, reservoirs, or even RL could have been more tempting for ML researchers in general.
The pattern completion in L2/L3 that is the core of TBT, combined with sequence memory in L5/L6, has not really been explored in HTM land. I describe how the pattern part can be used for distributed pattern recognition here:
This is a logical extension of the TBT work but I have not seen any interest in following up on this.
Alas, the demands of my job and family have kept me from developing this into something useful as standalone code.
Yes it looks very…provoking. A reason ML engineers/scientists feel less tempted by HTM-like theories is simply because they bring in unfamiliar (hypotheses about) neural observations.
Statistical ML and deep learning are abstracting out “unreliable” fleshy stuff and focus mostly on the more comfortable mathematical/engineering perspectives.
No, and you know why if you genuinely know. Can you please read my post? I think it deserves research and innovation.
Also, in case you don’t know, this forum is probably the least likely place to get such questions answered.
I had an idea for how to scale up the HTM, but I buried it in a different thread. I will repost it here:
A key modification to the spatial pooler algorithm is to alter how it handles topology. The spatial pooler algorithm, as described by Numenta, has cells spread out over the input space and each of those cells has synapses to the nearby inputs. This allows the population of cells to cover the whole input space despite the fact that each cell has a small local receptive field.
I modified the spatial pooler to have a single set of cells which is repeated at every location in the input space. This is the convolution trick borrowed from convolutional neural networks. The primary advantage of convolving a single set of cells across the image (instead of having different cells at each location) is that they generate comparable outputs because they use the same synaptic weights: given identical image patches, the convolved cells will have the same response.
- It works with multi-layer circuits, including the distal basal and apical dendrites of a TM.
- It learns at each area and the learning updates are merged together for fast learning.
- It enables significant run-time optimization and drastically reduces the memory required to store the synapse data.
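The shared-weight property described above can be illustrated with a toy sketch. This is not the modified spatial pooler itself, just a minimal model of the convolution trick under my own assumptions: one shared set of cells, fixed random weights (learning omitted for brevity), a 1-D input, and winner-take-all at each position.

```python
import random

def conv_spatial_pooler(input_bits, weights, field=3):
    """Slide one shared set of cells over the input; return the winning
    cell index at each position (winner-take-all per location)."""
    winners = []
    for pos in range(len(input_bits) - field + 1):
        patch = input_bits[pos:pos + field]
        # Overlap of each shared cell with this patch (same weights everywhere).
        overlaps = [sum(w * x for w, x in zip(cell, patch)) for cell in weights]
        winners.append(max(range(len(weights)), key=lambda i: overlaps[i]))
    return winners

random.seed(0)
weights = [[random.random() for _ in range(3)] for _ in range(4)]  # 4 shared cells

inp = [1, 0, 1, 0, 1, 0, 1]   # the patch [1, 0, 1] occurs at positions 0, 2, 4
w = conv_spatial_pooler(inp, weights)
# Identical patches elicit the identical winning cell, regardless of position.
assert w[0] == w[2] == w[4]
```

Because the weights are shared, only one set of synapses is stored no matter how large the input is, which is where the memory and run-time savings in the list above come from.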
It would also be interesting to combine this with Capsule Network principles in HTM:
Among other benefits, capsnets address the “Picasso problem” in image recognition: images that have all the right parts but that are not in the correct spatial relationship (e.g., in a “face”, the positions of the mouth and one eye are switched). For image recognition, capsnets exploit the fact that while viewpoint changes have nonlinear effects at the pixel level, they have linear effects at the part/object level.
So this response won’t change with learning? That sounds like interneurons; I think some types of them don’t learn. Also, would that response be inhibitory, computing a lateral gradient?
AFAICR, HTM topology isn’t well explored or experimented with. If I understood this correctly, I think this may also improve the redundancy of what is learned in some areas, and therefore make it more resilient to catastrophic forgetting.
The main reason why I am interested in this and other distribution techniques is that I think generalization is more of an emergent behavior achieved by denominated learning units that prefer some set of inputs, rather than a concrete algorithm that the AI world has been chasing to mimic. I also think that lumping billions of parameters together (e.g., DL) that take inputs and configure themselves sequentially cannot consistently generalize inputs with inverse relationships - some inputs will be forgotten along the way. Whereas distributing simple learning units independently may cause many overfitted models to emerge, which can potentially be used as inputs to a consensus algorithm.