TL;DR:
- Is there an overall ideal way to understand information (one that yields the best predictions in the long run)?
- Is that something we can even know?
- If evolution were not constrained by the physical world, how would the ideal brain be arranged in abstract mathematical space (its algorithms and data structures)?
I’ve been following HTM for a long while; in fact, it was my first exposure to the world of machine learning and intelligence.
I’ve been wondering lately how much of the HTM system (the data structures and algorithms by which the brain processes data) approximates the ideal way to manage data, and how much of it is an expression of physical constraints.
In other words, I’m considering the idea that there are two kinds of constraints evolution is contending with:
- informational constraints (i.e., this cannot be computed until we know the answer to that), and
- biological or physical constraints (i.e., neurons can only reach so far, etc.).
If all the physical constraints were blown away, how much would the theory, or the HTM data structures and algorithms, change?
It seems to me that data is inherently temporal. (Not that all data is time-series data, but that no data can be seen outside of time, so time-series data is actually the norm in a world that is changing, which is any world.) The only hook evolution has to claw itself up the intelligence mountain with is prediction, and prediction is only available to something existing in time.
Data science and machine learning don’t seem to excel at time-series data, and I think part of the reason is that it’s treated as one subset of data types rather than as a requirement built into the very idea of “information” in the first place.
So, if intelligence is necessarily informed and formed in time, we begin to wonder: is there a universally best way to understand data throughout time? Now, perhaps “best” isn’t the right word.
They say in the world of finance, “nobody can beat the market consistently.” I don’t know if that’s true, but that’s kind of the idea I’m coming from. Is there an overall, most efficient, safest way to build intelligent systems?
Is the ideal intelligent system, in other words, an algorithm itself? Is there a meta-intelligence algorithm that defines exactly how to encode new memory (depending on what it is) given any series of historical observations?
Can we even know if the answer to this question is yes? It seems like it must be “yes”; it seems like it must be provable that there is, in fact, a peak to the mountain, a global optimum. I suppose the real question, the useful question, is: can we know it when we’ve found the peak, or can we only ever know that it’s a local one?
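To make the question concrete, here’s a toy sketch (my own invention, not HTM) of what a “meta-predictor” could look like: a Bayesian mixture that keeps a pool of simple candidate models and reweights each one by how well it has predicted the history so far. There is real theory in this direction (universal sequence prediction), where such a mixture is provably near-optimal relative to its model class, though the fully general version is incomputable, which bears directly on the “can we know we’ve found the peak” question. The class names and the model pool below are made up for illustration:

```python
# Toy "meta-predictor": a Bayesian mixture over a pool of simple models.
# Each model's weight tracks the probability it has assigned to the history;
# the mixture's prediction is the weight-averaged prediction.
from collections import defaultdict

class MarkovModel:
    """Order-k Markov predictor with add-one smoothing over a fixed alphabet."""
    def __init__(self, order, alphabet):
        self.order = order
        self.alphabet = alphabet
        self.counts = defaultdict(lambda: defaultdict(int))

    def _ctx(self, history):
        return tuple(history[-self.order:]) if self.order else ()

    def predict(self, history):
        """Return P(symbol | last `order` symbols) for every symbol."""
        ctx = self._ctx(history)
        total = sum(self.counts[ctx].values()) + len(self.alphabet)
        return {s: (self.counts[ctx][s] + 1) / total for s in self.alphabet}

    def update(self, history, symbol):
        self.counts[self._ctx(history)][symbol] += 1

class BayesMixture:
    """Weights each model by the probability it has assigned to the history."""
    def __init__(self, models):
        self.models = models
        self.weights = [1.0 / len(models)] * len(models)
        self.history = []

    def predict(self):
        mix = defaultdict(float)
        for w, m in zip(self.weights, self.models):
            for s, p in m.predict(self.history).items():
                mix[s] += w * p
        return dict(mix)

    def observe(self, symbol):
        # Bayes update: multiply each model's weight by the probability it
        # gave the symbol that actually occurred, then renormalize.
        for i, m in enumerate(self.models):
            self.weights[i] *= m.predict(self.history)[symbol]
            m.update(self.history, symbol)
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
        self.history.append(symbol)

mix = BayesMixture([MarkovModel(k, "AB") for k in (0, 1, 2)])
for s in "ABABABABAB":
    mix.observe(s)
print(mix.predict())  # P('A') is high: the order-1/2 models have won the weight
```

On the alternating stream above, the order-0 model’s weight collapses and the higher-order models take over; in this toy sense, the mixture “discovers” which way of remembering the past predicts best, without being told in advance.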
If there is a “best way” to understand data (to see it in its own light), what would it be? What kinds of predictions would it produce?
Jeff has talked about using the knowledge gained by reverse engineering the brain to extract its “information-processing principles.” We seem to have a vague outline of what some of those principles are, but perhaps they can be refined, or even wholly deduced from first principles, through thought experiments.
One principle in HTM is “semantic representation”: everything needs to be encoded in memory semantically. But why? Because the computational structure is the memory structure, different areas of the memory structure need to understand one another; they need to share a common, evolving language. Each region, at some level, acts as a translation element from all the regions it hears from to all the regions it talks to. That translation is what computation is.
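To ground “semantic representation” a little: in HTM, inputs are encoded as sparse distributed representations (SDRs), and semantic similarity shows up as overlap between active bits. Here is a minimal sketch of that idea; the encoder parameters are invented for illustration, and real HTM encoders are more careful about ranges and sparsity:

```python
# Minimal sketch of semantic encoding via SDR overlap (illustrative only).

def encode_scalar(value, min_val=0, max_val=100, size=400, active_bits=21):
    """Map a scalar to an SDR: a set of `active_bits` consecutive indices.
    Nearby values share bits, so overlap encodes semantic similarity."""
    span = size - active_bits
    start = round((value - min_val) / (max_val - min_val) * span)
    return set(range(start, start + active_bits))

def overlap(sdr_a, sdr_b):
    """Shared active bits: the SDR measure of semantic similarity."""
    return len(sdr_a & sdr_b)

a, b, c = encode_scalar(10), encode_scalar(12), encode_scalar(80)
print(overlap(a, b))  # high: 10 and 12 are semantically close
print(overlap(a, c))  # zero: 10 and 80 share no meaning in this encoding
```

The point is that “meaning” lives in which bits are shared: a downstream region can treat overlap as similarity without knowing anything about the upstream encoder, which is one way regions can speak a common language.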
It seems to me that we can deduce that intelligent systems are computational-memory structures. That is to say, they’re “networks”: they’re composed of nodes of memory that change one another.
In other words, the principles that inform optimized distributed computation are the same principles that underlie intelligent systems. But that’s just one example of what I’m talking about.
Anyway, maybe I got off track there, but the main question is: given a historical set of observations (where each observation is a series of symbols), is there an optimal way to predict future observations? If so, what principles determine what that optimal way is?
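One common way to make “optimal” precise (a standard yardstick, not specific to HTM) is cumulative log-loss: the total number of bits a predictor’s probabilities spend on the stream as it unfolds. Lower total log-loss means better prediction in a well-defined, comparable sense, which gives the “peak of the mountain” an exact meaning. A small sketch, with made-up predictors:

```python
# Scoring predictors by cumulative log-loss (code length in bits).
import math

def log_loss(predict, stream):
    """Total -log2 P(next symbol) over the stream, fed one symbol at a time."""
    history, bits = [], 0.0
    for s in stream:
        bits += -math.log2(predict(history)[s])
        history.append(s)
    return bits

# A "predictor" here is just a function: history -> {symbol: probability}.
def uniform(history):
    """Knows nothing: costs exactly 1 bit per symbol."""
    return {"A": 0.5, "B": 0.5}

def alternator(history, p=0.9):
    """Bets the next symbol differs from the last (suits alternating data)."""
    if not history:
        return {"A": 0.5, "B": 0.5}
    return {s: p if s != history[-1] else 1 - p for s in "AB"}

stream = "ABABABABABAB"
print(log_loss(uniform, stream))     # 12.0 bits: one bit per symbol
print(log_loss(alternator, stream))  # ~2.7 bits: it exploits the structure
```

By this yardstick, the principles I’m asking about are whatever lets a predictor assign high probability to what actually happens next, given only the history it has seen so far.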