Hi Boris, let me first of all thank you for taking the time to respond and add your insights and experience, which greatly enriches a forum like this. So please do not take any of my comments as criticism; as you mentioned, it is not easy to always communicate at the same level of abstraction, and some of the points here enter the terrain of philosophy of science.

Let me start with the point on Kurt Gödel's "Incompleteness Theorem". This proof reveals one important aspect of the reality of our universe: some "truths" about the universe cannot be mathematically proven, which also means they cannot be mathematically derived from any input data, not even with perfect information about the entire universe. This is a very profound discovery that has stood up to all mathematical scrutiny, almost as profound as the Big Bang is to cosmology or quantum mechanics is to physics. The implication is that some "real" things (like some real processes that take place) are not learnable, no matter how clever you are.

This incompleteness is also strongly supported by Stephen Wolfram's computational analysis, specifically his research on "computational irreducibility", which is, for specific selected functions, essentially a computational analogue of Gödel's theorem. Wolfram's broader "Principle of Computational Equivalence" then concludes that our universe is full of very simple functions (algorithms) which, very surprisingly for mathematicians, very quickly reach maximum complexity in their output (meaning unpredictability by any mathematics). So both Gödel and Wolfram reveal that we exist in a universe that is much less mathematically computable, even in theory, than anyone had imagined in the early part of the 20th century, or even at its end.
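To make the "very simple functions that quickly reach maximum complexity" point concrete, here is a small sketch of my own (not from Wolfram's code) of his classic example, the Rule 30 cellular automaton: an update rule expressible in one line whose center column shows no known pattern or shortcut formula.

```python
# Rule 30 cellular automaton: a minimal illustration of how a trivially
# simple rule produces output with no apparent closed-form prediction.

def rule30_step(cells):
    """Apply one Rule 30 step to a row of 0/1 cells (zero-padded boundaries)."""
    n = len(cells)
    new = [0] * n
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        center = cells[i]
        right = cells[i + 1] if i < n - 1 else 0
        # Rule 30: new cell = left XOR (center OR right)
        new[i] = left ^ (center | right)
    return new

def run_rule30(width=31, steps=15):
    """Run Rule 30 from a single black cell; return all rows."""
    row = [0] * width
    row[width // 2] = 1  # single black cell in the middle
    history = [row]
    for _ in range(steps):
        row = rule30_step(row)
        history.append(row)
    return history

if __name__ == "__main__":
    for row in run_rule30():
        print("".join("#" if c else "." for c in row))
```

Running this prints the familiar chaotic triangle; as far as anyone knows, there is no way to compute row one million without computing the 999,999 rows before it, which is exactly computational irreducibility.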
Therefore, this has implications for our definition of "intelligence" and how we can measure it. "Predictive power" is still very important, and it remains possible in many areas of reality (Stephen Wolfram calls these pockets of computational reducibility, and they can be large). But the vast majority of computational space seems to fall clearly in the non-computable arena, meaning it is not compressible into fewer steps: many things can only be learned by performing every single step and memorizing or recording each one. Computational irreducibility means that you can only know the three-trillionth digit of Pi by calculating every single digit one by one, without skipping any. And the number of patterns in computational space that are simple yet unpredictable is amazingly large (Pi is no exception; it is a very common phenomenon).

So we have to let go of the idea that absolute intelligence can be measured only by predictive power. We have to start thinking more in the direction that any entity we consider intelligent is trying to predict outcomes for a purpose, like survival. But in a universe in which most of reality is not predictable, we then need to add functions to make approximations and cope with uncertainty. A solid definition of intelligence must contain some metric of the ability to cope with uncertainty in a mathematically unpredictable universe. This "coping with unavoidable uncertainty" is what I call the purpose: survival, finding a path to a place, or getting to Mars alive. That is how I arrived yesterday (thanks to your comments) at my conclusion that "in the absolute absence of any given purpose (i.e. goals), intelligence does not have any meaning (any significance)". The scaling of intelligence can only be measured relative to a purpose, i.e. how successfully the entity copes with uncertainty to fulfill that purpose.
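As a small illustration of "every digit one by one" (my own sketch, not part of the original argument): even the efficient spigot methods for Pi stream its decimal digits strictly in order. Here is a Python version of Gibbons' unbounded spigot algorithm, which can only reach digit n by producing digits 1 through n-1 first.

```python
# Gibbons' unbounded spigot algorithm: streams the decimal digits of Pi
# in order. There is no "skip ahead" - reaching a given decimal digit
# requires emitting every digit before it.

def pi_digits(n_digits):
    """Yield the first n_digits decimal digits of Pi (3, 1, 4, 1, 5, ...)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    produced = 0
    while produced < n_digits:
        if 4 * q + r - t < n * t:
            # The next digit is now certain; emit it and rescale the state.
            yield n
            produced += 1
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet; consume one more term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

if __name__ == "__main__":
    print("".join(str(d) for d in pi_digits(20)))
```

(For completeness, there are formulas like BBP that jump to distant digits of Pi in base 16, so Pi itself is only a loose analogy; the stronger examples of irreducibility are processes with no such shortcut at all.)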
Please bear in mind that this is a philosophical implication in the context of the absolute universe. Our pocket of reality within the universe (also our computational universe) still has plenty of unexploited pockets of predictability, so there is still a lot to be achieved, even if reality has set many limitations on our ideal space of computability. The field of quantum mechanics is also full of such examples of limitations, which is why it resorts to so much statistics to make probabilistic predictions: it can only predict probabilities, not specific outcomes. But we have learned to cope with that for our purposes.
Indeed, this is very heavy stuff to digest, so do not feel compelled to comment, though I would love any further insights from your experience and research context. From an AI engineering point of view, this reality may be a nuisance, but it gives us no reason to stop searching for better paths to higher intelligence. And as Jeff states, we have our own human intelligence as a clear example of what this universe has made possible. If we define "human intelligence" as our goal and purpose, then we have a set of metrics we could derive in order to measure our progress toward human-level AGI.
Regarding:
It sounds like you have a lot of experience with, and a deep understanding of, the types of frameworks that can be applied at different levels. I follow your logic regarding the different types of conceivable architectures (with lower-level clusters vs. lower-level elements). I suppose one can also say, to some degree, that the concept of elements can emerge from clusters. I personally am quite certain that hierarchical structures play a very important role in our brains. One open question is whether hierarchical processes coexist in multiple domains, perhaps even overlapping ones. My intuition, from a lot of reading on this topic in the context of neuroscience, is that evolution has produced multiple levels of hierarchical organization in our brain and CNS. Within the Numenta and HTM communities we already have several examples of this in the models: the HTM and TBT levels basically nest HTM within TBT, and the TBT reference frames allow for very many overlapping formations. The question I would like to explore is how set theory is built into the TBT paradigm and whether it may allow for many levels of logical nesting. Do you have any insights in this area?