No, an AGI Uber car should feel hurt when it receives bad reviews!
May I ask you these questions?
- If AGI is a function, what is its limit?
- What are the components required to achieve AGI?
- How do you verify the AGI’ness of an AGI?
My point is that we cannot know for certain whether AGI has anything to do with the human mind/brain, because AGI doesn’t exist yet. It is still an aspiration, nobody knows how to achieve it deterministically, and for the record there are zero instances of it. Therefore we cannot make an accurate comparison with any ML model that exists today (except, perhaps, by using the human mind?).
When AGI is achieved, it may or may not turn out to have something to do with the brain, but IMO the former is more likely. This is why I’m still on this forum: I believe the main components required to achieve AGI have something to do with the brain, while the mind may just be an emergent feature. These components may even need to be integrated together to work properly. Hopefully not, because otherwise Numenta’s objective of choosing some of these components and applying them in real life would have to change considerably.
The practical limits on AGI are the hardware: speed, capacity, power usage, etc. The functional limits are set by what we are smart enough to create, before there is an AGI that can create its successor. My guess is that will be something like a mammalian non-primate brain.
The main components currently missing are the algorithms. We just don’t know how to do it. Hardware is likely to be a problem too.
As I’ve said many times, the test is the ability to investigate and construct models of parts of reality, project in time, choose strategies, and learn and update based on results. Like a rat or a dog or a raven.
As you hinted, we have absolutely no idea what will be needed, so it is very premature to predict what the hardware requirements might be.
It is highly likely that the increased reliability and stability of electronics could result in vast reductions in requirements vs wetware.
I wish. The human brain runs at petaflops (or better) speed on 20W power consumption. Google PaLM uses ‘exaflop-scale Cloud TPU v4 Pods’. I can only guess what a power guzzler that might be.
PaLM was a test of their Pathways training system. It has 540B parameters, but it’s not optimal at all. Refer to the Chinchilla scaling laws linked above: Chinchilla, which has 70B parameters, reaches nearly the same perplexity as PaLM. And because energy, cost and compute scale linearly, that’s nearly a 7.7x reduction in everything.
Older scaling laws extrapolated from a single factor: parameters. Chinchilla used two, parameters and data. You’d need more factors to map out scaling laws properly, and even that ignores any advances in the field that make those models more compute efficient.
I definitely agree these LLMs are pretty power hungry - but if we’re able to achieve near-human positive transfer among large models (GATO-style) then the prospective AGI parameter count will dip down to a few trillion - which is quite possible even right now, if governments actually collaborate. Or at least every big lab in the US.
Lastly, the carbon footprint of training these LLMs is quite small (at least with TPUs, which most LLMs are trained on) - there’s even a blog post about it which I’ll link if I find it.
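For what it's worth, the 7.7x figure is just the ratio of the two parameter counts quoted in the post. A minimal sketch of that arithmetic, assuming (as the post does) that cost scales linearly with parameter count and ignoring any difference in the amount of training data:

```python
# Parameter counts quoted in the post, in billions.
palm_params_b = 540
chinchilla_params_b = 70

# Assumption taken from the post: energy, cost and compute scale
# linearly with parameter count, so the saving is a simple ratio.
reduction = palm_params_b / chinchilla_params_b

print(f"{reduction:.1f}x")  # 7.7x
```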
I don’t see any of this as a path to matching a human brain on an energy budget of 20W, do you? That’s before we even talk about ravens on a fraction of that.
Would you be satisfied if it was only a kilowatt?
I don’t have a figure in mind, but it could have a big effect on feasible applications. A typical electric car has a 50-80 kWh battery, and if it runs for 5-8 hours then a 1 kW load uses about 10% of it, which seems tolerable. For a truck it might be a non-event, but for a drone it looks like a non-starter.
I believe reverse engineering the brain is a good path for this, because genetic engineering is advancing and we’ll eventually reach the point of being able to build in-vitro wet supercomputers that run on sugar.
You can easily convert a dense model to a sparse counterpart - that’s an old research direction, and Numenta’s latest snazzy development presents no breakthrough.
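The 10% figure can be checked with a few lines of arithmetic. This is only a sketch under the numbers in the thread: a constant 1 kW load (from the question above), the 20 W brain figure quoted earlier, and a 50-80 kWh battery drained over 5-8 hours. The function name is made up for illustration.

```python
def battery_fraction(load_kw, hours, battery_kwh):
    """Fraction of a battery consumed by a constant load over a given time."""
    return load_kw * hours / battery_kwh

# 1 kW AGI at both ends of the quoted range.
low = battery_fraction(1.0, 5, 50)    # small battery, short trip
high = battery_fraction(1.0, 8, 80)   # large battery, long trip

# For comparison: a 20 W brain-like budget on the small battery.
brain = battery_fraction(0.02, 5, 50)

print(f"1 kW: {low:.0%} to {high:.0%}, 20 W: {brain:.1%}")
```

Both ends of the range come out at 10% for 1 kW, while a 20 W budget would take about 0.2%.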
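One common way to do that conversion is magnitude pruning: drop the weights with the smallest absolute values. The sketch below is only an illustration of that idea on a toy weight matrix; it is not Numenta's method, the function name is hypothetical, and real pipelines usually fine-tune after pruning to recover accuracy.

```python
def magnitude_prune(weights, sparsity):
    """Zero out (roughly) the smallest-magnitude fraction `sparsity` of weights.

    `weights` is a list of rows. Ties at the threshold are all pruned,
    so the result can be slightly sparser than requested.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * sparsity)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]

# Toy dense layer (values are made up for illustration).
dense = [[0.9, -0.05, 0.3],
         [0.01, -0.7, 0.2]]

sparse = magnitude_prune(dense, sparsity=0.5)
print(sparse)  # [[0.9, 0.0, 0.3], [0.0, -0.7, 0.0]]
```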
I don’t see ravens solving complex tasks such as language and discrete logical reasoning?
It doesn’t matter if it’s taking 100 kW to run AGI - right now, LMs which are fractions of a fraction of our brain can already outperform humans in certain tasks - and the trend improves with scale. If you’re spending more energy, but getting a system with superhuman memory and reasoning capabilities, I see it as a win.
Right now we don’t have any kind of AGI. We can build special purpose AI for an increasing range of applications, and with the expenditure of enormous amounts of computer power (and energy) we achieve some impressive results. The only scale (that I know of) is more of the same: more computer power, more input data.
My point is that AGI is different. Animals are able to learn tasks that it seems AI based on ANN or similar cannot, and execute them on a very low energy budget. The HTM approach of tracking sequences may be part of the story, but my interest is in the concept of building mental models of reality. Both we and other animals have that ability, and I don’t see it in my reading.
We need something for self-encoding fuzzy symbols and self-forming state machines which optimize their selection of inputs and actions while exploring the world. In the end, sequences are sequences of what? Fuzzy mushy approximate symbols with resulting output (through the various nervous and motor systems). Some are hardcoded via evolution, and others are learned, quickly, through experience and collective exposure. There’s certainly no gradient descent or backward passes… the input and resulting firing patterns ARE the symbols.
Chasing DNNs until they hopefully somehow become intelligent (whatever that means) is a losing game which is increasingly locking out all but the largest players. For that reason, while maybe incomplete, Numenta’s work is at least headed in the right direction… but they do need help and innovation to push things forward, or else they’re liable to get stuck in the DNN mud. So I’m backing up and looking at how sensory neurons encode their information while the brain continuously fills in the rest through a constant state of fading backfill.
My mental model is that columns are processing units executing algorithms on state captured as SDRs. The basic algorithms are evolved and inherited, and there are a lot of them (thousands?), possibly encoded in (junk) DNA. Childhood is where you connect up the GP algorithms with the world you got born into. Is that also encoded in DNA?
A dog is not born knowing about sheep or balls or walks; those are collections of algorithms applied to specific objects in pursuit of goals. Training works everywhere (muscle memory).
Numenta has identified a plausible data structure and a plausible algorithm for tracking sequences. So what should be next?
Do you know of anything particularly promising? I think sparsity, broadly interpreted, is a big deal. That includes segmented learning, where weight updating happens only in areas / sub-networks / columns that match by vertical context: Numenta’s prediction. This segmentation: tentative classification + possible novel class / anomaly formation, allows for continuous learning. That’s sparse learning, a higher-order sparsity than sparse connections.
There are currently very few tasks where DL doesn’t do well. I’m going to need a more concrete example than just a hand-wave “there exists…”
The scaling hypothesis in DL focuses on obtaining AGI - human performance and above. A basic RL net can outperform almost all animals.
That scaling hypothesis is presented by Numenta too, in a more neuroscientific manner - namely, that there exists a fundamental architecture (Mountcastle et al.? I forget) in our neocortex that is replicated in the form of cortical columns. Animals have fewer of those than humans, which, as the theory goes, is the reason for the intellectual disparity.
I don’t know the exact specifics of that theory, but it’s in the TBT book. Anyway, DL does something simpler, except that the fundamental architecture we scale (the artificial neuron and its variants) is mathematically guaranteed, under the universal approximation theorems, to approximate any continuous function. So the hope is that networks implicitly learn to find the most efficient circuits to carry out the task - which is where scale steps in. As we scale, these models become more general, sample efficient and powerful, and mapping those scaling laws allows DL researchers to estimate at what scale AGI will be possible.
There are nearly a thousand papers on it. I’m not that involved in that area, so Google Scholar and arXiv would be quite helpful.
Look up gated Mixture-of-Experts, Switch Transformers (the seminal paper) and techniques like T-Few and ELMForest.
There is some really cutting-edge alignment work which posits that large language models are implicitly trying to become more sparse - somehow related to the polysemanticity of neurons. It’s well beyond the scope of this forum, and quite experimental/theoretical - but if anyone’s interested, the “Anthropic” papers provide a good start (they cite a lot and explain very little, so one has to do a lot of backtracking and reading).
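For anyone who hasn't met the idea, here is a rough sketch of top-1 expert routing, the kind of gating used in Switch-style mixture-of-experts layers: a router scores each expert, only the best one runs, and its output is scaled by the gate probability. Everything in it (the toy experts, the router weights, the function names) is made up for illustration and is not taken from any of those papers' code.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top1_route(token, router_weights, experts):
    """Send one token to the single highest-scoring expert."""
    # Router logits: one dot product per expert.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    chosen = max(range(len(experts)), key=lambda i: probs[i])
    # Only the chosen expert runs; its output is scaled by the gate probability.
    output = [probs[chosen] * y for y in experts[chosen](token)]
    return chosen, output

# Hypothetical toy experts standing in for feed-forward blocks.
experts = [
    lambda t: [2.0 * x for x in t],
    lambda t: [x + 1.0 for x in t],
]
router_weights = [[1.0, 0.0], [0.0, 1.0]]  # illustrative, not learned

chosen, output = top1_route([0.5, 3.0], router_weights, experts)
print(chosen, output)
```

The sparsity here is in the computation: the parameter count grows with the number of experts, but each token only pays for one of them.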
Interesting. To be a rat is to be a rat; there’s no such thing as outperforming it, IMO. But intellectually exceeding a rat, judged by our understanding of what it is to be an intelligent human, makes sense. If an AI is intellectually outperforming a rat, then that AI is not intelligent enough to be a rat; it must be trying to be something else, like another animal. The ability to stay within a rat’s intellectual boundaries shows intelligence. Sorry to digress.
There is persistent confusion around the distinction between “narrow” AI versus “general” AI.
There is also an ongoing debate over whether research into narrow AIs is useful for creating general AIs.
On an unrelated note: narrow AIs have gotten unexpectedly good in these past few years. I’m really impressed with the new image super-resolution up-scaling technology. First NVIDIA used the technology to make real-time ray tracing feasible, and now they’re using it to extrapolate entirely new image frames, which allows them to increase the FPS to arbitrary rates.
I’ve yet to see an AI that is able to invent and use tools.
I just had a close encounter with a raven (Corvus coronoides). I was unloading grocery shopping from the car, and taking it inside. I came back to find the raven had got inside the car, opened a bag, found some meat wrapped in plastic and was chewing off a piece. You’re not going to tell me you have an AI that does any of that, are you?
The raven did not evolve to deal with cars, people and plastic but it did evolve enough general intelligence to survive in a changing and complex environment. Your AI doesn’t do that either, does it?
To set a more realistic task, imagine a motor car equipped with your best AI and a credit card. Can it survive (shutting down is not an option)?
I think you have in mind a very narrow set of AI tasks chosen to suit the state of the art, a long way short of what the real world demands of every animal.