If identifying the innate properties of universal forms of intelligence is a prerequisite to attaining it, then this talk is one we should pay attention to. John Vervaeke has a phenomenal ability to explain the principal criteria we should be aware of in this field. One thing becomes very clear: prediction alone is a very far cry from the AGI goal. But John nevertheless agrees that the goal is attainable.
In my opinion, Vervaeke points out very well some of the properties we should expect to find in sentient AGI. He provides a very enriching perspective on and interpretation of current developments surrounding LLMs, as well as several other AI experiments, offering both a well-founded scientific analysis and a philosophical analysis of these developments.
After around 40 years of reading every seminal publication I can get my hands on surrounding the topics of intelligence, consciousness, awareness and neuroscience, as well as computational modelling of these emergent phenomena, and 11 years in this forum, I find it increasingly difficult to listen to some newbies who quickly jump onto the over-hyped AI/ML train and purport to be experts in the subject. I have now discovered a new class of "uncanny valley".
This is the new "uncanny valley" felt by more experienced AI thinkers (including neuroscientists and cognitive scientists) who have invested plenty of time pondering this complex subject, when they listen to superficial treatments of it by presenters with very little experience. I seem to be encountering these "uncanny moments" with increasing frequency at work and around friends. This leads me to conclude that our society is really not ready to understand these developments. That worries me. Many people are inclined to overlook many aspects of this emergent technology and its implications, which leads to exaggerated fears and unfounded doomsday predictions. (I would never criticize a well-founded analysis with negative conclusions.) However, we are seeing the classic fear-driven, uninformed reactions of a society that has not prepared itself to comprehend the capabilities, limitations and potential of these technologies.
"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." - Max Planck
"The greatest obstacle to discovery is not ignorance; it is the illusion of knowledge." - Daniel J. Boorstin
If you think everyone around you is misled, then perhaps you should stop to consider that you might be the ignorant one.
Studying a subject for longer doesn't imply more wisdom, only more knowledge. You can memorize every mathematical theorem out there, but it's all useless if you can't appreciate the subtle, sublime intricacies of logic.
I would recommend taking a more rationalist approach to determining your beliefs and adjusting them to new evidence, rather than going on a tangent about "kids these days…"
Could you argue that the vast majority of Medium shitposts and frenzied, crazy, mind-blowing "AI insights" arise from pure hype, speculation and straight-up misinformation? Sure.
But I'm more worried by an attitude of blanket dismissal than by AI research becoming the next crypto craze. Research is still bound to academic institutions, so I feel we're safe, for now at least: academics should be relatively unaffected as long as their work continues to bear fruit. The hype could even bring fresh eyes and funding to the field.
I think this knee-jerk reaction is warranted to a degree, but I would remind you that Western media operates on hype cycles, and AI is just the latest one. Eventually, they'll find something else to talk about and gather more eyeballs. But I doubt AI as a topic of discussion will fade any time soon…
There are plenty of people who feel the same. In the media hype you mention, half of it is "wow, AI is here", but for such strong hype you need counterarguments, so the other half is "wow, AI sucks". Some of the latter is just for fun, because AI can be stupid in unexpectedly funny ways. Some of it is about the actual technical issues.
I think the dismissal is aimed more at the idea of AI being intelligent. If you think of it as general intelligence, some of the things it can do are very scary from the AI safety side of things. On the make-a-functioning-AI side of things, some of the things it can do are very exciting.
I haven't watched that video. I refuse to, and YouTube constantly recommends it. I think videos like that make people afraid of the wrong thing. Paperclip maximizers won't happen in the near future. (A paperclip maximizer is the standard example of the challenge of aligning AI with our goals and values: a godlike AI whose only goal is to make paperclips turns everything into paperclips.) Language models are, loosely speaking, about mimicry (not word-for-word), and they're the most successful AI right now.
Some AIs are far better than humans at the game of Go. But AI researchers found weaknesses, and a novice Go player was able to beat a superhuman Go AI. In Go, you need to surround your opponent's stones. If you use a double-sandwich technique, where the AI surrounds your stones but you surround those in turn, the AI doesn't understand what is going wrong.
So the best Go players are beaten by Go AIs, which are beaten by novice Go players. The conclusion the video draws is that we really don't understand how AI works (because it's a black box; I'm sure we understand the principles of how AI learns, just not what it learns). It's alien to us, and deeply integrating it into society could be dangerous: failures wouldn't have a known source.
One thing he talks about is how large language models still have trouble with truthfulness. They say things a person might say, but they don't understand concepts, so there's a fundamental problem with truthfulness that larger models won't solve. That makes them unsuited for advancing science and technology, but they're still really effective for misinformation campaigns.
Of course they still have positive uses, just not the goal of advancing science and technology. In his opinion, and mine too, that's the most important thing AI could do.
What I'm saying is that it's all the same hype cycle, whether it's crypto or AI; it doesn't make a difference. That's just how Western media operates, because it effectively maximizes eyeballs.
It's true to an extent, and I side with you here, but you also have to recognize that Bostrom's (and Yudkowsky's) entire position is that you don't necessarily need a super-smart misaligned AI to wreak havoc, just a sufficiently powerful goal-directed optimizer.
LMs really excited the community because, despite their failings, they're excellent optimizers. Think about it from a meta perspective for a second: you're asking a model to consume numbers and predict what comes next, with absolutely no grounding, yet it's able to understand, analyze and perform complicated tasks through that proxy.
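To make the "consume numbers and predict what comes next" framing concrete, here is a minimal, hypothetical sketch in PyTorch (a toy model, not any real transformer) of the only objective such a model is actually trained on:

```python
# Minimal sketch (not any specific model): the entire training signal an LM gets
# is "given the token IDs so far, predict the next ID", scored by cross-entropy.
import torch
import torch.nn as nn

vocab_size, dim = 1000, 64
toy_lm = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))

tokens = torch.randint(0, vocab_size, (1, 16))   # a sequence of integer IDs
logits = toy_lm(tokens[:, :-1])                  # predict each next token from the prefix
loss = nn.functional.cross_entropy(              # next-token prediction loss
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1)
)
loss.backward()                                  # that gradient is all the model ever gets
```

Everything being debated in this thread (concepts, reasoning, deception) would have to emerge as a side effect of driving that single loss down at scale.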
Not every model or system trained on that task would have those capabilities. It's only because NNs are such powerful optimizers that they make everything else look like a toy. If you watch a lot of YouTube, you'll notice how effective the recommendation algorithm is: how it serves you just the right content to keep you hanging on. Its effectiveness is insane; you could spend hours every day and not get bored.
So, back to paperclips: if LMs really are such good optimizers, why not have one consume the numbers and, instead of predicting them, tweak them so that some reward function somewhere measuring paperclips is maximized? How well do you think that would work? Given enough iterations, and maybe bootstrapped from text, don't you think the LM would be fundamentally deceptive?
Yep, there are some failings of the system, but I would remind you that a lot of them are adversarially designed. Another optimizer NN had the sole task of breaking the Go model; the flying dagger position, for instance, was inspired by adversarial attacks on KataGo.
If you had access to the human brain, it would be even easier to trick: as papers have found, NNs are already more robust to visual adversarial attacks than humans. A few minutes of Photoshop is enough for that. So comparing apples to oranges here could be very misleading.
Gary Marcus has historically had some, uh… very "strong" opinions on deep learning, and is considered a bit of a joke, even a charlatan, to an extent.
I don't mind a bit of debate, but Marcus' claims are so absurd that almost everyone dismisses his outlandish (and somewhat polarizing) statements.
I think some of them have merit, but the problem is that he doesn't really present the arguments and scientific evidence to back them up. Tweets are hardly scientifically rigorous.
Take truthfulness as an example: the recent TruthfulQA benchmark turned out to be adversarially designed against GPT-3 and other LLMs, and a model pretrained on 4chan outperformed everything else by a huge margin.
Testing such abstract, subjective ideas will always lead to leaky, bad benchmarks, where a bit of prompting can easily outperform SOTA by miles, simply because the benchmarks rely on deceiving the model more than anything else.
Time and time again, it has been shown that LMs do learn abstract concepts. If you're talking about hallucinations rather than truthfulness, that's an alignment problem: the model has literally zero incentive to be grounded and not hallucinate. That is why RLHF was such a big deal in alignment. It helped quite a bit, not perfectly by any means, but it goes to show that larger models and alignment, maybe even multimodal models, are needed for actual true grounding.
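For readers unfamiliar with why RLHF changes the incentive, here is a hedged sketch of the pairwise preference loss typically used to train the reward model (the function and names are illustrative, not from any particular codebase); the LM is then fine-tuned against that learned reward, which is where "say something a human would endorse" first enters the objective at all:

```python
# Sketch of a Bradley-Terry style preference loss for an RLHF reward model.
# `reward_chosen` / `reward_rejected` are scalar scores a (hypothetical) reward
# model assigned to the human-preferred and human-rejected completions.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected): pushes the reward model to rank
    # human-preferred completions above rejected ones.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

loss = preference_loss(torch.tensor([1.3]), torch.tensor([0.2]))
```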
Right now, yes, they can advance science to some degree. Specialized models like AlphaFold are hailed as breakthroughs, and they work: they predicted COVID-19's spike protein structure more than half a year before conventional labs verified it.
But LMs alone can't do that. So what's the solution? Recursive self-improvement. If we do reach the AGI stage, the model would need to self-improve and be aligned towards the goal of solving scientific problems. Only then would you advance science. And because it's an LM, abstract goals like "exploring new concepts" would work wonderfully well here.
(If it were embodied, I suppose that would help further. But I see no reason why it couldn't solve problems and propose experiments for verification to be carried out by someone IRL.)
Basically, in the sense that it would act in ways we didn't expect or want. I don't think it would understand what people are thinking, though.
How would you disprove that? If a language model makes a mistake, humans make mistakes too, and it might just not be powerful enough.
With language models, it also seems impossible to prove, because it might just be recognizing patterns in words. Perhaps it's a matter of semantics. Would you consider a sufficiently complex / general pattern in words to be an abstract concept? What if that pattern were associated with, e.g., visual information, so it's like when people see a bird and think "bird"?
In many cases, that's as good as abstract concepts, but it seems to me like it works fundamentally differently.
The types of mistakes language models make seem quite different from the types of mistakes humans make. That seems to be because of how they work, not their inputs / experiences, scale, etc.
Isn't that what some research papers are?
If it's a very good optimizer, and understanding (anticipating) what people think and how they behave towards it is crucial to the goal, then there is no reason why it wouldn't optimize towards specific behaviors precisely in order to understand and exploit people better.
Mechanistic interpretability provides the bulk of the answers here. It has been demonstrated that even next-word predictors can learn the complex game state of Othello (a board game), despite being supplied with nothing but moves played by Othello players, encoded as numbers.
Algorithmically too, they learn novel algorithms which are as yet undiscovered and confusing, such as a Neural GPU learning a linear algorithm for binary multiplication (for which no such algorithm is otherwise known), or vanilla NNs learning to use the discrete Fourier transform and trig identities to do arithmetic, which is quite insane.
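For illustration, this is roughly what the probing methodology behind results like Othello-GPT looks like: train a small linear classifier to read a board-state property out of a frozen LM's hidden activations. The names, shapes and data below are hypothetical stand-ins, not the original code.

```python
# Sketch of a linear probe on cached LM activations (hypothetical shapes/data).
import torch
import torch.nn as nn

hidden_dim, n_board_squares, n_states = 512, 64, 3   # e.g. empty / black / white
probe = nn.Linear(hidden_dim, n_board_squares * n_states)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)

# Stand-ins for cached hidden states and ground-truth board labels per position
activations = torch.randn(256, hidden_dim)
board_labels = torch.randint(0, n_states, (256, n_board_squares))

for _ in range(100):
    logits = probe(activations).view(-1, n_board_squares, n_states)
    loss = nn.functional.cross_entropy(logits.permute(0, 2, 1), board_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# High probe accuracy on held-out positions is the evidence that the board
# state is linearly decodable from the model's internal representations.
```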
For us, there is no way to reliably prove LMs learn abstract concepts except by testing them on abstract tasks. Really, the moment you have a way to judge humans on this, you have a way to disambiguate LMs.
I feel that the evidence that they can learn complex structured representations and algorithms outweighs the evidence against LMs learning abstract concepts. So for now, my beliefs are updated in that direction.
Not really; it's surprising that it took people so long to figure out why LMs incorrectly answer crucial tasks. The common assumption was that they lack common sense, which is true to an extent (it comes with scale), but apparently the real problem was that LLMs made assumptions that humans did not expect, due to the ambiguity of language.
You can often ask it for its assumptions and clarify what it thinks. Eventually, when it does make errors, they tend to be reasoning errors in a long CoT, or arithmetic ones. Which are very human errors too, if you think about it.
Perhaps we are on different pages and thinking of different kinds of mistakes, but the ones that immediately pop into my mind are very human. That's not a gold standard, but at least it's a nice reference point.
Dear critic of my statement, I am not resentful of the strong criticisms. They just make me aware of how easy it is to be interpreted in the wrong context. Let me explain (sorry it took me a year to get back). Do I think everyone around me is misled? Not at all, and much less in forums like this one, which are frequented by people who may have different views and experiences, but are far more aware of the pros, cons, upsides and downsides of AI technology. My statement was not aimed at any critics of AI who have an informed opinion, even if their judgement differs strongly from mine. I enjoy enriching exchanges, and if we all had identical experiences and judgements, the exchange of ideas would not be enriching at all. So I understand that the impression my statement left was not the one I intended. I thank you for the time you invested, which should help me be more careful in the future, and I apologize for the wrong impression. I kind of have to laugh at the situation, because if we humans (just one form of natural intelligence) can so easily misunderstand our own language (because so much of our messages is unspoken and merely implied), we are going to have a lot of fun ahead with AIs in the near future.
My intention was right along the lines of Casey, who valiantly came to my defense. I was primarily referring to the snowballing trend of "end-of-the-world" prophets: anything that sounds like AI is involved, or any research for which they have never invested any time or interest, simply gets demonized. (I am not including the originators of some of those arguments, who themselves have deep insights.) I just happen to know some people who chose to blindly adopt these negative views without anything that comes close to a balanced overview. And I think this kind of herd behavior is dangerous for any productive public debate.
I do have a test that I came up with (since I am multilingual: I use German most of the time, having lived and worked in Germany for 32 years now, while English and Spanish are my native languages). I started a chat with ChatGPT in English, and ChatGPT generated a good, elaborate answer regarding quantum mechanics. Then I asked ChatGPT in German to explain how it came up with certain conclusions. ChatGPT seamlessly answered the question, with its typical breakdown of sources and reasoning steps. I continued to dig deeper in Spanish, and ChatGPT built further explanations, in very precise Spanish, regarding the concepts under discussion. I felt a chill go down my spine, because there could not be a shadow of a doubt that semantic concepts have to be at play. Keep in mind that these three languages (English, German and Spanish) have not only very different grammars and word orders for the same statements; they also have very different word choices for the same concepts. Not only word choices, but even idiomatic expressions that cannot be literally translated from one natural language into another. ChatGPT chose the correct expressions, word orders, grammar and syntax, and every semantic nuance was respected. ChatGPT also referred back to what it had said in a different language, so it was the concept it was referring to. It was a very, very uncanny moment for me.
How in the world could ChatGPT be just a next-word predictor and still predict the right words, from a semantic point of view, in a different language? It is my understanding that LLMs do not build a model of the world. They are a brute-force approach on the one hand, but they also use attention in their transformer architectures, and attention forces them to recognize relevance. This cycle (brute-force quantities => attention => relevance) seems to lead to the construction of some sort of semantic meta-layer that exists beyond the individual natural languages. I would like to see this investigated more closely.
After all, when we think about it, GPT architectures contain such an enormous number of hidden layers among their billions and billions of parameters that it probably cannot easily be understood what each of these layers actually represents. Perhaps some combination of several of these layers does build some sort of semantic model, one that may still be very different from our human models of reality; but since it is our language that is the source of this GPT model, our models are partially contained within it. This is just my speculative thought about what may be taking place.
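One cheap way to poke at this "shared semantic layer" speculation, sketched below under the assumption that a multilingual sentence encoder is an acceptable proxy: embed the same statement in English, German and Spanish and check whether the translations land closer to each other than to an unrelated sentence. The specific model name is just one example; any multilingual encoder would do.

```python
# Hedged sketch: do translations of one statement cluster in embedding space?
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
sentences = [
    "The electron's position is described by a probability amplitude.",
    "Die Position des Elektrons wird durch eine Wahrscheinlichkeitsamplitude beschrieben.",
    "La posición del electrón se describe mediante una amplitud de probabilidad.",
    "My neighbour's cat sleeps on the balcony every afternoon.",  # unrelated control
]
embeddings = model.encode(sentences)
print(util.cos_sim(embeddings, embeddings))
# If a language-independent representation is at play, the three translations
# should score high pairwise similarity while the control scores much lower.
```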
Many of us in this forum follow the general idea postulated by Jeff Hawkins that the biological evolution of our intelligence provides the most likely path to success in the quest for AGI. Natural intelligence required millions of years to evolve. But I am quite sure that we have underestimated how many different types of "alien intelligences" are feasible through the use of large-scale, massively parallelized computing. I am starting to recognize more clearly that this world is already populated by multiple types of natural intelligence, and it should therefore be no wonder that we will develop multiple types of AGI over time. We have been trapped in the fallacy that there are multiple types of narrow AI but only one singular path towards AGI. AGI may not be a single target, as if all things intelligent must gravitate towards and evolve into a single AGI. We humans have once more fallen into the anthropocentric trap of thinking we are the center of the universe. We are bright enough to understand that our intelligence can be surpassed, but we still hold on to the intuition that all things intelligent must trace our own path.
This may sound like an absolute rejection of the Hawkins premises regarding the path towards machine intelligence. I would like to disagree, if that is your impression. I think Jeff has been very right at many levels. Jeff has often written in his works that multiple forms of intelligence are conceivable, and he has clearly limited his predictions to a set of features that future AGIs would very likely contain: sparsity, reinforcement learning, invariance of representations, predictive capabilities, etc. These features are indeed all at play, even if they are arranged in ways very different from the nature of our brain. We just underestimated the viability of other paths once certain scaling and dimensioning thresholds became feasible. We are not there yet. However, it seems to me that many paths will lead to Rome, as the legendary saying goes, but we will not have only one Rome as a destination. We will have many target destinations of similar standing. Let us say, many paths lead to Rome, Athens, Constantinople, Paris, Vienna, Stockholm, Madrid, Berlin, Tokyo, New York, etc.
"exists beyond the individual natural languages"
Looking at the word orders of languages around the world, and after several months of experimentation, my conclusion was that all languages share a relatively common trait: attention is learned first, and then simple temporal concept spans that are themselves sequenced. What order is used is completely irrelevant as long as the recipient knows the expected ordering. That makes the abstraction of the language a given (and sort of irrelevant) to me; it's the underlying conceptual understanding that is learned, with the caveat that a "concept", in the learning sense of an LM, is in itself just another pattern, which can be far more abstract than humans may be able to comprehend due to our biological constraints.
That said, after several years of studying AI, experimenting and reading lots of papers, it's the only subject where I seem to have started off with "knowledge" and progressively realised I know less and less. The only subject where going backwards is progress, so at least I now know what I don't know…
When you get two different AI models to talk to each other and they organise a video conference, you know it's just the underlying patterns, nothing else…
This is true: any dialogue is just a set of underlying patterns. But that statement also applies to a dialogue between two humans. Our language is also only a set of underlying patterns.
The question is, what are the properties of these patterns, in human languages as well as in any new forms of AI-based language use? As I tried to convey above, the existence of different forms of intelligence is now becoming a fact with the rise of synthetic (man-made and self-evolving) forms of intelligence. To what degree one form of natural intelligence, like ours, understands causality and context, and to what degree the new forms understand them, is where the variance comes into play. And the methods of "understanding" can also be completely different. This does not mean that only the human version "understands". The pitfall we tend to fall into very often is how we define terms like "understanding". If we strictly restrict the definition to the exact neurophysiology and cognitive processes of humans, and discard any variances as not-understanding, then we are being very anthropocentric. And for all we know, even within human populations there may be some variance in how we understand.
Regardless of how we define terms like "understanding", the new AIs understand the world and reality in novel ways that may even recognize causalities and associations more accurately than our human cognitive process can. We can coin a new term like machine-understanding for their form of cognition, to distinguish it from ours. But this will not change the fact that a form of understanding, which could be better or worse than ours, is taking place.
By the way, I find the link you included on the word order of languages in Wikipedia very interesting. The field of linguistics shows us how much variance already exists between our natural languages. It also shows us that the basic framework of all languages having and needing a subject, verb and object is universal. So we have some universal foundation, and a lot of variance regarding word orders and other grammatical features.
In the end, we are all information encoders and decoders. As long as we agree on shared conventions, that is all that matters for the communication itself. The question of how we "understand" the communicated messages is where things get very interesting and complex.
One last important point, regarding your statement that LLMs only handle patterns: I fully agree with you on this point if we are referring to LLMs like ChatGPT, Copilot or GPT-4. They understand from a linguistic point of view and return semantically coherent responses that come from our own human pool of curated knowledge, which they have been trained on. Their responses are as if they were holding up a mirror and reflecting our own knowledge right back at us. We as humanity are basically talking to ourselves when we converse via a prompt with an LLM of this nature. This impresses us as individuals, but it should not impress us as much as a collective human population: the LLMs are just giving us back our own answers, which we provided to them in advance. The magic here is the coherent semantic "understanding" taking place at a linguistic level only.
However, let's be clear that AI is more than just ML or LLMs. Even at the level of LLMs, we can already create RAG applications (Retrieval-Augmented Generation). These use a prompt and an LLM in the background to linguistically understand the prompted questions, but the LLM is then connected, via an API and a vector database, to a local repository into which you can place any number of documents that you wish the LLM to base its responses on. These repositories can contain very specific documents with internal knowledge that the LLM was never trained on, so the responses will appear to come from someone with inside knowledge. Such RAG setups are already popping up on all the LLM provider platforms. It only takes one more small step to enhance such a RAG setup with some form of updating of the local documents (or knowledge base), and that already gives the application a form of memory and rule-based updating of that memory. It can even get sophisticated enough that the local knowledge base becomes a model of reality for the application. Then we start to delve into some form of real understanding. We are at the early stages of creating such applications, which can update their own models of reality based on their interactions with us and with their environment.
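To make the RAG pattern described above concrete, here is a minimal, hypothetical sketch. The toy hashed-bag-of-words embedding and the in-memory list stand in for a real embedding model and vector database, and the final LLM call is only indicated in a comment rather than made against any specific provider.

```python
# Minimal RAG sketch: embed local documents, retrieve the best match for a
# question, and assemble an augmented prompt for the LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model: hashed bag of words.
    vec = np.zeros(256)
    for word in text.lower().split():
        vec[hash(word) % 256] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

documents = [
    "Internal memo: the maintenance window is every second Tuesday.",
    "Policy: answers must cite the retrieved document verbatim.",
]
index = [(doc, embed(doc)) for doc in documents]      # the "vector database"

def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    scored = sorted(index, key=lambda pair: -float(q @ pair[1]))
    return [doc for doc, _ in scored[:k]]

question = "When is the maintenance window?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would now be sent to the LLM; its answer appears to come from
# someone with inside knowledge, exactly as described above.
print(prompt)
```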