That Howarth’s “overlaps” are modellable using graphs is what I am trying to implement.
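Roughly, as a toy sketch only (assuming Howarth’s “overlaps” can be read as words sharing contexts; the tiny corpus here is just a placeholder, nothing from Howarth): nodes are words, and an edge’s weight counts the contexts two words share.

```python
from collections import defaultdict
from itertools import combinations

# Toy corpus: each "context" is just a sentence.
contexts = [
    "the small apple was tasty",
    "the small whale swam past",
    "a tasty cake and a tasty apple",
]

# Graph: nodes are words, edge weight = number of contexts the two words share.
edge_weight = defaultdict(int)
for sentence in contexts:
    for a, b in combinations(sorted(set(sentence.split())), 2):
        edge_weight[(a, b)] += 1

# Words "overlap" when an edge joins them; the weight measures how much.
for (a, b), w in sorted(edge_weight.items(), key=lambda kv: -kv[1]):
    if w > 1:
        print(a, "--", b, "shared contexts:", w)
```

Nothing deep, but it is enough to see that the “overlap” structure falls naturally into a weighted graph, which is the form I want to work in.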
What evidence was that?
To summarize:
Evidence that cognition is embodied is evidence that meaning cannot be (universally) abstracted. Embodiment MEANS incapable of abstraction.
Relating an attempted theory for combinatorial syntax to Chomsky’s theories for “built-in” language features depends on Chomsky’s evidence that abstractions for language structure cannot be learned.
It is not evidence for abstraction per se. It is evidence that such categories cannot be learned. It is evidence for a lack of learned abstractions, which is what LLMs attempt. Not evidence for abstractions.
Chomsky is totally dismissive of large language models.
I don’t have a working model. It’s lacking actual working memory, for one thing. Bitking also mentioned that transformers do not model the subcortical component of the brain.
He wrote tons of things in this forum years back, backed with neuroscience papers.
That is an opinion, I believe. You could be totally right in your opinions, but when it comes to philosophy I’m personally with Wittgenstein, who I think you mentioned was skeptical of it. It’s potentially full of plays on words, where one word can mean subtly different things even within the same piece of writing. And with each conclusion drawn from the previous one, if even one of those before it is false the whole thing implodes. That’s why experiments are very useful, like what you are doing right now. You have referenced other experiments to support your points, and so did others who didn’t share the same opinion as you. Let’s do our best to focus on creating a working model and discuss the theories to make it work.
If you had any evidence you could cite I would be interested to see it.
Of course, it’s a big issue.
There was a conversation on an OpenCog discussion forum related to this last year. Ben Goertzel is trying to find an abstract basis for thought. Again. He doesn’t have one. One would be useful. He wants one. So he opened the conversation. Again:
March 18, 2022, 7AM-8:30AM Pacific time: Ben Goertzel leading discussion on semantic primitives
And, separately from that, I was just looking again at that Tom Mitchell talk I cited earlier (evidence for complete mixing of semantic features when combined syntactically) and I noticed Mitchell had this to say about other recent attempts to find semantic primitives using… MEG scans, I think this is:
Mitchell: “If you ask me right now, what is my best guess of what are the semantic primitives out of which the neural codes are built, I’d say, I don’t really know, but these features, plus edibility(??) for example, keep recurring.”
Of course, those aren’t syntactic/combination primitives (for which he immediately afterwards presents evidence of complete mixing in combination). But it is interesting as an example of how such a search might proceed experimentally.
Distributed power may have its own problems too. One of the lessons I’d like to take from blockchain-powered Bitcoin is to realize that “trust” is really, really expensive.
Chinese miners accounted for more than 71% of the network’s total processing power, or “hash rate”, from September 2019 to April 2020. Easy access to cheap electricity and hardware have long made the country an attractive place for cryptocurrency firms to set up shop.
Mining operations were concentrated in the provinces of Xinjiang, Sichuan, Inner Mongolia and Yunnan, where energy is abundant and cheap, though not necessarily clean.
“Cheap” is said about the power consumption there, but only without accounting for the environmental cost along with it.
I think we would never need Bitcoin if people could simply trust each other, taking it for granted. But the truth is governments spend a lot on state apparatus to make their people “trust” them; and internationally, trust rests on the threat of force (sustained by national military spending). To avoid governments altogether, Bitcoin style, there is still the cost of computation power, no small deal either. It’s that expensive to have/enforce trust.
Put common wealth into every individual’s hands, and the only thing they can do is race to consume as much as possible for their own advantage; otherwise it becomes a disadvantage w.r.t. competency, and they lag behind the rest of society. Most people can never be simply, sufficiently satisfied; that’s in the natural drives of being.
The dictator called off Bitcoin mining, because he can, and he cares (about 绿水青山, “clear waters and green mountains”). Communism/socialism without a dictator may be even more disastrous.
I’m not up to date with blockchain tech, but is it really a given that a trustworthy network must be power-hungry?
The core idea for why I think a socialist approach is needed in the advent of AGI is that power tends to consolidate. Technologies give governments immense power over, and data on, their populations. As we’re relegated to pet status we become a resource sink for those in power.
With regard to competency, I think AGI is going to wipe the floor with our competency. And overconsumption could be limited in so many different ways; for one thing, it’s automatically limited by the monthly universal income one receives. If one fears there will be overpopulation due to too much leisure time for procreation, then introduce disincentives that grow with the number of children one has. If one fears the population will develop some sort of twisted, unsightly psyche from overindulgence, then introduce mandatory national service requirements that train them mentally and physically. New tech could mean a new paradigm.
You alluded to the idea that a society without a central figure/leader could be unstable, but what if people slowly develop more and more trust in AI’s competency over humans’? Of course, the idea that all AI recommendations must go through an elected committee of technocrats could also help instill confidence in the system. Those elected professionals would have to be vetted very strictly by the system, much as a jury is chosen for a trial, to prevent the formation of political swamps filled with career politicians and their kin. No second terms of office, no political consulting work post-office, etc. The AI could possess strong legitimacy to advise and steer society towards the path of optimal GDP growth and Happiness Index. Any political system is subject to failure, so in the event of catastrophic financial collapse, war, etc., there is no guarantee the population won’t get fickle, distrust the AI, and let a new dictator emerge; but it’s hard to see that happening unless the AI’s quality is suspect or it is unaligned. But I mean, if it’s the best alternative, then why not go for it.
I’m not following it closely either, but as I heard, it was first computation-power hungry (Bitcoin), then disk-capacity hungry (Chia). Anyway, the gist is to back your proposition with cost, so that others are unable to override yours with less resource consumption. If it were costless, the mechanism would be ineffective.
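To make “back your proposition with cost” concrete, here is a toy proof-of-work in the Bitcoin style (an illustration only, not real Bitcoin code): a statement only counts if it comes with a nonce whose hash clears a difficulty threshold, so producing it costs computation, while checking it is cheap.

```python
import hashlib

DIFFICULTY = 4  # number of leading hex zeros required; higher = more cost

def mine(statement: str) -> int:
    """Search for a nonce whose hash of (statement + nonce) meets the difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{statement}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce
        nonce += 1

def verify(statement: str, nonce: int) -> bool:
    """Checking is cheap; producing the nonce is what costs energy."""
    digest = hashlib.sha256(f"{statement}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

nonce = mine("Alice pays Bob 1 coin")
print("nonce:", nonce, "valid:", verify("Alice pays Bob 1 coin", nonce))
```

Raise DIFFICULTY and the mining loop burns more electricity; that burned cost is exactly the price of the “trust”.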
I think it’ll be a fact soon, whether trusted or not. People who have learned to cooperate with AI will gain advantages over those who have not; whether they trust AI or not will matter little. Will people design and actually fire a “kill switch” to stop a production AI? I don’t think so. Then AI gets absolute power over humans, because people need its products to live on, but it doesn’t need people’s protection to continue its existence; no (rational) person is going to really harm it anyway (you cannot really hurt the AI by smacking hardware). Those who commit real harm to the AI’s data/software will be put into prison, I guess, as national criminals.
If that’s really the case then blockchain’s future is bleak, but the hype is still alive somehow. There’s the so-called trilemma of trade-offs between scalability/speed, security, and decentralization. I’ve heard in passing of many blockchain projects trying to solve the very concerns you mentioned. I don’t know what has become of them now.
I was thinking more of a softer, more flexible approach only made possible by blockchain tech. But a temporary kill switch works too, if bounded.
I just watched the video. Quite interesting. In the Adjective-Noun scanning experiment it makes one wonder about the following cases:
small apple vs small whale (the size based adjective has same semantic meaning but the noun itself represents different sizes)
tasty garlic vs tasty cake (similar to above in that adjective has same(?) semantic meaning but the noun itself invokes different taste sensors)
difficult test vs difficult person (adjectives have two different meanings)
I wonder for which of the three examples above their model would have the hardest time predicting what the adjective was, based on the brain activity for the noun.
Do you happen to remember where specifically in the video? I thought he would have one already since he’s doing the Hyperon project?
On another note, I came across this paper recently on graph neural networks. I don’t quite understand it, but I think it tries to relate GNNs to systems of coupled oscillators. Not sure if that’s relevant to your experiment, but here’s the link: [2202.02296] Graph-Coupled Oscillator Networks
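From a skim, the flavour seems to be oscillators coupled along the edges of a graph. A generic Kuramoto-style toy on a graph (my guess at the flavour, not the paper’s actual GraphCON formulation) would be something like:

```python
import numpy as np

# Toy graph: adjacency matrix for a 4-node ring.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=4)   # oscillator phases, one per node
omega = rng.normal(0, 0.1, size=4)          # natural frequencies
K, dt = 1.0, 0.05                            # coupling strength, time step

for _ in range(2000):
    # Each node's phase is pulled toward its graph neighbours (Kuramoto coupling).
    coupling = (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
    theta += dt * (omega + K * coupling)

print("final phases:", np.round(theta % (2 * np.pi), 3))  # neighbours tend to synchronize
```

The paper presumably does something much more elaborate than this, but that was the picture I got.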
Ha. My hunch is that they would all be completely mixed. There would be no way to say which of the complete mix ups would be the more complete. They would all be completely complete!
But it fits my idea that new meaning can have a completely different representation. And new meaning. Creativity. Novelty. In a similar way to the analogy I made to Google’s low-light photography, where matching on context gives you sets which have the “meaning” of sharing context, but can be completely mixed in themselves, with some pixels blown out with noise, and others dark, etc. The “meaning” is not in the similarity of the representations themselves. The meaning is in how the elements of the representation relate to something else (I say cause and effect.)
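To put a toy behind “completely mixed” (my own illustration, nothing to do with Mitchell’s decoder): bind random adjective and noun vectors elementwise and the composite for “small apple” no longer resembles “small whale”, or the bare adjective. Whatever the composites share has to be recovered by relating them to something else, not by comparing the patterns directly.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 512
vec = {w: rng.normal(size=dim) / np.sqrt(dim)
       for w in ["small", "tasty", "apple", "whale", "cake"]}

def compose(adj, noun):
    """A stand-in for combination: elementwise 'binding' of the two vectors."""
    return vec[adj] * vec[noun]

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

small_apple = compose("small", "apple")
small_whale = compose("small", "whale")
tasty_cake  = compose("tasty", "cake")

# The composites look "completely mixed": the two "small" phrases are about as
# dissimilar as unrelated phrases, and neither resembles the bare adjective.
print(round(cos(small_apple, small_whale), 2))   # ~0
print(round(cos(small_apple, tasty_cake), 2))    # ~0
print(round(cos(small_apple, vec["small"]), 2))  # ~0

# But the adjective is still in there, recoverable by relating the composite
# back to its context (the noun), not by raw similarity of the patterns.
print(round(cos(small_apple * vec["apple"], vec["small"]), 2))  # clearly > 0
```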
Have one already? No, that’s the point. No-one does. No-one knows what the structures in transformers mean. That’s why we can’t control them for meaning. We’re still at the starting gun on this. (At the starting gun still, and I say the solution is to run in the opposite direction?)
Where specifically in the video? Ha. Poor Ben. It was early in the morning in… Oregon(?), and he’s carrying a load managing all his projects. The presentation was slow.
As I recall he had just read a book by… philosopher David Chalmers “Constructing the World”. And Chalmers argues for some kind of primitives. So that prompted Ben to look at that issue again.
Also, they are just re-formalizing their computational formalization, in Hyperon, as you say, iteration… whatever, 4 or 5: Novamente(?) (not Numenta), OpenCog, now Hyperon, again. So now would be a good time to build in primitives if they existed. Again.
So the first half of the Discussion Forum was (quite a slow?) presentation of the arguments in the book?
What you might like to follow along with is some discussion, in which I took part, starting at about minute… 47.15, after Ben asks for “Others’ take on this”.
I think I resolved Chalmers’ argument to be one that meaning can be constructed. But I challenged whether the ability to construct meaning need necessarily imply it is constructed from primitives. A point that might be of interest to @jacobeverist .
And I tried to push the argument to be one that the only kind of “primitive” we will be able to find will be a process - of ad-hoc generalization.
Ben was actually not totally antagonistic to this idea. In fact he invited me back to do a presentation on it at a later Discussion Forum.
It’s interesting the way he summarized the problem. Because Ben, like me, thinks some kind of chaos will be at the core of cognition. He wrote a book, Chaotic Logic, around 199x talking about the importance of chaos for it. So he’s sympathetic to chaos. He sees it. It’s just that he thinks the chaos needs to occur on top of a more structured representation. His background is in maths, and he’s inclined to think about the problem mathematically. He’s spent his life seeking that representation.
His most poignant summary was at minute… 1:24:48:
(Edit: link it again:)
“For f’ing decades, which is ridiculous, it’s been like, OK, I want to explore these chaotic dynamics and emergent strange attractors, but I want to explore them in a very fleshed out system, with a rich representational capability, interacting with a complex world, and then we still haven’t gotten to that system … Of course, an alternative approach could be taken as you’ve been attempting, of … starting with the chaotic dynamics but in a simpler setting. … But I think we have agreed over the decades that to get to human level AGI you need structure emerging from chaos. You need a system with complex chaotic dynamics, you need structured strange attractors there, you need the system’s own pattern recognition to be recognizing the patterns in these structured strange attractors, and then you have that virtuous cycle.”
As he says, “Of course, an alternative approach could be taken as you’ve been attempting, of … starting with the chaotic dynamics but in a simpler setting”. So he sees it.
But Ben has a lot of projects. He’s heavily committed to the detail of trying to implement Hyperon, and others. He doesn’t have the bandwidth to try a complete rethink. This whole discussion forum was, for him, something of a holiday from the technical detail of his current projects. Last time I communicated with him he said the whole semantic primitives debate had been shelved again for the meantime.
Thanks for the ref. On a quick look, I’m discouraged by the initial framing of the problem in the abstract as:
“a novel framework for deep learning on graphs”
This seems to be a trend also in spiking NN work. They seem to view “learning” and “deep networks” as the model, and seek to replicate that. I think that’s back to front. The solution is to back off from the “learning” paradigm in general.
Thanks for the references. I’ll take a look at them later. I haven’t had time to learn about Hyperon yet, but in passing he mentioned probabilistic hypergraphs, which resonated with me. I thought he’d already have figured out how to create those semantic primitives (not exactly how our brains do it, but through some learning mechanism) to work with those graphs. The backlog of things to read is getting very long…
I think “probabilistic hypergraphs” have been there for a while. I don’t know about the Novamente era. But I suspect then too. “Hyper”-graphs just being graphs where a single edge can join more than two nodes at once? So that will be something Ben has carried through his work on cognition from the beginning.
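Something like this is how I picture it, anyway: a minimal way to hold such a thing, with a weight per edge standing in for the “probabilistic” part (placeholder names, nothing like OpenCog’s actual Atomspace API):

```python
from dataclasses import dataclass, field

@dataclass
class Hypergraph:
    """A hyperedge links any number of nodes; each edge carries a weight
    (standing in for the 'probabilistic' part)."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (frozenset_of_nodes, weight)

    def add_edge(self, members, weight=1.0):
        members = frozenset(members)
        self.nodes |= members
        self.edges.append((members, weight))

    def edges_containing(self, node):
        return [(m, w) for m, w in self.edges if node in m]

hg = Hypergraph()
hg.add_edge({"cat", "chases", "mouse"}, weight=0.9)  # one edge, three nodes
hg.add_edge({"cat", "mammal"}, weight=0.7)
print(hg.edges_containing("cat"))
```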
I recommend skimming through his “Chaotic Logic”, 1994, which you can find in pdf on the Internet (though I don’t know if that is with his approval!) Sets which rearrange themselves… as I recall.
So, he’s been quite consistent down through the years. Broadly “distributed representation” when that still wasn’t popular, and then perhaps putting more weight on the “probabilistic” bit when Bayes dominated in the '00s. But basically sucking everything in to a mathematical formalism which can encompass it all, which is a network.
The latest re-jig of the platform may have more to do with computing efficiencies than any change in conception.
I believe he conceives our current situation as deep learning being old techniques which started working because gamer GPU hardware made them fast enough. So he’s looking for a similar platform efficiency change to make his, equally old(!), more general meaning representation techniques practicable, too.
I liked his work early because I found his Chaotic Logic book. And basically agreed with the emphasis he put on sets, and “patterns”, as representations for meaning.
The main difference with me, I would say, is what he summarized himself:
So in theory he embraces the “expansion” of chaos. But in practice he’s spent his career looking for compressions and abstractions to act as a basis for that expansion…
For you to judge the consistency of that. (For myself I was the same for a bit. Looking for the representation before I start. I always remember a colleague curing me of that by saying, words to the effect… “In my experience, if something works, it will work with any representation. Don’t worry about the representation.”)