Why do members here think DL-based methods can't achieve AGI?

tbf, you could say that about literally every approach…

2 Likes

Fair point

1 Like

One ‘major breakthrough’ is models (of the real world). DL + models might do it, but it’s not really DL any longer.

1 Like

World models, preferably ones with multiple components, are definitely something AI systems need.

1 Like

Sorry for bumping this dormant thread, but I’ve come to a conclusion about the topic under discussion.

The fact that these systems demand to be run on giant clusters of GPUs is the fundamental flaw with the whole system. You and I don’t need to run on GPUs, which suggests (or should suggest) that the foundational basis on which this is all composed is just the wrong approach, doomed to hit energy walls while excluding anyone who isn’t made of billions of dollars with a spare nuclear plant to hook up to (see recent interviews from Altman and Zuckerberg).

The whole approach of using perceptrons (artificial point neurons) as the foundational mathematical function of learning is just wrong… very useful, very powerful, and impressive… but wrong. Approaches that embrace sparsity and randomness, making peace with the patterns as they emerge from the inputs, will win out.

Think different. Think sparse, fuzzy, and compact. Think random. And think analog and mechanical.

1 Like

Nor CPUs. The kind of biological wetware we run on isn’t currently emulated by any hardware - yet. GPUs are simply the next best silicon devices that can perform huge numbers of parallel computations efficiently, thus providing order-of-magnitude speedups over other hardware while still being versatile.

I agree - the fact that we’re spending so many resources means that we definitely haven’t found the correct architecture.

However, the problem is that evolution consumed orders of magnitude more resources than we can possibly imagine. So what seems excessive to us right now may actually be closer to the lower bound of energy needed to converge towards intelligence.

We just don’t have another system to judge the cost against, so any figures that arise have to be taken on faith. Relying on gut intuition here to claim $10-50B may not be enough isn’t really informative. It’s as useful as someone using their crystals and tea leaves to predict something.

Despite popular belief, research in those areas is still booming. However, they just haven’t produced results.

Most of my arguments here can be compressed into this: there have to be some results, and thanks to DL, the bar for results is really high. Toy problems simply don’t cut it anymore, nor should they - IMO the bar for good research should rise over time, otherwise we’re doing something seriously wrong.

The vast majority of “novel” approaches that have been worked on for the past decade unfortunately fall in that category. If they can actually produce substantial results and are scalable, I can guarantee you, we’re all ears.

2 Likes

I have climbed up onto the fence on this one.

I ran an experiment with two different LLMs to see if they could organise between themselves to solve a problem iteratively. I used two different models so the underlying data would be different, to avoid a mutual assumption arising.

The net conclusion I came to was that the limiting issue is more around the persistence of the input buffer (dynamic short-term memory), because they quickly revert to base-data biasing and try to organise a video conference between themselves in a mutual back-slapping, “yeah, good idea” type of conversation. To me this shows the underlying type of bias/data that surfaces.

i.e. you are an AI and are going to collaborate with another AI in an iterative manner, and you need to persist relevant information and formulate the next question for the other AI.

Some of the models are specifically prevented from a caching-type approach (i.e. use of interim storage external to the input buffer) where you let the model define information to be persisted into separate chunks, which severely limits some use types.

I tried to get them to code larger programs iteratively as well.

My conclusion so far is that they have most of the logic to do generic tasks, with the critical missing part being an effective iterative short-term memory construct that is not just feeding in the prior 100 questions. More like the output changes parts of the 100 questions as needed (even if it makes no sense as to how we would read it) and that is then fed back in on the next iteration.
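A minimal sketch of that editable-memory loop, assuming two stub functions standing in for the actual model calls (`call_model_a` / `call_model_b` are hypothetical placeholders, not real APIs). The point is that each step rewrites a small memory structure in place rather than appending the whole transcript:

```python
def call_model_a(prompt: str) -> str:
    # Hypothetical stand-in for the first LLM; a real version
    # would call an actual model API here.
    return f"A-answer({len(prompt)} chars of context)"


def call_model_b(prompt: str) -> str:
    # Hypothetical stand-in for the second, different LLM.
    return f"B-next-question({len(prompt)} chars of context)"


def run_dialogue(task: str, steps: int = 4) -> dict:
    # Editable short-term memory: a small structure the loop
    # rewrites each iteration, instead of replaying the prior
    # N questions verbatim.
    memory = {"task": task, "notes": ""}
    for step in range(steps):
        prompt = f"Task: {memory['task']}\nNotes: {memory['notes']}"
        answer = call_model_a(prompt)
        question = call_model_b(prompt + "\n" + answer)
        # Overwrite (not append) the notes - the output changes
        # parts of the memory as needed before the next iteration.
        memory["notes"] = f"step {step}: {answer} / {question}"
    return memory


result = run_dialogue("refactor a small program", steps=3)
print(result["notes"])
```

Obviously a toy: the interesting open question is what the memory structure should look like and who decides what gets overwritten.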

The iterative approach may need to go through thousands of steps, much like counting the number of signal waves over the cortex between being asked a question and formulating a complete answer. That’s where even the current 100K-plus input buffers are still structured wrong - or maybe not with GPT-5, etc…

3 Likes