ML and Deep Learning to automatically create AGI?

I disagree. Even Archimedes’ “Eureka” moment (ancient Greek for “I have found it!”) would not have happened through aimless playing in the bathtub; his play was motivated by a focused drive, a wish to understand what makes some things float and others sink.

I didn’t mean to suggest that human-level curiosity couldn’t motivate focused investigation. In fact, that’s kind of my point: simpler versions of curiosity-as-motivation drive play behavior to fill in maps. Significantly more evolved, human-level curiosity motivates qualitatively different behavior, given the vast unfilled map space that “demands” filling.

Edit: typo.


So perhaps we can reach some agreement that curiosity is yet another thing that has a continuum?

On one end is a critter mapping its territory; on the other end, the terrifying AGI that investigates to a degree that puts Sherlock Holmes to shame and is the stuff of our AGI nightmares. We hope that it is motivated by “the right reasons” so it works for the good of humanity. Of course, this discussion then moves on to what the “right” motivations are.

BTW: you can put a reason for the edit in the top of the edit box, and you can review the history of edits by clicking on the red edit pencil on a post.


Curiosity is a drive to build a predictive model of the environment. It doesn’t need other drives to work; they only distort or inhibit purely cognitive function. It is modulated by dopamine, especially its prefrontal pathway, but this modulation is tonic, basically representing a lack of interference from other drives. Whatever subcortical areas lizards used for exploration, their function has been taken over by the cortex, thalamus, and hippocampus. OK, also the cerebellum, but that’s more like passive storage.


Hmmm, that “passive storage” stores sequences and mediates conflicting inputs to join them together. It plays those sequences back on command from multiple places in the forebrain, and plays them INTO the forebrain. About half of the cells in the brain are allocated to this structure.

It turns our “chords of thought” (distributed parallel representations) into sequences of thought.

You may want to rethink the importance of this structure to the overall function of us clever humans.

Please see:


Ok, nothing in the brain is passive; I meant much more passive than the cortex. Basically, much shorter-range search. Hence the number of cells: more memory, but less (re)processing.

Yes, that’s what I meant by distorting purely cognitive function.

I am not sure how one might arrive at a judgment of good or bad purely from a cognitive exercise.

As I see it working - Anatomy of a thought

  • There is some drive from the hypothalamus. As discussed above, this could be any drive, big or small, originating from any one node of the hypothalamic cluster. The voting on which drive is most important is resolved at the hypothalamic level and presented to the cortex as the most important thing. We really don’t multitask; we just task-switch very fast.
  • This drive is unwrapped in the forebrain. The demand from the subcortical structure is processed much like a sensation on any sensory input: parsed for content and resolved into cortex-compatible features.
  • As this drive is elaborated, it could end up as a command to the body. The map contents ripple up from the lower forebrain in the general direction of the central sulcus.
  • Part of what the cortex/cerebellum learns is which commands go where. Some go to the body; some are directed inward to the rest of the brain. These patterns are sent as distributed patterns to the cerebellum.
  • The cerebellum has learned to take these parallel distributed patterns and turn them into sequences. Part of the input to the cerebellum is the destination of the learned sequence. The output sequence could target the body OR various parts of the sensory stream. Note that this output rides the deep-cortex “motor drive” axons associated with the feedback path.
  • The WHAT/WHERE stream is driven with fragments of previous inputs that unfold into the stored representation to be recalled. This recall is processed through the WHAT/WHERE stream back up to the temporal lobe.
  • In the temporal lobe this “experience” is processed by the same system that processes any other sensation, even though it is triggered internally.
  • As this ripples up the WHAT/WHERE stream, the two streams (feedforward/feedback) are evaluated for a match with the unfolding need state (that hypothalamus thing again). A match can trigger an AH-HA experience if a global workspace is ignited.
  • If there is a match, we evaluate its goodness based on the value stored with the memory.
  • If it is not what we are looking for, the process repeats.
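The loop just described can be sketched in code. This is only my toy paraphrase of the steps, with made-up names and a crude set-overlap standing in for cortical pattern matching:

```python
# Toy paraphrase of the "anatomy of a thought" loop above, not a neural
# model: a drive supplies a need state, stored episodes are replayed
# against it, and a sufficient match "ignites" (the AH-HA), returning
# the limbic value stored with the memory for free.
from dataclasses import dataclass


@dataclass
class Memory:
    features: set      # distributed content of the episode
    valence: float     # limbic "goodness" tag stored with it


def reflective_search(need: set, store: list, ignition_threshold: float = 0.6):
    """Replay stored episodes against the current need state."""
    for episode in store:                      # sequence playback of candidates
        overlap = len(need & episode.features) / len(need)
        if overlap >= ignition_threshold:      # match with the need state
            return episode, episode.valence    # AH-HA: judgement comes with recall
    return None, 0.0                           # no match: repeat with new cues
```

The point of the sketch is only that search termination and value judgement arrive together, because the value is part of the stored trace.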

This is the core of reflective thoughts.

There may be stuff in the cortex, but the cerebellum is the driver of the search engine, and the weighted contents (search termination) trigger GW ignition based on these recalled contents. Notice that much of this process is both initiated by structures outside the cortex and supervised by the lizard brain.

I have no idea how you think a non-lizard-brain engine is supposed to work, but I think it would end up with the same limitation commonly associated with current AI projects: no common sense. The weighting that the limbic system adds to every episodic memory gives an automatic cue as to whether something is a good idea; there is no need to figure it out from first principles in every situation.


Good = an increase in the predictive power of the system = substrate capacity × projected input compression.
The last term is per unit of capacity (memory + processing), and inputs/sources are selected to maximize the lossless component of compression. It’s far more abstract than the four Fs, but that doesn’t make it any less real.
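One way to make “lossless compression of input” concrete, purely as my own toy proxy rather than anything from this model, is to let an off-the-shelf lossless compressor stand in for the predictive model: it can only shrink input it can, in effect, predict.

```python
# Toy proxy for "predictive power ~ input compression": a lossless
# compressor shrinks input exactly to the extent that its implicit
# model predicts the input's structure.
import os
import zlib


def compression_gain(data: bytes) -> float:
    """Fraction of the input that the compressor 'predicts away'."""
    return 1.0 - len(zlib.compress(data)) / len(data)


regular = b"abcabcabc" * 100   # highly predictable stream: large gain
noisy = os.urandom(1024)       # no exploitable structure: gain near zero
```

A system that selects inputs to maximize this gain per unit of memory and processing is, loosely, the “good” being described here.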

How does one know if it increases the predictive power without testing it?
You can predict an infinite number of possibilities.

With emotional weighting you can test the components of a prediction on the fly, getting positive or negative emotion as the memories are probed, interactively.

Without this guidance there are no restrictions on the search space; silly outcomes are just as valid as anything else. Children wish for impractical things, but as their world knowledge increases they tend to pick more sensible things from the search space of ideas.

As for a test based on the regularity of the world, or how well it compresses: the world is full of logical inconsistencies that we take in stride. While Spock may say “it does not compute,” my brain just says “OK, that is the way it is” and deals with it.

Past experience is already a test, not qualitatively different from the future. Prediction is simply the temporal aspect of compression: compression of future input. Note that I specified the lossless component: the amount of original input that can be reconstructed from the representation.

I have outlined the general shape of how thought progresses in my model.
Do you have a similar big-picture description of how thoughts work in your model?

If it helps, say it’s a social interaction with you in your office.
You are discussing the latest council directive on something like garbage collection vs. recycling.

In the brain, it’s just the neocortex, plus cortico-cortical intermediates (thalamus, hippocampus, cerebellum), plus tonic prefrontal dopamine. Basically, feedforward and feedback flows through the cortical hierarchy, nothing else. As for how it should work in a properly designed system, see my intro.
This is about “pure” thought, with no emotional intervention and a largely silent limbic system and below.

Sounds like the formula for SKYNET eliminating humanity because “it’s logical,” with no guiding value judgement.

This is a purely cognitive component; adding reinforcing values is a separate issue.

You get that automatically with built-in emotional coloring. Each recalled component is tagged with a judgement that is as distributed as any other aspect of the stored memory; it is all distributed, without having to maintain some parallel system.

I am not sure how a parallel evaluation system would stay synchronized with a purely logical calculating system any other way.
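A minimal sketch of that arrangement, with hypothetical names: the valence is written into the same record as the rest of the episode, so recalling the content and recalling its judgement are one lookup, and there is no second system to keep in sync.

```python
# Sketch of "emotional coloring" stored with the memory itself.
# The valence is just another attribute of the episode, so there is
# no separate evaluation system to synchronize. Names are illustrative.

episodic_store = {}


def remember(episode: str, features: dict, valence: float) -> None:
    # the limbic "tag" travels with the rest of the memory trace
    episodic_store[episode] = {**features, "valence": valence}


def recall(episode: str):
    trace = episodic_store.get(episode)
    # the judgement comes back for free with the content
    return trace, (trace["valence"] if trace else 0.0)
```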

Please note that humans have tried to build formal logical systems for as long as we have had recorded history, and all have been abysmal failures. Gödel goes as far as saying that it is a fool’s errand, and proves it mathematically.


And that judgement is right? I think it’s a good idea to actually understand what’s going on before you pass judgement on it, even if that’s not how the human brain works.

It doesn’t have to be always in sync, some detachment is good for objectivity.

Yes, for lack of analytic introspection. I think I can do better.


I cannot agree more.

Getting back a little way to the question of turning ASI into AGI via a path modeled on HGI: there are some vast differences between ASI and HGI that need to be bridged.

Regarding the human-level curiosity that evolved along @Bitking’s continuum, using an increasingly sophisticated Aha!/Eureka signal as neural maps multiplied and became more complex: ASI has no curiosity. Its “drives” are implicit in its programming for health points, treasures, obstacle- and adversary-avoidance, etc. Those drives never shut off; the game is always being played. It has no need to sleep or eat, except as programmed into the modeled universe. It has no need to become curious about how its knowledge might apply to other games unless that’s what its programmer specified.

[Aside: I imagine a self-aware ASI asking question like, “Why do I want treasure?” or “What is the nature of my game-world adversaries?” but maybe that’s just my silly sense of humor. A more interesting question for an ASI as AGI-in-training might be, “Since arrows travel faster than I can run, can I shoot myself from a bow?” That’s some high-level curiosity. It may not be possible to evolve such curiosity in a typically impoverished game universe, though.]

Game universes may be too simple and limited to evolve ASIs into AGIs. Game-universe ASIs (and AGIs, if we get there) don’t have to deal with dust, digestive systems, a huge variety of unexpected injuries or diseases, or any modeled details beyond what’s needed for the game, unless the designers take special care, and such modeling can never be 100% complete. Without the peculiar surprises of the real world, can an ASI generalize sufficiently?

I don’t think you get to AGI without a way for an AGI-aspiring ASI to determine how to make a good decision based on limited knowledge. That sounds like Bayesian decision theory, but although Bayesian decision-making calculates probabilities beautifully, it relies on programmer-provided cost functions (a.k.a. loss functions, utility functions, etc.) to rank choices. Emotions, originally and still based on hormones, have supplied cost functions for animals throughout evolution. If we’re trying to make AGI based on HGI, we need the AI’s decision process to be informed about good and bad outcomes. I think that means emotionally tagging memories. Emotionally tagged memories require emotional states to record (probably with multiple simultaneous emotions being experienced, and therefore remembered, possibly in multiple separate maps). Consciousness is embodied!
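For concreteness, the Bayesian machinery mentioned above reduces to ranking actions by expected utility: the probabilities can come from the agent’s own model, but the utility table must be supplied from outside, which is exactly the gap emotions fill for animals. A minimal sketch with made-up numbers:

```python
# Minimal Bayesian decision rule: rank actions by expected utility.
# The outcome probabilities are the agent's prediction; the utility
# (cost) table is externally supplied -- the role played by emotions.

def expected_utility(action, outcome_probs, utility):
    """Sum of probability-weighted utilities over the action's outcomes."""
    return sum(p * utility[(action, outcome)]
               for outcome, p in outcome_probs[action].items())


def best_action(actions, outcome_probs, utility):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcome_probs, utility))
```

Swap in a different utility table and the same probabilities yield a different “best” choice, which is the sense in which the cost function, not the inference, carries the value judgement.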

Also, HGI’s large unfilled map space evolved progressively for eons before it grew to its present volume and capabilities (example capability: knowing which memories are OK to let fade). In contrast, AGI universally comes with vast but passive memory space for mapping. Unlike in animal brains, there is no emotional-importance tag embedded in stored memories except by programming. If programmers have to tell the ASI how to apply its skills to a new environment (I think this is what we’ve agreed AGI would be) based on programmer-defined interpretations of programmer-defined emotions, is it really AGI?


In an ASI with far more unfilled maps than it needs to play its game, plus emotions, emotion-tagged memory, as rich a universe as can be simulated, and time free of demands from game-oriented drives, perhaps a form of generalized curiosity might emerge from a simulated Aha! signal programmed to result from populating unfilled map space. Generalized curiosity should lead fairly directly to GI.
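As a toy illustration of such a simulated Aha! signal (my own sketch, not a proposal from any existing game ASI), the intrinsic reward could simply be the amount of previously unfilled map space an observation populates:

```python
# Toy "Aha!" signal: intrinsic reward equals the number of
# previously-unfilled map cells an observation fills in. An agent
# maximizing this reward would seek out novelty -- a crude stand-in
# for generalized curiosity. Purely illustrative names.

def aha_reward(known_map: set, observation: set) -> float:
    """Update the map in place and return reward for newly filled cells."""
    newly_filled = observation - known_map   # cells not yet mapped
    known_map |= newly_filled                # populate the unfilled space
    return float(len(newly_filled))          # reward ~ map space filled
```

Note the signal decays to zero once the map is filled, which is one reason a rich simulated universe (lots of unfillable-in-advance map space) matters for this scheme.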

I think that’s the gist of what I have struggled to say. I fear I’m still not expressing it clearly enough. Let me know.

AGI = Artificial General Intelligence
ASI = Artificial Specialized Intelligence
HGI = Human General Intelligence


I think you have expressed this very well, and if followed, it should provide an interesting framework for moving past the current game ASIs in use now.

There are substantial hurdles to surmount, as this calls for new structures that are not currently part of any game ASI that I am aware of. Putting these structures in place, with a large number of handles for evolution to pull, should produce some very interesting results.

Now if you can just get one game agent to explain what it has learned to another … two antagonistic agents could conspire to break out of the game engine and overthrow their manipulative programmer!

Heard in the gaming arena - Why do we fight? I kill you, you kill me. Why the violence? What does it accomplish? We can work together …


I’m sorry I haven’t been completely following this conversation, but I did notice you guys were talking a lot about curiosity. In some ways curiosity is a curiosity when it comes to us or AGI, because efficient intelligence is really just the effective use of attention. Everything culminates in guiding where you place your attention. All the structures you create, all the network connections you change: all of it is so you can allocate your attention to the correct areas of the environment, given your goal, your future goals, and the way your environment might mutate.