I thought about this question all of yesterday, and I think I’ve got an answer.
What exactly is it trying to do? Learn the environment. Now, that’s an interesting goal because it’s always within a context - learn the environment in order to… gather resources, fight off invaders, find a mate, etc. But to do any of these things - or anything at all in the environment - you must first get a view of it: you must develop a distributed mental model of whatever in the environment can help you achieve your goal.
In ML I know we start with the goal (the context for why we should learn the environment) and then work backward, and I think that’s what you’re asking for. The issue I take with this is that, framed this way, the agent learns a policy, not the environment itself in general.
Seems to me that we don’t want our AGI to learn a policy. That seems like the wrong approach, because when a baby is born it doesn’t start learning a policy - what has it got to do? It has no goal except the implicit one encoded in its brain’s baseline structure, given to it by DNA: figure out how to make sense of this weird and wild data coming in from everywhere.
But of course, we’re not making babies here. I just think we should keep that in mind rather than focus entirely on the traditional policy-oriented frame of reference. We’re not actually trying to get the agent to learn a set of triggers and behaviors (a policy); we’re trying to get it to learn how to build an abstracted memory structure on top of which it can embed millions of learned policies.
Ok, but what does that actually mean for what goal we should give the agent? Because at some point - even if it’s programmed to walk around randomly for a while - it has to have a goal. Either it has to develop or infer a goal itself (I think that’s the ideal), or we have to give it one explicitly, but either way it must have a goal: what should it be?
I think it should be something akin to survival. That seems to be the baseline goal for all living things (at least until reproduction might overtake it), so let’s start there. I think another way to frame “survival” is “seeking the path of most options.”
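To make “the path of most options” a bit more concrete, here’s a rough toy sketch in Python (entirely my own illustration - names like `reachable_states`, `neighbors`, `step`, and `horizon` are made up, and it assumes a small discrete state graph, not any real agent design). The agent simply prefers the action whose resulting state keeps the most states reachable within a short horizon:

```python
from collections import deque

def reachable_states(start, neighbors, horizon):
    """Count distinct states reachable from `start` within `horizon` steps.
    `neighbors(state)` returns the states one action away (assumed interface)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == horizon:
            continue
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return len(seen)

def most_options_action(state, actions, step, neighbors, horizon=5):
    """Pick the action whose successor state keeps the most future options open."""
    return max(actions, key=lambda a: reachable_states(step(state, a), neighbors, horizon))
```

The point isn’t the code itself, just that “survival” can be cashed out as a measurable preference: stay in regions of the state-space where many futures remain open.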
And that, I think, gets us very close to an answer. If it’s ideal that the brain itself come up with the goal of survival, and if we posit that the SUI will form a hierarchical brain, and if the brain unfolds predicted futures into the lower layers as actual behaviors to perform to generate the higher-level predictions, then all we need to do is make the SUI tend towards paying attention to contexts with higher variety.
Let’s talk through an example to unpack that statement: the agent is in the environment, and information about the agent’s location in the environment (the environment state at the current timestep) flows into the lowest layers of the hierarchy. The information flows up. The highest layers understand, more or less, where the agent is (its location). From this, they know what other states it can get to (they have a union of predictions for where the agent could go, what it could see next). That union of predictions about the future gets sent down the hierarchy as the agent’s current context, by which it should interpret all new incoming data. If those predictions - the context - tend towards variety, the agent will be moved to see novel things: things it has not seen in a while, or ever. If the context is instead biased to predict things that are already known, the agent may get caught in a loop where it prefers to see the same thing over and over.
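As a toy illustration of that “bias the context toward variety” idea (again, just my own sketch - `visit_counts`, `variety_biased_context`, and the weighting scheme are invented for the example, not a claim about how the hierarchy actually works), imagine sampling which of the top layer’s predictions to attend to, weighted toward rarely-seen states:

```python
import random
from collections import Counter

visit_counts = Counter()  # how often each predicted state has been attended to before

def variety_biased_context(predicted_states, temperature=1.0):
    """Sample one prediction from the top-down union to attend to,
    favoring states the agent has seen least often."""
    weights = [1.0 / (1 + visit_counts[s]) ** (1.0 / temperature)
               for s in predicted_states]
    chosen = random.choices(predicted_states, weights=weights, k=1)[0]
    visit_counts[chosen] += 1
    return chosen
    # Flip the weighting (e.g. weight = visit_counts[s]) and you get the
    # opposite failure mode: the agent loops on what it already knows.
```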
Now, this is in no way completely understood by me, or anyone else, but what I do understand is that goals for the agent can (and ideally should) be internally generated as predictions made by the hierarchy itself - which is fundamentally created by the SUI, since the repeating smallest unit of intelligence makes up the entirety of the network.
Now, if none of that seems valid or makes sense, then I think there’s one other thing I could say that would be of service to the conversation: any goal we could possibly give it is technically none other than pathfinding. The agent lives in an environment. Manipulating a Rubik’s cube is no different from walking through a higher-dimensional maze. It’s all pathfinding to a particular state of the environment. Everything is. If the agent is in a sensorimotor feedback loop with the environment, then it is only ever pathfinding through the state-space of that environment.
So if we have to give it a goal like “gather all these things and put them in this room” that’s just a high-level abstraction of many pathfinding tasks to form one pathfinding task: find the state of the environment where all the things are in this room.
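In code terms (purely illustrative - the state representation and `neighbors` function are hypothetical), any such goal reduces to a predicate over environment states plus a search for a path to some state satisfying it:

```python
from collections import deque

def find_path(start, goal_reached, neighbors):
    """Generic pathfinding: breadth-first search from `start` to any state
    satisfying the predicate `goal_reached(state)`."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if goal_reached(path[-1]):
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None  # no reachable state satisfies the goal

# "Gather all these things and put them in this room" is just one such predicate:
# all_in_room = lambda state: all(loc == "this_room" for loc in state.item_locations)
# find_path(current_state, all_in_room, neighbors)
```

The Rubik’s cube, the maze, the tidy-the-room task - only the predicate and the state graph change; the operation is the same.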
That’s all the structure is doing when it comes to its interaction with the environment: pathfinding. So I’m not sure if that helps answer your question, but I think it’s important to understand that we can unify all seemingly different types of goals into one goal this way.
Lastly, just to get down to the nuts and bolts of what to actually do:
Early on I suggested that we simply put the agent in the environment for some time and allow it to interact with the environment according to its own nature, without a goal or policy. Only in that way can it explore the state-space of the environment according to its own policy - the one embedded in the structure of the hierarchy itself, determined by the makeup of its particular seed SUI. Then, after it’s explored a while, we can give it a goal of reaching a particular state and see how it goes about learning to achieve that goal (as opposed to seeing how it goes about merely achieving that goal).
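A bare-bones sketch of that two-phase experiment might look like the following (the `agent`/`env` interfaces - `act`, `reset`, `step`, the goal argument - are all assumed for illustration, not a spec):

```python
def run_experiment(agent, env, explore_steps, goal_state, max_goal_steps=1000):
    # Phase 1: free exploration, no external goal; the agent acts according
    # to whatever its internal structure (its seed SUI) prefers.
    state = env.reset()
    for _ in range(explore_steps):
        state = env.step(agent.act(state, goal=None))

    # Phase 2: hand it an explicit target state and watch *how* it learns
    # to reach it, not just whether it eventually does.
    steps_per_episode = []
    for _ in range(20):
        state, steps = env.reset(), 0
        while state != goal_state and steps < max_goal_steps:
            state = env.step(agent.act(state, goal=goal_state))
            steps += 1
        steps_per_episode.append(steps)  # a falling curve = it's learning how to learn the task
    return steps_per_episode
```

What we’d look at is the shape of that curve across episodes, not just the final score.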
That’s what we really care about if we want AGI - we want to build a mind that finds the easiest path to learning the widest variety of new things. We are not trying to build a mind that finds the easiest path to a goal.
I think we need to let the agent learn what it should learn to pay attention to, just as much as we have it learn what to pay attention to.
Maybe there is a way to give it explicit goals all the time and see how well it learns to learn. How do you think that might be implemented?