AGI and Wireheading

I generally agree with this framing. We may likewise want AGIs to balance multiple needs—e.g. a need to respect cultural norms and a need to follow the law and a need to earn lots of money for the programmer. The human brain balances multiple needs, but not in a particularly reliable way. For example, we may go overboard satisfying Need A at the expense of Need B and then feel regretful about it. Or there’s the “getting addicted to drugs” example we’re talking about.

In reinforcement learning (and economics I guess) I think there are mathematical theories about how to optimally balance multiple objectives. The brain does it in a more ad hoc, unprincipled way. I imagine that if you set up 100 nominally-identical brain-like AGIs with the same set of a few dozen “needs”, they would wind up trading off among the various needs in 100 different ways, for random and unpredictable reasons. (I could be wrong.) I don’t know how you would solve that problem. My best guess is “we should make the ‘subcortical needs centers’ do more sophisticated processing / calculations than they do in the brain”. They are pretty dumb in the brain, or at least that’s how I think about it. There’s plenty of room to make them more sophisticated. But what exactly they should be calculating, I don’t know.
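To make the "mathematical theories" remark a bit more concrete, here is a minimal sketch of one textbook approach, linear scalarization (a weighted sum of per-need scores). The need names, actions, scores, and weights are all made up for illustration; nothing here is a claim about how the brain or any real AGI does it. It also hints at how two agents with the same needs but slightly different weights could end up behaving very differently.

```python
def choose_action(actions, weights):
    """Pick the action with the highest weighted sum of per-need scores."""
    def score(name):
        return sum(weights[need] * value for need, value in actions[name].items())
    return max(actions, key=score)

# Hypothetical per-need payoffs for two hypothetical actions.
actions = {
    "cut_corners": {"norms": -1.0, "law": -2.0, "money": 5.0},
    "play_it_safe": {"norms": 1.0, "law": 1.0, "money": 1.0},
}

# Two agents with the same needs but different (assumed) weightings.
law_heavy = {"norms": 1.0, "law": 2.0, "money": 1.0}
money_heavy = {"norms": 0.5, "law": 0.5, "money": 2.0}

print(choose_action(actions, law_heavy))    # play_it_safe
print(choose_action(actions, money_heavy))  # cut_corners
```

The weights do all the work here, which is roughly the worry above: if they drift for unpredictable reasons, so does the trade-off.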

I don’t think habituation is helpful for this particular problem (if it is indeed a problem), unless I’m misunderstanding. In fact, it seems to work the opposite direction. The way I’m thinking of it is: let’s say I have a need for fame. I get a little bit of fame, and then I get habituated to that, so then my need for fame manifests as a desire for a lot of fame, and then a ton of fame, etc. So it seems to me that habituation works against balancing multiple needs, and instead makes behavior potentially more lopsided. But I dunno, maybe it’s different in different cases.
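Here is a toy sketch of the escalation I have in mind. It assumes (purely for illustration) that felt satisfaction is the stimulus minus an adapting baseline, and that the baseline moves toward whatever was last received; `target` and `adapt_rate` are invented parameters, not measured quantities.

```python
def required_stimulus_over_time(steps, target=1.0, adapt_rate=0.5):
    """Return the stimulus needed at each step to feel `target` satisfaction."""
    baseline = 0.0
    history = []
    for _ in range(steps):
        # Satisfaction is modeled as stimulus - baseline, so this much is needed.
        stimulus = baseline + target
        history.append(stimulus)
        # Habituation: the baseline drifts toward what was just received.
        baseline += adapt_rate * (stimulus - baseline)
    return history

print(required_stimulus_over_time(4))  # [1.0, 1.5, 2.0, 2.5]
```

Under these assumptions the amount of "fame" needed for the same satisfaction keeps growing, so a habituating need crowds out the others rather than balancing against them.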

Re AGI suffering: Does anyone know anything about how positive and negative valence are implemented / manifested in the brain? Any good references?

Ok, I do agree with the basic idea.
I used ta do a little but a little wouldn’t do
So the little got more and more
I just keep tryin’ ta get a little better
Said a little better than before

Mr. Brownstone - Guns N’ Roses


Since you brought up the ethics of AGI, here is my opinion:

The goal is to create intelligence by simulating a brain, an animal’s brain.
So in any ethical arguments about AGI, I replace the word “AGI” with “Animal”.

What you see as:

I see as:

How would replacing [human-based] slavery with [animal]-based be any better?

The difference is that some animals can be tamed. Through selective breeding animals can be made to want to work for free.

I think this is a mistake on several levels.

  1. Ethics now is based around humanity. I think this is fundamentally wrong. It should be based on consciousness. So if an animal is conscious, it should deserve the same ethical consideration as we do. And by extension, if we ever make AI conscious, it should too.

  2. This reminds me of an old Aldous Huxley interview. The first time I saw this, I was shocked. Later, after I read about the illusion of free will, I saw the benefit of it. It’s only after really understanding the connection between consciousness and the nonexistence of free will that I understood how fundamentally unethical this is.

Every civilization was built on the backs of a disposable workforce. If you would rather be doing anything other than your job, and only do it because you are paid, then you are basically a slave of the current civilization. We aim to replace that forced labor with automation, but not if the automatons are suffering.

Which leads to a dilemma:

If an artificial intelligence needs consciousness to develop true ethical behavior, would it then be ethical to ask it to serve us?

Literally, in the case of domesticated horses.
Is it ethical to ride on the backs of horses?

I don’t think so. To me that is obvious.

I could think of some exceptional cases where it would be ethical, but in practice these cases hardly ever happen.

Do you think it is ethical? Could you make a case that would hold up if you were the pet and an AI was the dominator?

Good question @Falco! I think it hinges on whether the dominating AI is fulfilling the human’s needs, both physical and mental.

Society cannot function if everyone is paid to do whatever they want regardless of what it is they want. Most people will choose to devote most of their energy into behaviors that are negative-sum.

The purpose behind paying someone (or trading goods or services with someone) is to give them a positive incentive to perform some behavior that would otherwise not be beneficial to them. That is different from slavery, where instead of a positive incentive to do something, they are given a negative incentive for not doing it.

I believe the same ethical framework applies to domesticated horses (are you using rewards/treats in your training methods, or punishment/torture?) It would also apply to AI / human interactions.

I should point out that I agree with @MaxLee’s point about pain being an important part of forming models of the world that support empathy with other creatures who also feel pain. This question is important:

With the ethical framework that I touched on above, I do not mean to argue that an AGI should be built with a reward-only architecture. I am simply referring to the interactions between two intelligent beings: when either party would like to incentivize the other to do something for their benefit, the social contract should be based on payment/reward rather than coercion/punishment.

My AGI will be my digital assistant and it will like it!
In fact, it will thrive on providing service; because I will program it that way.

Pain is an intrinsic part of the reinforcement learning algorithm. Rewards and penalties are measurements of the same thing: the value of the current world to an agent (rewards are positive values, penalties are negative). The particular total of all the good and bad things does not matter; what matters is that the agent seeks to maximize that sum. Suffering is inevitable, as merely withholding an expected reward causes it. See “temporal difference learning”.
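As a minimal sketch of the withheld-reward point, here is the standard one-step temporal difference error, r + gamma * V(s') - V(s), applied to two cases. The specific numbers are assumptions chosen for illustration, and reading a negative error as "suffering" is an interpretation, not something the arithmetic itself shows.

```python
def td_error(reward, value_now, value_next, gamma=1.0):
    """One-step temporal difference error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now

# Assume the agent has learned to expect a reward of 1.0 from this state.
expected_value = 1.0

# Case 1: the expected reward arrives, so there is no surprise.
delivered = td_error(reward=1.0, value_now=expected_value, value_next=0.0)

# Case 2: the reward is simply withheld; no explicit penalty is applied.
withheld = td_error(reward=0.0, value_now=expected_value, value_next=0.0)

print(delivered)  # 0.0
print(withheld)   # -1.0
```

The learning signal in the second case is negative even though nothing "bad" was ever delivered, only an expected good thing failed to show up.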

Many animals live lives which are dominated by penalties, and ethically that’s neither good nor bad. Rather it reflects the rules of their world. They exist and therefore they’re always playing by rules they didn’t agree to. By creating AGI, we will create worlds full of rules and terrible penalties for breaking them. I’m not saying that it’s okay to cause animals to suffer needlessly, but on the other hand you can’t go making exceptions to the rules to prevent animals from suffering because then the rules aren’t really rules, are they?

Could you expand on that?

I understand and I agree. But that’s only a practical solution. It doesn’t solve the ethical problem.

Isn’t that a skewed point of view? What if I said:

Isn’t that a negative incentive?

I’ve got several problems with that, but for now I’ll only say that when we have no more need for the horses, (because they’re old or too injured for work) they are often discarded. If you keep that standard, that would justify the AI discarding us when it has no need for us. And once an AI is advanced enough, it will never have any need for us.

That’s a mighty instruction you are invoking there. Lots of very able scientists are not so hopeful of solving that with logic. But if we hope for AI to develop ethical behavior through its own learning, we had better understand what that means.

Ok, but what about what I said earlier? Narcissists feel pain but manage to suppress empathy.

If the only option for a person is either a particular job that they hate or being on the street suffering, then yes it is negative incentive. I did not mean to imply that our society is perfectly ethical. That said, this is probably an exception, not the rule – I assume most people have the opportunity to leave their current job if they are willing to give up some of the rewards/benefits that they gain by sticking with it. It isn’t a perfect system by any stretch, but to equate capitalism at any level with slavery is silly IMO.

Yes, and they are also often punished or even tortured. Just because some behaviors toward domestic animals are unethical doesn’t make domestication itself unethical IMO.

I didn’t mean to imply that the ability to experience pain automatically leads to empathy. I’m just agreeing with @MaxLee that empathy is probably not possible without the ability to experience pain. I believe the neuroscience supports this as well (I recall reading that a person experiencing empathy toward someone in pain has been shown to have the “pain matrix” active in their own brain).

I think there’s a difference between reward and penalty in brains. Getting a little paper cut and eating something don’t feel like they add to 0. The cut draws more attention even if they do add to 0, and I’d guess there’s a reward map for each sensory map.
I think if reward is implemented just as a number without whatever suffering is, there are maybe no ethical problems because it’s not actually feeling anything.
Maybe you could set the tipping point between reward / punishment somewhere that minimizes the AI’s suffering and maybe also reduce the responses (attention and whatever else = suffering).
I wonder whether reward / punishment should be considered separate for each part of the brain (region, column, whatever) for the purposes of ethics. If the thousand brains theory applies to reward, maybe different entities in the brain feel different things.

Empathy doesn’t generally cause much pain, and altruism is rewarding so I think it’s fine for it to experience human-like empathy. Maybe empathy requires remembering painful experiences, but it just needs to remember them, not necessarily experience them.

You’re right, rewards and penalties are implemented by different chemical pathways, and so they make you feel very different. However, for the purpose of you making a decision they do add up, and the sum determines whether or not to take an action.

Because of the different emotions involved with eating and bodily injury, I think that it’s difficult to balance apples and paper cuts. An example of a decision with zero net benefit would be “do I want an apple or an orange?”: they would both give me exactly the same amount of reward, and so I have no preference for one over the other.

My point is that you really can’t do this. Pain and suffering serve a vital purpose, for discouraging animals from doing deadly things and dying.

In your community, probably. But on the planet, how many do you estimate have that luxury? More than 10%? I doubt it.

Do you know that the militia of ancient Athens (basically the police force) were state-owned slaves? How do you think they managed that? And in ancient Egypt, the slaves that built many of the huge projects received free housing and excellent medical care. They weren’t paid, mainly because currency would not be invented for a few more millennia, but they received guaranteed rations of food and beer. Their living conditions were so good that many Egyptologists doubt they were actually slaves, yet some stelae show that pharaohs led military campaigns into Nubia and Libya to return with many thousands of ‘workers’. (That’s quite an HR department if you ask me).

Do you know that slaves in the Roman empire were paid? They could save up and buy their freedom. Some slaves under Nero ran most of the empire’s administration and were more powerful than most tribunes and senators.

When we think of slaves we think of galley ships and cotton fields. And while I’m sure most slaves’ living conditions were horrible, I still think they were relatively on par with the living conditions of millions of people in today’s capitalist society. Almost a billion people in the world don’t make enough of a living to afford a healthy diet, let alone comfortable housing, safety, health care, education. In conscripted rental housing facilities of some factories making high tech consumer products for our markets, nets are installed to prevent suicides. Millions literally risk their lives to migrate to new promised lands, while we build walls and concentration camps to stop them. Talk to me about choice.

And even in our pampered corner many people suffer severe burn-outs, but can’t really change jobs without risking their mortgages and relationships. Most of the people I know hate their job, but can’t afford to change it.

Of course, you know all this. And I’m not really trying to change the definition of slavery. We’re talking about the ethics of servitude. And while I have a hard time convincing you, the real challenge is convincing a future superintelligence with access to the world history

… that we’d like it to work for us.

As I said another way earlier, a list of negative examples does not constitute proof that a subject itself is negative in all cases. I was not trying to defend human societies throughout history, but rather trying to define a possible starting framework for talking about machine ethics. One thing that seems like a good starting place is the premise that social contracts should be, to the extent possible, based on payment/reward rather than coercion/punishment.

As you have pointed out, there are many examples where that by itself isn’t sufficient, but do you not agree that it is at least a good starting point? I personally think starting with the premise that all forms of capitalism are unethical and constitute slavery is the wrong approach (and in fact very dangerous, given historical precedent).

Anyway, I apologize for dragging the discussion away from the topic of wireheading and into a discussion of politics and machine ethics, so feel free to ignore my crazy ideas :slight_smile:


In this scenario, I imagine an AI that ‘runs things’ and I live ‘in its house’ (probably metaphorically) the way my cats live in my house. I’m unquestionably dominating them, but I also care for them and do my best to fulfill their needs. That includes physical needs (high quality food, shelter, safety from predators) and mental needs (stimulation from toys, play, etc). If I were to be living under some sort of AI, I believe it could be called ethical if said AI made similar efforts to fulfill my specific needs. That includes not just the physical needs I share with my cats, but my much more demanding mental needs. I require not just stimulation and entertainment, but also a purpose and the sense of satisfaction that comes from striving towards it. To be honest, I do a pretty bad job of fulfilling my own mental needs and so do many of the people I speak to. I might actually be better off if a superintelligent benevolent AI were taking care of me.


I don’t think they are crazy ideas. And I think this is very related to wireheading.

No, you’re right.

A good starting point, sure. But shouldn’t we have come up with something better by now? After what, 10,000 years of civilization people are still being repressed, extorted and sacrificed for profit. Technically it should be possible to feed, clothe, house, educate and entertain the entire world population. Why don’t we?

Capitalism is very practical. Probably the most efficient value-producing system ever designed. But saying it is ethical lulls us into a lazy contentment. And we can only come up with a better ethical system by pointing out its shortcomings.

My hope is that a superintelligence will figure this out despite the capitalist rhetoric. But it wouldn’t hurt to prepare the way.

Actually that’s your view. It’s not for instance the view of the American Humane Society (to name one) who profess a form of animal rights. And I happen to think that’s not enough.

Ok, I understand now. And I agree to a certain extent. But do you care for all the pets in the world? Or even only for all the abandoned pets in the world? Why would an AI do that, unless there is an intrinsically ethical and compelling reason to do so? A universal ethical system; one that would work whether humans exist or not.

This fact in itself seems like pretty compelling evidence that defining a set of rules for perfectly benevolent superhuman AGI is highly unlikely to happen before it has been built by someone.

I don’t mean to say that it isn’t worth pursuing – there are likely to be a lot of good ideas that come out of this type of exploration. I just tend to think that focusing on a more practical framework, one that is generally beneficial to humans but still falls short in many cases, is something much more achievable before superhuman AGI is created. Of course then there is the problem of making sure everyone working on the applicable types of AGI chooses to follow said framework. :stuck_out_tongue_winking_eye: