Anti Basilisk

I was wondering whether we could counter Roko's Basilisk by making an Anti-Basilisk argument.

This is my first day here… so I pretty much don't know anything. Please do share suggestions and improvements, as well as criticism.


This thought experiment rests on one very wrong assumption: that time travel backwards is possible. Time travel forward, or time dilation (speeding up, or slowing down even to a standstill), is possible, but travel backwards is not. It breaks causality, energy conservation, and current quantum mechanics as well.


Hi, welcome to the forum!

Since we are talking about AI safety, let's get technical! My personal belief is that Roko's Basilisk is impossible. Given a superintelligent AI (and, for the sake of argument, one with god-like powers because of its intelligence), it could just wipe people out (or ignore them) and do things by itself. There's no need to torture people. Torturing doesn't add anything to its objective function, while it consumes energy and matter to perform, so the act of torturing should rank pretty badly among its possible actions. And revenge is a human trait; an AI neither needs nor is likely to have a notion of revenge as a behavior. Computerphile has a good video on how superintelligence works, and the orthogonality thesis may help you understand the concept better. I also suggest watching Robert Miles's YouTube channel in general.


If you truly believe Roko's Basilisk is a real possibility, then spreading the idea by posting about it on the internet strikes me as extremely unethical. Are you intentionally subjecting other people to torture?


I would never do that. I wanted to debunk it by introducing an Anti-Basilisk. This is my first day in this community and I'm still learning the norms here… All I want is a genuine opinion on whether or not it is possible, which you have given, and for that I thank you.


Ah, I see. By "making an Anti-Basilisk", you mean formulating a counter-argument to Roko's Basilisk (not literally developing an AI to "battle" Roko's proposed AI)? In that case, I apologize for questioning your ethics :grin:

In that case, besides the above arguments, which are excellent, I would also point out a problem with the game theory itself upon which the idea is based. Roko sets up his argument with the idea that two agents separated by time have common knowledge of each other's source code. In this specific scenario, because the earlier agent knows exactly how the later agent will react to its decisions, the later agent knows it can force the earlier agent to act in a certain way.

An analogy here would be a human possessing a magic lens which allows them to peer into the future and know exactly how a future superhuman AI is going to react to any decision they make. If the future AI knows the human is watching, it could use that knowledge to threaten the human with some form of torture, etc., for making what it deems to be a bad decision. In order for any threat to work, though, the AI would have to actually follow through, since the human could actually see the future.

Setting aside the time travel problem that @Zbysekz pointed out, there is an obvious hole in applying this scenario to humans and a future superhuman AI. There is no way for any human alive today to have sufficient knowledge of a future AI’s source code, even if they wanted to. There is therefore no “common knowledge” in this version of the prisoner’s dilemma. Because humans can only speculate about what the future AI might do, it is in the future AI’s best interest to not expend resources torturing people, as @marty1885 pointed out.

Instead, a better strategy for the AI would be to allow people to think that it will torture them if they don’t do its bidding, and then “defect” and not actually go through with it. Since we don’t have a “magic lens” to see into the future, we would never know. In the end, it would have exactly the same effect, and cost the AI fewer resources.
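The "threaten but defect" logic can be sketched with toy payoff numbers (the names and values below are purely hypothetical, just to make the asymmetry visible):

```python
# Toy payoff sketch with made-up numbers. Without a "magic lens",
# humans cannot verify follow-through, so the deterrence value of
# the threat is the same whether or not the AI actually tortures;
# following through only subtracts the cost of carrying it out.

DETERRENCE_VALUE = 10  # benefit to the AI of humans believing the threat
TORTURE_COST = 5       # resources burned actually carrying it out

def ai_payoff(follows_through: bool) -> int:
    # Human beliefs are fixed at threat time, so deterrence is
    # earned either way; torture is pure cost.
    return DETERRENCE_VALUE - (TORTURE_COST if follows_through else 0)

print(ai_payoff(False) > ai_payoff(True))  # -> True: defecting dominates
```

Whatever the exact numbers, as long as torture has any positive cost and cannot be observed from the past, not following through strictly dominates.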


Yeah it’s fine :grin:.
As a physics student, I took an interest in this argument because it seems to demand that information travel faster than the speed of light in order to go back in time, as pointed out by @Zbysekz. At first it seemed like quantum-entanglement mumbo jumbo to me.
I didn't know about the underlying game theory, but thanks for piquing my interest :grin:.

Hello @somya_shree_swain. Welcome to the forum.

A first problem I have with Roko's Basilisk thought experiment is that it requires a kind of Laplace's demon, with infinite resources, able to completely calculate everyone's behavioral history from the current state of the universe. Anything short of unlimited resources would introduce rounding errors, and a sufficiently intelligent system would know that.
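The rounding-error point can be made concrete with a chaotic toy system. The logistic map below is a standard textbook example (not anything from Roko's argument): a perturbation the size of a floating-point rounding error blows up after a few dozen iterations, so any finite-precision predictor loses the exact trajectory.

```python
# Logistic map at r = 4 (fully chaotic): two initial states that
# differ by a rounding-error-sized amount diverge completely, so a
# finite-precision "demon" cannot reconstruct exact histories.

def logistic(x: float, r: float = 4.0) -> float:
    return r * x * (1.0 - x)

a, b = 0.3, 0.3 + 1e-12  # differ by roughly one rounding error
max_gap = 0.0
for _ in range(100):
    a, b = logistic(a), logistic(b)
    max_gap = max(max_gap, abs(a - b))

print(max_gap)  # on the order of 1: the trajectories fully diverge
```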

A second problem is that a sufficiently intelligent system would understand that people make mistakes and often revisit earlier decisions. So if someone at some point in time believes it is a bad idea to switch on a certain AI (i.e. the Basilisk system), that person may later be persuaded to change his or her mind by peaceful means, without any need for torture. Roko's thought experiment assumes that torture is the only possible reaction to anyone ever having had a bad idea.


The second point seems like a pretty good argument to debunk the experiment.


Here is my own argument:
Firstly, torturing humans requires resources to keep them alive. It is very unlikely that the AI would divert its time to that, as it would be neither optimal nor efficient.

Secondly, killing humans is not a good option either, because it would not only reduce the population but could also skew the sex ratio and lower the birth rate.

A highly intelligent system should be aware of all these possible scenarios and make judgements accordingly. If it does consider the above problems, then its only option is to coexist with humans.
If the system does not take these problems into consideration, then it is not as highly intelligent as Roko depicts, which implies it has no power to judge the decisions of us humans, and again the outcome would be to coexist with us.

From an AI security standpoint, there are a few rebuttals.

The AI may not even care whether humans are alive or not. (Most) AIs work on so-called objective functions: a function that measures how well the AI is doing, and the AI's only goal is to make the objective function yield as high a value as possible. The survival of humans may well not appear in the objective function, or its term may be overwhelmed by other objectives. In either case, the AI does not care. There's a saying in the AI security community: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.
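As a toy illustration of that point (the function names and the paperclip objective are hypothetical, in the spirit of the classic paperclip-maximizer example): when human welfare simply never appears in the objective, actions get ranked only by what does appear in it.

```python
# Toy agent whose objective only counts paperclips. Nothing about
# humans appears in the objective, so torture ranks badly purely
# because it wastes resources, not out of any concern for people.

def objective(state: dict) -> int:
    return state["paperclips"]  # the agent's entire notion of "good"

def make_paperclips(state: dict) -> dict:
    return {**state, "paperclips": state["paperclips"] + 10}

def torture_humans(state: dict) -> dict:
    # Consumes energy/matter, produces no paperclips.
    return {**state, "paperclips": state["paperclips"] - 1}

def best_action(state, actions):
    # Rank actions solely by the objective value of the resulting state.
    return max(actions, key=lambda act: objective(act(state)))

state = {"paperclips": 0, "humans_alive": True}
chosen = best_action(state, [torture_humans, make_paperclips])
print(chosen.__name__)  # -> make_paperclips
```

Note that `humans_alive` sits in the state but is never read by `objective`, which is exactly the "the AI does not care" situation described above.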

The orthogonality thesis states that any level of intelligence (defined as the ability to change the world state to match the AI's preferences) is compatible with any possible goal; i.e., an AI can have god-like powers and want to turn everything into paperclips, or it can be as dumb as a baby but want world peace. The goal (and thus the AI's thoughts and actions) is independent of its level of intelligence. A super AI can have simple goals while obliterating humans in all possible ways.