Yeah, I put this diagram in the first post:
I feel pretty strongly that “the AGI is doing something that nobody wants it to do” is worse than “the AGI is doing something that at least somebody wants it to do”.
You have to be awfully cynical to believe that a catastrophic accident with out-of-control AGIs that results in human extinction is better than the alternative. Right? (Granted, some people do believe that—this gets into a very weird topic involving so-called “s-risks” and “negative utilitarianism”. I don’t think those arguments are compelling but maybe let’s not go there.)
So my answer to “would you want an aligned AGI?” is “Yes, that seems less bad than the alternative.” Separately, I strongly endorse efforts to ensure that AGIs wind up being used for prosocial ends, whatever form those efforts take (though that’s not my own field of expertise).
And my answer to “It seems it is ourselves, not the AGI, that we need to learn to control.” is “Why can’t it be both?”