Tradeoff between generality and optimality with regard to AI alignment

On AI alignment, I sometimes fear that if a machine has to learn its values by observing humans and our behavior throughout history, we’re done for.

What I keep coming back to is this: even if we build machines that have to learn from scratch (akin to an infant), if they have some way to pool their accumulated knowledge centrally, then one day of human-equivalent experience across 1,000 machines would add up to multiple years of experience. Day 1 would be the infantile state; by day 2, the system would already be the equivalent of a three-and-a-half-year-old. At least in humans, that is the age at which an explosion of learning and curiosity sets in; imagine all the “why” questions arriving on day 2. After a week, even if it were limited to human learning rates (and there’s no reason it should be), we would be facing an intelligence with more accumulated experience than many individual humans. Now imagine even more input sources, and the composition of all those experiences.
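To make the scaling concrete, here is a rough back-of-the-envelope sketch in Python. The 1,000-machine count, the perfect pooling, and the human-equivalent learning rate are assumptions of the thought experiment, not claims about any real system; the point is only how quickly the pooled total grows.

```python
# Back-of-the-envelope sketch of the pooling arithmetic above.
# Assumptions (carried over from the thought experiment, not from any
# real system): 1,000 machines, perfect knowledge pooling, and a
# human-equivalent learning rate.
MACHINES = 1000
DAYS_PER_YEAR = 365.25

def pooled_years(elapsed_days: int, machines: int = MACHINES) -> float:
    """Total machine-days of experience, expressed in human years."""
    return elapsed_days * machines / DAYS_PER_YEAR

for day in (1, 2, 7):
    print(f"After day {day}: ~{pooled_years(day):.1f} years of pooled experience")

# After day 1: ~2.7 years
# After day 2: ~5.5 years
# After day 7: ~19.2 years
```

Even under these crude assumptions, the pooled total passes a typical adult lifetime of experience within a couple of weeks.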

An AI that learns from us that way would very much manifest the best and the worst of humanity. A lot could go wrong in those first few days; any misunderstanding would compound, and humanity would very quickly see reflected back the results of its own behavior and its treatment of others. I’m not sure yet whether we’ll like what we see.

The fact is, we would have little idea of what we were doing, and an entity developing at that speed might quickly dismiss us as stupid pets (a positive outcome) or dispose of us as a disease harming the planet (a less positive one). For this reason, I’m in the camp that says we should figure out how to approach a superintelligence before it comes into existence.