Yudkowsky provides a very approachable explanation of how AI training works, and of the alignment problem in general, and for those reasons the book deserves a rating above one star.

The problem with the book is that the central claim of its subtitle – Why Superhuman AI Would Kill Us All – is asserted rather than argued. The book correctly illustrates how AI is “grown” rather than designed, how nobody actually understands how an AI “thinks”, and why that is a problem. It also does a good job of illustrating how, during training, regardless of the purpose of that training, an AI will naturally develop “wants” that may be misaligned with those of its creators. The authors then leap from that premise to the conclusion that said AI will naturally, in pursuit of those “wants”, kill all life on the planet. There is no room left for probability, or for the possibility that the AI might do otherwise; in Yudkowsky’s mind, whatever weird goals or “wants” a superintelligent AI develops, it will always, 100% guaranteed, kill all humans in their pursuit.
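For readers who haven’t seen what “growing” a model actually means, here is a minimal sketch of gradient descent, the process the book describes; the toy data, learning rate, and variable names are purely illustrative, not anything from the book:

```python
# A minimal sketch of "growing" a model by gradient descent: repeatedly
# nudge the model's numbers in whatever direction reduces error on the
# training data. The data and constants here are illustrative only.

# Toy training data: we want the model to learn y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]

w, b = 0.0, 0.0          # the model's "weights", initially meaningless
learning_rate = 0.01

for step in range(1000):
    # Accumulate the gradient of the mean squared error over the data.
    grad_w = grad_b = 0.0
    for x, y in data:
        error = (w * x + b) - y
        grad_w += 2 * error * x / len(data)
        grad_b += 2 * error / len(data)
    # Nudge each weight slightly "downhill"; nobody designs the result.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"learned w={w:.3f}, b={b:.3f}")  # approaches w=2, b=1
```

A frontier model is this same loop run over billions of weights and oceans of data, which is exactly why nobody can read its “wants” off the resulting numbers. So far, so good; the book and I agree up to this point.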

It is kind of astonishing how in one chapter the authors explain things through parallels to human behavior – how we overcome our genes in pursuit of ice cream – yet no such grace is given to the AI that’s presumably smarter than us, nor is there any reflection on how we dumb humans, in our pursuit of ice cream, have not turned all biomass on the planet into ice cream either, at least not yet. Yudkowsky simply cannot escape a purely adversarial relationship with AI, nor can he think about the world in any terms other than a zero-sum game – and we know the world at large is not a zero-sum game, because the completely soulless process of evolution somehow managed to produce altruism in various creatures. I suppose I should not be particularly surprised by such obvious blind spots coming from the founder of LessWrong.

I’m not a proponent of AI in the slightest, but I remain unconvinced of the dangers the book presents, or at the very least of the absolute certainty of those dangers. The much more realistic apocalypse we’re facing right now is climate change, a problem the whole AI datacenter craze, with its absurd water and energy demands, is actively adding to. That, together with the financial bubble we’re in, which threatens us with a massive global recession, is reason enough to rethink what we’re doing with AI. And no, we do not need a superintelligent AI to figure out how to solve climate change; we’ve already figured out how to solve it, we just need to start implementing the solutions.

If you don’t know anything about AI, there is some value in picking up this book. If you already know about gradient descent, I’d suggest your time might be better spent on Yudkowsky’s Harry Potter fanfic; at least that one is funny.