44. In Defense of Boring Doom: A Pessimist's Case for Better Pessimism
(With thanks and apologies to JSW.)
(Epistemic status: Prediction is very difficult, especially about the future; nonetheless, I shall not be stopped from pondering my orb. Also, I got Claude to write the title. It felt apposite. Thank you, Claude.)
Seems like everyone who thinks they're anyone has takes on timelines, p(Doom), and vaguely gearsy mechanisms of action, all wrapped up into a big sad and/or deluded bundle of prediction. In the interest of doing my work in public, here's my crack at that.
To get the boring numbers only mostly pulled out of my tail out of the way, my p(Doom) is somewhere between 65% and 90% on any given day. I think there's about a 6% chance of AGI by 2030; conditional on that, I think we're nearly certain (~98%) to fall into some flavor of ruin as a global civilization. I think there's something like an 82% chance of AGI by (say) 2055; conditional on that, I think it's more likely than not (~60%) that we fall into similar classes of ruin.
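(For the terminally curious, here's how those point estimates stack up arithmetically - a minimal sanity-check sketch, nothing more. It reads the ~60% as conditional on AGI arriving in the 2030-2055 window, matching the "3-in-4 worlds" framing further down, and it leaves the doom probability in late-or-no-AGI worlds as an explicit free parameter, since I haven't pinned that down.)

```python
# Sanity-check arithmetic for the numbers above. The reading of the ~60%
# conditional (taken here as P(ruin | AGI between 2030 and 2055)) and the
# residual doom probability in late-or-no-AGI worlds are assumptions,
# not outputs of any model.

p_agi_2030 = 0.06                # P(AGI by 2030)
p_ruin_given_early = 0.98        # P(ruin | AGI by 2030)

p_agi_2055 = 0.82                # P(AGI by ~2055)
p_ruin_given_mid = 0.60          # P(ruin | AGI in 2030-2055), assumed reading

p_agi_mid = p_agi_2055 - p_agi_2030  # ~0.76: the "3-in-4 worlds" below

floor = p_agi_2030 * p_ruin_given_early + p_agi_mid * p_ruin_given_mid
print(f"Doom contribution from AGI-by-2055 worlds: ~{floor:.0%}")  # ~51%

# Whatever lands the total in the 65-90% range has to come from the
# remaining ~18% of probability mass (late or no AGI) and from
# day-to-day drift in all of the inputs.
```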
By "doom" or "ruin" here, by the way, I mean some kind of incredibly unambiguously unhappy outcome where many billions of humans die, suffer terribly, and/or are permanently disempowered; the natural world is damaged beyond easy repair; and/or human society is irrevocably changed for the worse. Some examples: everyone dies, deliberately or as collateral damage; a few dozen people control all but a tiny fraction of wealth, power, and capacity for violence - much more so than today; life as usual sort of continues but risks like AI-supercharged terror attacks and getting got by personally targeted psychological attacks are a fact of life. Some nonexamples: we get the Shining Transhuman Future (obviously); AGI becomes a fact of life but ASI turns out to be impossible for some reason; business as usual approximately continues but humanity never takes the stars (which would be sad but not count as ruinous).
So why do I feel such dread about that 1-in-16 shot at AGI in the near term? I think that in such worlds, the reason we got AGI before 2030 was that LLMs + scale in data and compute + maybe some transformer variant + maybe a minor breakthrough in sample efficiency or quality synthetic data turned out to be all you need - which would mean my beliefs and models were badly mistaken - and that that just-pre-AGI research fed back into proper AGI in short order, and then likely into ASI shortly thereafter. This would have to come coupled with a continuation - or acceleration - of the current unsustainable-seeming VC investment in AI, and of the current corporate and geopolitical race to AGI, in which safety concerns would probably have been mostly sold off for speed. Worst of all, we don't live in a world especially well set up to care about the well-being of those beings of no particular economic value, from horseflies to the homeless. In a fast-AGI world, most of us would be joining them in the pits. And that's if the nascent AGI is a reasonably friendly one! An unfriendly one would get busy sparking a countervalue exchange between Washington and Beijing; an unaligned or improperly aligned one might pursue strange proxies for human flourishing or commandeer the productive economy to copy itself, both at the expense of human existence.
Thankfully, I don't think that super-duper-LLMs are particularly likely to produce AGI, especially in the near term. There are several weak and nebulous reasons why I think so, but the most important ones I see are these: the universe of existing language - including code - is far too neat compared to the unenumerable variety of physical things, or even of possible words, and mastery of language is far too impressive to humans relative to actual intelligence or reasoning ability (some things cannot be spoken of, not because they're secret, but because words have thus far failed to capture them); the training data for physical-world applications is far too sparse to train on for now; and we haven't yet seen much in the way of post-finetuning learning or, more generally, out-of-context learning. (And that's if the overheated AI economy doesn't turn out to have been a bubble in hindsight, popping around mid-2027. Calling it now. Fingers crossed for another Winter to buy time for the governance folks buying time for the technical crew buying time for the governance folks.)
What of the more distant future, then? In the 3-in-4 worlds where AGI arrives between 2030 and (say) 2055, I still think that our world is not currently set up to deliver good outcomes to a significant portion of living humans, a state of affairs unlikely to change, given the givens. First off, the same argument from replaceability and thus dispensability still applies; even the promise of a UBI would only speed disempowerment, and would surely be rescinded once automated security and surveillance forces cut the legs out from under the leverage of rebellion. Worse yet, that would be sufficient time for subtler forms of disempowerment to come into play. Microtargeted propaganda and superpersuasion would be just as lethal to global society as drones with hand grenades or synthetic mosquitoes bearing poison, if much slower; and even if the AGI is neither violent nor manipulative, the permanent loss of labor skills subsumed by automation, and the equally permanent loss of the career pyramid from the bottom up, would leave human society exquisitely fragile to disruptions to chip manufacture and telecommunications - to say nothing of interstate squabbles amplified by dueling AGIs, or of the effects of naturally increasing concentration of power and wealth on the kind of socioeconomic system we will find ourselves subject to. These numerous roads to ruin need not even overlap - they don't - and many existing proposals with longer time horizons fall into the trap of pushing probability mass around from one road to another rather than reducing it outright.
What of a solution focused on governance, policy, and control? Here's the major issue that I think torpedoes the entire class of strategies of the general form "get nation-state and corporate actors to agree to strategic limitation treaties": not only would AGI confer a decisive strategic advantage, it would also combine the difficulties of bioweapons and nuclear arms. Like bioweapons, it would need to be very carefully sequestered, kept safely out of contact with the public lest its influence spread or some bad actor misuse it - but that black box had better be extremely black, and that doesn't seem plausible in a world with major economic pressures towards automating everything that can be automated, which in turn requires deep and widespread engagement. Like nuclear arms, it would provide a decisive strategic advantage while running a substantial existential risk even to its creators - but its economic benefits far outstrip even "Atoms for Peace"-style fission power and the manufacture of medical isotopes, and any defection from such a limitation treaty would easily reap tens of trillions of dollars in economic value. Even if we suppose that nation-states would break from international anarchy in the face of a true existential threat, economic pressures would drive frontier labs to seek wealth and power in their place. A Pause would not hold; some quietly enterprising soul would become the nearest unblocked path, absent unconscionable and unsustainable levels of surveillance and control.
Neither would a purely technical and philosophical solution to AI alignment be sufficient to win the day. Just to start, there's the crucial question of who, exactly, the AGI is aligned to. Pick a single person and they become an unchallenged emperor; pick a small group and they become an oligarchy, a cabal unto themselves; pick the wrong group and you condemn billions to suffer under values alien to them; pick a nation-state or a corporate entity and you hand them the lightcone on a platter. It's probably too late to avoid handing control to such an already-powerful entity, too: as the late Bill Thurston (who was right about approximately everything) observed, to be funded by a military entity is to be beholden to it, to be managed by it in some dubiously aligned way, to grant it broad access, and ultimately to undergo a chilling effect on dissent. Then there's the issue of the alignment tax: a friendly AGI would necessarily be less effective than an unaligned one, because some tactics available to an unaligned AGI would not be available to a friendly one - most notably, maximally aggressive exfiltration and replication. Such an argument holds for risks from paperclippers, locusts, and griefers, all three.
Indeed, the most workable-looking solution would have three primary legs - a social one, a technical one, and a diplomatic one. It would require a restructuring of society to become more equitable and just, the better to avoid pushing anyone into the kind of pit where they have little to lose from unleashing hell; it would require a workable and self-correcting solution to alignment, along with a working theory of post-scarcity economic value and systems, the better to keep people provided for once their labor no longer commands a wage; and it would require the end of the nation-state as we know it, for a world at war with itself is a house divided, and shall not stand under arbitrary pressures from ASI. It might even benefit from transhuman measures, like widespread germline intelligence amplification. But we won't get any of that, let alone all of it; every week I have two or three conversations that all beat the same drum - that we must give up on a good solution in favor of a merely adequate one; let the billionaires become sovereign trillionaires, let nation-states do as they like to the masses, let unaligned AI do what it will. Rubbish and rot, but we are all of us eating from the garbage can of ideology.
(Alternative titles: "Fine, Fine, I'll Write Up My Take On AI Risks"; "Risk of Ruin 3 (2027)"; "Giving Your Timeline and p(Doom) is the Fashionable Thing These Days, Right?")
(The best of Claude's alternative titles: "My Contractually Obligated AI Risk Take (No One Asked For This)"; "Timelines, p(Doom), and Other Metrics We've Stopped Taking Seriously"; "I Regret to Inform You My p(Doom) Has Footnotes")