AI forecasting & strategy at AI Impacts. Blog: Not Optional.
(I agree in part, but (1) planning for far/slow worlds is still useful, and (2) I meant something more like "metrics or model evaluations are part of an intervention, e.g. incorporated into safety standards" than "metrics inform what we try to do.")
How is that relevant? It's about whether AI risk will be mainstream. I'm thinking about governance interventions by this community, which don't require the rest of the world to appreciate AI risk.
(ARK Invest source is here, and they basically got it from the addendum to AI and Compute.)
Four kinds of actors/processes/decisions are directly very important to AI governance:
Related: How technical safety standards could promote TAI safety.
("Safety standards" sounds prosaic but it doesn't have to be.)
You’re right that we exclude universes obviously teeming with life. But we (roughly) upweight universes with lots of human-level civilizations that don’t see each other, or where civilizations that don't see one another are likely to appear.
The largest source of uncertainty is factor fl, the fraction of habitable planets with life. . . . The fact that it did occur here doesn’t give us information about fl other than the fact that fl is not exactly zero due to anthropic bias—the observation that we exist would be the same whether life on Earth was an incredibly rare accident or whether it was inevitable.
I think the latter sentence here is a strong claim and a controversial assumption. In particular, I disagree; I favor the self-indication assumption and its apparent implication that we should weight a possible universe by the number of experiences identical to ours in it, so (roughly) weight a possible universe by the number of planets in it where human-level civilization appears.
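The SIA-style weighting described here can be sketched numerically. This is a minimal illustration, not an anthropics result: the two hypotheses, their priors, and the civilization counts below are all made up for the example.

```python
# Sketch of self-indication-assumption (SIA) style updating: weight each
# candidate universe by its prior times the number of observers like us
# (here, planets where human-level civilization appears) it contains.
# The hypotheses and numbers are hypothetical, chosen only for illustration.

priors = {"life_is_rare": 0.5, "life_is_common": 0.5}
civilizations = {"life_is_rare": 1, "life_is_common": 1_000_000}

# Unnormalized SIA weight = prior * number of observers in that universe.
unnormalized = {h: priors[h] * civilizations[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: w / total for h, w in unnormalized.items()}

print(posteriors)
```

Under this weighting, nearly all credence shifts to the observer-rich hypothesis, which is the sense in which SIA pushes against "life on Earth was an incredibly rare accident."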
+1 to recording beliefs.
More decision-relevant than propositions about superintelligence are propositions about something like the point of no return, which is probably a substantially lower bar.
(Writing quickly and without full justification.)
This post might say a thing that's true, but I think the "illustrative warning about artificial intelligence" totally still stands. The warning, I think, is that selecting for inclusive fitness doesn't give you robust inclusive-fitness-optimizers; at least at human-level cognitive capabilities, changing/expanding the environment can cause humans' (mesa-optimizers') alignment to break pretty badly. I don't think you engage with this-- you claim "humans are actually weirdly aligned with natural selection" when we consider an expansive sense of "natural selection." I think this supports claims like "eventually AI will be really good at existing/surviving," not "AI will do something reasonably similar to what we want it to do or trained it to do."
I feel like there's confusion in this post between group-level survival and individual-level fitness, but I don't want to try to investigate that now. (Edit: I totally agree with gwern's reply, but I don't think it engages with Katja's cruxes, so there's more understanding-of-Katja's-beliefs to do.)
I think we can theoretically get around 4 by comparing the value of AI stocks to non-AI stocks.
I think an additional problem is that we don't have a no-AGI baseline to compare prices to-- we can see that Nvidia is worth $400B, but we can't directly tell how much of that reflects expected value from an AI boom.
AI risk decomposition based on agency or powerseeking or adversarial optimization or something
Epistemic status: confused.
Some vague, closely related ways to decompose AI risk into two kinds of risk:
The central reason to worry about powerseeking/whatever AI, I think, is that sufficiently (relatively) capable goal-directed systems instrumentally converge to disempowering you.
The central reason to worry about non-powerseeking/whatever AI, I think, is failure to generalize correctly from training-- distribution shift, Goodhart, "you get what you measure."