I'm happy you linkposted this so people could talk about it! The transcript above is extremely error-laden, though, to the extent I'm not sure there's much useful signal here unless you read with extreme care?
I've tried to fix the transcription errors, and posted a revised version at the bottom of this post (minus the first 15 minutes, which are meta/promotion stuff for Bankless). I vote for you copying over the Q&A transcript here so it's available in both places.
Do you know of any arguments with a similar style to The Most Important Century that are as pessimistic as EY/MIRI folks (>90% probability of AGI within 15 years)?
Wait, what? Why do you think anyone at MIRI assigns >90% probability to AGI within 15 years? That sounds wildly too confident to me. I know some MIRI people who assign 50% probability to AGI by 2038 or so (similar to Ajeya Cotra's recently updated view), and I believe Eliezer is higher than 50% by 2038, but if you told me that Eliezer told you in a private conversation "90+% within 15 years" I would flatly not believe you.
I don't think timelines have that much to do with why Eliezer and Nate and I are way more pessimistic than the Open Phil crew.
Thanks for posting this, Andrea_Miotti and remember! I noticed a lot of substantive errors in the transcript (and even more errors in vonk's Q&A transcript), so I've posted an edited version of both transcripts. I vote that you edit your own post to include the revisions I made.
Here's a small sample of the edits I made, focusing on ones where someone may have come away from your transcript with a wrong interpretation or important missing information (as opposed to, e.g., the sentences that are just very hard to parse in the original transcript because too many filler words and false starts to sentences were left in):
Gratitude to Andrea_Miotti, remember, and vonk for posting more-timely transcripts of this so LW could talk about it at the time -- and for providing a v1 transcript to give me a head start.
Here's a small sample of the edits I made to the previous Bankless transcript on LW, focusing on ones where someone may have come away from the original transcript with a wrong interpretation or important missing information (as opposed to, e.g., the sentences that are just very hard to parse in the original transcript because too many filler words and false starts to sentences were left in):
The Q&A transcript on LW is drastically worse, to the point that it might well reduce the net accuracy of readers' beliefs if they aren't careful? I won't try to summarize all the important fixes I made to that transcript, because there are so many. I also cut out the first 15 minutes of the Q&A, which are Eliezerless and mostly consist of Bankless ads and announcements.
But this seems to contradict the element of Non-Deception. If you're not actually on the same side as the people who disagree with you, why would you (as a very strong but defeasible default) role-play otherwise?
This is a good question!! Note that in the original footnote in my post, "on the same side" is a hyperlink to a comment by Val:
"Some version of civility and/or friendliness and/or a spirit of camaraderie and goodwill seems like a useful ingredient in many discussions. I'm not sure how best to achieve this in ways that are emotionally honest ('pretending to be cheerful and warm when you don't feel that way' sounds like the wrong move to me), or how to achieve this without steering away from candor, openness, 'realness', etc."
I think the core thing here is same-sidedness.
That has nothing to do directly with being friendly/civil/etc., although it'll probably naturally result in friendliness/etc.
(Like you seem to, I think aiming for cheerfulness/warmth/etc. is rather a bad idea.)
If you & I are arguing but there's a common-knowledge undercurrent of same-sidedness, then even impassioned and cutting remarks are pretty easy to take in stride. "No, you're being stupid here, this is what we've got to attend to" doesn't get taken as an actual personal attack because the underlying feeling is of cooperation. Not totally unlike when affectionate friends say things like "You're such a jerk."
This is totally different from creating comfort. I think lots of folk get this one confused. Your comfort is none of my business, and vice versa. If I can keep that straight while coming from a same-sided POV, and if you do something similar, then it's easy to argue and listen both in good faith.
I think this is one piece of the puzzle. I think another piece is some version of "being on the same side in this sense doesn't entail agreeing about the relevant facts; the goal isn't to trick people into thinking your disagreements are small, it's to make typical disagreements feel less like battles between warring armies".
I don't think this grounds out in simple mathematics that transcends brain architecture, but I wouldn't be surprised if it grounds out in pretty simple and general facts about how human brains happen to work. (I do think the principle being proposed here hasn't been stated super clearly, and hasn't been argued for super clearly either, and until that changes it should be contested and argued about rather than taken fully for granted.)
But why should we err at all? Should we not, rather, use as many carrots and sticks as is optimal?
"Err on the side of X" here doesn't mean "prefer erring over optimality"; it means "prefer errors in direction X over errors in the other direction". This is still vague, since it doesn't say how much to care about this difference; but it's not trivial advice (or trivially mistaken).
so when I see the brand name being used to market a particular set of discourse norms without a clear explanation of how these norms are derived from the law, that bothers me enough to quickly write an essay or two about it
Seems great to me! I share your intuition that Goodwill seems a bit odd to include. I think it's right to push back on proposed norms like these and talk about how justified they are, and I hope my list can be the start of a conversation like that rather than the end.
I do have an intuition that Goodwill, or something similar to Goodwill, plays an important role in the vast majority of human discourse that reliably produces truth. But I'm not sure why; if I knew very crisply what was going on here, maybe I could reduce it to other rules that are simpler and more universal.
Basically the fact LW has far more arguments for "alignment will be hard" compared to alignment being easy is the selection effect I'm talking about.
That could either be 'we're selecting for good arguments, and the good arguments point toward alignment being hard', or it could be a non-epistemic selection effect.
Why do you think it's a non-epistemic selection effect? It's easier to find arguments for 'the Earth is round' than 'the Earth is flat', but that doesn't demonstrate a non-epistemic bias.
I was also worried because ML people don't really think that AGI poses an existential risk, and that's evidence, in an Aumann sense.
... By 'an Aumann sense' do you just mean 'if you know nothing about a brain, then knowing it believes P is some Bayesian evidence for the truth of P'? That seems like a very weird way to use "Aumann", but if that's what you mean then sure. It's trivial evidence to anyone who's spent much time poking at the details, but it's evidence.
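For what it's worth, the Bayesian point being granted here is just (writing $B$ for "the brain believes $P$"; the notation is mine, not anything from the exchange):

$$\Pr(P \mid B) > \Pr(P) \;\Longleftrightarrow\; \Pr(B \mid P) > \Pr(B \mid \neg P),$$

i.e. the belief is evidence for $P$ exactly insofar as the brain is more likely to believe $P$ in worlds where $P$ is true.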
I think a more likely thing we'd want to stick around to do in that world is 'try to accelerate humanity to AGI ASAP'. "Sufficiently advanced AGI converges to human-friendly values" is weaker than "AGI will just have human-friendly values by default".
The verbatim statement is:
When he says "cryptographical systems", he's clarifying what he meant by "crypto" in the previous few clauses (this is a bit clearer from the video, where you can hear his tone). He often says stuff like this about cryptography and computer security; e.g., see the article Eliezer wrote on Arbital called Show me what you've broken:
See also So Far: Unfriendly AI Edition:
And Security Mindset and Ordinary Paranoia.