I think this is a very important distinction. I prefer to use "maximizer" for "timelessly" finding the highest value of an objective function, and reserve "optimizer" for the kind of stepwise improvement discussed in this post. As I use the terms, to maximize something is to find the state with the highest value, but to optimize it is to take an initial state and find a new state with a higher value. I recognize that "optimize" and "optimizer" are sometimes used the way you're saying, as basically synonymous with "maximize" / "maximizer", and I could retreat to calling the inherently temporal thing I'm talking about an "improver" (or an "improvement process" if I don't want to reify it), but this actually seems less likely to be quickly understood, and I don't think it's all that useful for "optimize" and "maximize" to mean exactly the same thing.

(There is a subset of optimizers as I (and this post, although I think the value should be graded rather than binary) use the term that in the limit reach the maximum, and a subset of those that even reach the maximum in a finite number of steps, but optimizers that e.g. get stuck in local maxima aren't IMO thereby not actually optimizers, even though they aren't maximizers in any useful sense.)
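To make the distinction concrete, here is a minimal sketch, assuming a small discrete state space. All names (`maximize`, `optimize_step`, `optimize`, `VALUES`, `neighbors`) are hypothetical and chosen only for illustration. The maximizer inspects every state, while the optimizer only ever moves from its current state to a strictly better neighbor.

```python
def maximize(objective, states):
    """'Timeless' maximizer: return the state with the highest value."""
    return max(states, key=objective)


def optimize_step(objective, state, neighbors):
    """One improvement step: move to the best neighbor if it is strictly
    better than the current state, otherwise stay where we are."""
    best = max(neighbors(state), key=objective)
    return best if objective(best) > objective(state) else state


def optimize(objective, state, neighbors, max_steps=1000):
    """Repeat improvement steps until no neighbor is better."""
    for _ in range(max_steps):
        new_state = optimize_step(objective, state, neighbors)
        if new_state == state:
            break
        state = new_state
    return state


# Hypothetical objective over states 0..10, with a local maximum at
# state 2 and the global maximum at state 8.
VALUES = [0, 1, 3, 2, 1, 0, 2, 5, 9, 4, 1]


def objective(i):
    return VALUES[i]


def neighbors(i):
    """Adjacent states that lie inside the state space."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(VALUES)]


global_best = maximize(objective, range(len(VALUES)))  # state 8
from_left = optimize(objective, 0, neighbors)          # stuck at state 2
from_right = optimize(objective, 6, neighbors)         # reaches state 8
```

Starting from state 0, the stepwise process improves at every step and still ends at the local maximum, which is the sense in which it is an optimizer without being a maximizer.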

Good post; this has way more value per minute spent reading and understanding it than the first 6 chapters of Jaynes, IMO.

There were 20 destroyed walls and 37 intact walls, leading to 10 − 3×20 − 1×37 = 13db

This appears to have an error; 10 − 3×20 − 1×37 = 10 - 60 - 37 = -87, not 13. I think you meant for the 37 to be positive, in which case 10 - 60 + 37 = -13, and the sign is reversed because of how you phrased which hypothesis the evidence favors (although you could also just reverse all the signs if you want the arithmetic to come out perfectly).
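The three readings of that expression can be checked directly. This is only a sketch of the arithmetic; the variable names are mine, not the post's.

```python
base_db = 10                 # the leading 10 in the quoted expression
destroyed, intact = 20, 37   # wall counts from the quoted sentence

# The expression exactly as written in the post.
as_written = base_db - 3 * destroyed - 1 * intact        # -87

# Same expression with the intact-wall term made positive.
intact_positive = base_db - 3 * destroyed + 1 * intact   # -13

# Every sign reversed, so the result matches the stated 13.
all_reversed = -base_db + 3 * destroyed - 1 * intact     # 13
```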

Also, nitpick, but

and every 3 db of evidence increases the odds by a factor of 2

should have an "about" in it, since 10^(3/10) is ~1.99526231497, not 2. (3db ≈ 2× is a very useful approximation, and implied by 10^3 ≈ 2^10, but encountering it indirectly like this would be very confusing to anyone who isn't already familiar with it.)
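A quick check of how close the approximation is, assuming the usual convention that n decibels corresponds to an odds factor of 10^(n/10). The helper name `db_to_odds_factor` is hypothetical.

```python
import math


def db_to_odds_factor(db):
    """Convert decibels of evidence into a multiplicative odds factor."""
    return 10 ** (db / 10)


factor_for_3db = db_to_odds_factor(3)      # about 1.995, not exactly 2
db_for_doubling = 10 * math.log10(2)       # about 3.0103 dB doubles the odds

# The near-equality behind the rule of thumb: 10^3 = 1000, 2^10 = 1024.
ratio = 2 ** 10 / 10 ** 3                  # 1.024
```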

I re-read this, and wanted to strong-upvote it, and was disappointed that I already had. This is REALLY good. Way better than the thing it parodies (which was already quite good). I wish it were 10x as long.

The way that LLM tokenization represents numbers is all kinds of stupid. It's honestly kind of amazing to me they don't make even more arithmetic errors. Of course, an LLM can use a calculator just fine, and this is an extremely obvious way to enhance its general intelligence. I believe "give the LLM a calculator" is in fact being used, in some cases, but either the LLM or some shell around it has to decide when to use the calculator and how to use the calculator's result. That apparently didn't happen or didn't work properly in this case.

Thanks for your reply. "70% confidence that... we have a shot" is slightly ambiguous - I'd say that most shots one has are missed, but I'm guessing that isn't what you meant, and that you instead meant 70% chance of success.

70% feels way too high to me, but I do find it quite plausible that calling it a rounding error is wrong. However, with a 20 year timeline, a lot of people I care about will almost definitely still die who need not have died if death were Solved, and with very much non-negligible probability that group includes me. And as you note downthread, the brain is a really deep problem for prosaic life extension. Overall I don't see how anything along these lines can be fast enough and certain enough to be a crux on AI for me, but I'm glad people are working on it more than is immediately apparent to the casual observer. (I'm a type 1 diabetic and would have died at 8 years old if I'd lived before insulin was discovered and made medically available, so the value of prosaic life extension is very much not lost on me.)

P.S. Having this set of values and beliefs is very hard on one's epistemics. I think it's a writ-large version of what Eliezer has stated as "thinking about AI timelines is bad for one's epistemics". Here are some examples:

(1) Although I've never been at all tempted by e/acc techno-optimism (on this topic specifically) / alignment isn't a problem at all / alignment by default, boy, it sure would be nice to hear about a strategy for alignment that didn't sound almost definitely doomed for one reason or another. Even though Eliezer can (accurately, IMO) shoot down a couple of new alignment strategies before getting out of bed in the morning. So far I've never found myself actually doing it, but it's impossible not to notice that if I just weren't as good at finding problems or as willing to acknowledge problems found by others, then some alignment strategies I've seen might have looked non-doomed, at least at first...

(2) I don't expect any kind of deliberate slowdown of making AGI to be all that effective even on its own terms, with the single exception of indiscriminate "tear it all down", which I think is unlikely to get within the Overton window, at least in a robust way that would stop development even in countries that don't agree (forcing someone to sabotage / invade / bomb them). Although such actions might buy us a few years, it seems overdetermined to me that they still leave us doomed, and in fact they appear to cut away some of the actually-helpful options that might otherwise be available (the current crop of companies attempting to develop AGI definitely aren't the least concerned with existential risk of all actors who'd develop AGI if they could, for one thing). Compute thresholds of any kind, in particular, I expect to lead to much greater focus on doing more with the same compute resources rather than doing more by using more compute resources, and I expect there's a lot of low-hanging fruit there since that isn't where people have been focusing, and that the thresholds would need to decrease very much very fast to actually prevent AGI, and decreasing the thresholds below the power of a 2023 gaming rig is untenable. I'm not aware of any place in this argument where I'm allowing "if deliberate slowdowns were effective on their own terms, I'd still consider the result very bad" to bias my judgment. But is it? I can't really prove it isn't...

(3) The "pivotal act" framing seems unhelpful to me. It seems strongly impossible to me for humans to make an AI that's able to pass strawberry alignment yet has so little understanding of agency that it couldn't, if it wanted to, seize control of the world. (That kind of AI is probably logically possible, but I don't think humans have any real possibility of building one.) An AI that can't even pass strawberry alignment clearly can't be safely handed "melt all the GPUs" or any other task that requires strongly superhuman capabilities (and if "melt all the GPUs" were a good idea, and it didn't require strongly superhuman capabilities, then people should just directly do that). So, it seems to me that the only good result that could come from aiming for a pivotal act would be that the ASI you're using to execute it is actually aligned with humans and "goes rogue" to implement our glorious transhuman future; and it seems to me that if that's what you want, it would be better to aim for that directly rather than trying to fit it through this weirdly-shaped "pivotal act" hole.

But... if this is wrong, and a narrow AGI could safely do a pivotal act, I'd very likely consider the resulting world very bad anyway, because we'd be in a world where unaligned ASI has been reliably prevented from coming into existence, and if the way that was done wasn't by already having aligned ASI, then by far the obvious way for that to happen is to reliably prevent any ASI from coming into existence. But IMO we need aligned ASI to solve death. Does any of that affect how compelling I find the case for narrow pivotal-act AI on its own terms? Who knows...

I agree with the Statement. As strongly as I can agree with anything. I think the hope of current humans achieving... if not immortality, then very substantially increased longevity... without AI doing the work for us, is at most a rounding error. And ASI that was even close to aligned, that found it worth reserving even a billionth part of the value of the universe for humans, would treat this as the obvious most urgent problem and solve death pretty much if there's any physically possible way of doing so. And when I look inside, I find that I simply don't care about a glorious transhumanist future that doesn't include me or any of the particular other humans I care about. I do somewhat prefer being kind / helpful / beneficent to people I've never met, very slightly prefer that even for people who don't exist yet, but it's far too weak a preference to trade off against any noticeable change to the odds of me and everyone I care about dying. If that makes me a "sociopath" in the view of someone or other, oh well.

I've been a supporter of MIRI, AI alignment, etc. for a long time, not because I share that much with EA in terms of values, but because the path to the future having any value has seemed for a long time to route through our building aligned ASI, which I consider as hard as MIRI does. But when the "pivotal act" framing started being discussed, rather than actually aligning ASI, I noticed a crack developing between my values and MIRI's, and the past year with advocacy for "shut it all down" and so on has blown that crack wide open. I no longer feel like a future I value has any group trying to pursue it. Everyone outside of AI alignment is either just confused and flailing around with unpredictable effects, or is badly mistaken and actively pushing towards turning us all into paperclips, but those in AI alignment are either extremely unrealistically optimistic about plans that I'm pretty sure, for reasons that MIRI has argued, won't work; or, like current MIRI, they say things like that I should stake my personal presence in the glorious transhumanist future on cryonics (and what of my friends and family members who I could never convince to sign up? What of the fact that, IMO, current cryonics practice probably doesn't even prevent info-theoretical death, let alone give one a good shot at actually being revived at some point in the future?)

I happen to also think that most plans for preventing ASI from happening soon, that aren't "shut it all down" in a very indiscriminate way, just won't work - that is, I think we'll get ASI (and probably all die) pretty soon anyway. And I think "shut it all down" is very unlikely to be societally selected as our plan for how to deal with AI in the near term, let alone effectively implemented. There are forms of certain actors choosing to go slower on their paths to ASI that I would support, but only if those actors are doing that specifically to attempt to solve alignment before ASI, and only if it won't slow them down so much that someone else just makes unaligned ASI first anyway. And of course we should forcibly stop anyone who is on the path to making ASI without even trying to align it (because they're mistaken about the default result of building ASI without aligning it, or because they think humanity's extinction is good actually), although I'm not sure how capable we are of stopping them. But I want an organization that is facing up to the real, tremendous difficulty of making the first ASI aligned, and trying to do that anyway, because no other option actually has a result that they (or I) find acceptable. (By the way, MIRI is right that "do your alignment homework for you" is probably the literal worst possible task to give to one's newly developed AGI, so e.g. OpenAI's alignment plan seems deeply delusional to me and thus OpenAI is not the org for which I'm looking.)

I'd like someone from MIRI to read this. If no one replies here, I may send them a copy, or something based on this.

Yes he should disclose somewhere that he's doing this, but deepfakes with the happy participation of the person whose voice is being faked seems like the best possible scenario.

Yes and no. The main mode of harm we generally imagine is to the person deepfaked. However, nothing prevents the main harm in a particular incident of harmful deepfaking from being to the people who see the deepfake and believe the person depicted actually said and did the things depicted.

That appears to be the implicit allegation here - that recipients might be deceived into thinking Adams actually speaks their language (at least well enough to record a robocall). Or at least, if that's not it, then I don't get it either.

I've seen a lot of attempts to provide "translations" from one domain-specific computer language to another, and they almost always have at least one of these properties:

  1. They aren't invertible, nor "almost invertible" via normalization
  2. They rely on an extension mechanism intentionally allowing the embedding of arbitrary data into the target language
  3. They use hacks (structured comments, or even uglier encodings if there aren't any comments) to embed arbitrary data
  4. They require the source of the translation to be normalized before (and sometimes also after, but always before) translation

I don't think (2) and (3) are great here. If there are blobs of data in the translated version that I can't understand, but that are necessary for the original sender to interpret the statement, it isn't clear how I can manipulate the translated version while keeping all the blobs correct. Plus, as the recipient, I don't really want to be responsible for safely maintaining and manipulating these blobs.

(1) is clearly unworkable (if there's no way to translate back into the original language, there can't be a conversation). That leaves (4), which requires stripping anything that can't be represented in an invertible way before translating. E.g., if I have lists but you can only understand sets, and assuming no nesting, I may need to sort my list and remove duplicates from it as part of normalization. This deletes real information! It's information that the other language isn't prepared to handle, so it needs to be removed before sending. This is better than sending the information in a way that the other party won't preserve even when performing only operations they consider valid.

I think this applies to the example from the post, too - how would I know whether certain instances of double negation or provability were artifacts that normalization is supposed to strip, or just places where someone wanted to make a statement about double negation or provability?
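The list-to-set case above can be sketched as follows, assuming flat lists of comparable items. The function names (`normalize`, `to_set_language`, `from_set_language`) are hypothetical stand-ins for the two "languages", not anything from the post.

```python
def normalize(items):
    """Strip what a set cannot represent: order and duplicates."""
    return sorted(set(items))


def to_set_language(normalized_items):
    """Translate a normalized list into the receiver's representation."""
    return frozenset(normalized_items)


def from_set_language(s):
    """Translate back; sorting reproduces the normalized list form."""
    return sorted(s)


original = [3, 1, 3, 2]
normalized = normalize(original)                              # [1, 2, 3]
round_trip = from_set_language(to_set_language(normalized))   # [1, 2, 3]

# Translation is invertible on normalized input...
inverts_on_normalized = round_trip == normalized              # True
# ...but the original ordering and the duplicate 3 are gone for good.
lost_information = round_trip != original                     # True
```

The round trip only works because the information was deleted up front; nothing on the receiving side has to carry data it cannot interpret.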

Malbolge? Or something even nastier in a similar vein, since it seems like people actually figured out (with great effort) how to write programs in Malbolge. Maybe encrypt all the memory after every instruction, and use a real encryption algorithm, not a lookup table.
