As I said, it's ridiculous to think someone in either the Google or OpenAI camp won't have more than $1 billion in training hardware in service for a single model (training many instances in parallel).
I think you're reading this condition incorrectly. The $1 billion would need to be spent for a single model. If OpenAI buys a $2 billion supercomputer but they train 10 models with it, that won't necessarily qualify.
I suspect the MMLU and MATH milestones are the easiest to achieve, and that they will probably be hit after a GPT-4-level model is specialized to perform well in mathematics, as Minerva was.
I think you're overconfident here. I'm quite skeptical that GPT-4 already got above 80% on every single task in the MMLU since there are 57 tasks and it got 86.4% on average. I'm also skeptical that OpenAI will very soon spend >$1 billion to train a single model, but I definitely don't think that's implausible. "Almost certain" for either of those seems wrong.
Assuming the British government gets a fair price for the hardware, and actually has the machine running prior to the bet end date, does this satisfy the condition?
That condition resolves on the basis of the cost of the training run, not the cost of the hardware. You can tell because we spelled out the full details of how to estimate costs, and it depends on the cost in FLOP for the training run.
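For intuition, here is a hedged back-of-the-envelope sketch of how a FLOP-based cost estimate can work (all numbers are hypothetical, not the bet's actual figures or resolution criteria):

```python
# Hypothetical back-of-the-envelope: estimate a training run's cost from its
# FLOP count and an assumed price-performance figure. Both inputs below are
# illustrative, not the bet's actual numbers.
total_flop = 3e25            # assumed compute used by the training run
flop_per_dollar = 2e16       # assumed effective FLOP purchasable per dollar
cost_usd = total_flop / flop_per_dollar
# roughly $1.5 billion for this hypothetical run
```

The point is that the condition tracks the compute actually consumed by one run, so a $2 billion machine amortized over many models could still leave each individual run well under the threshold.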
But honestly at this point I'm considering conceding early and just paying out, because I don't look forward to years of people declaring victory early, which seems to already be happening.
To be clear, I think I will lose, but I think this is weak evidence. The bet says that $1bn must be spent on a single training run, not a single supercomputer.
Did they reveal how GPT-4 did on every task in the MMLU? If not, it's not clear whether the relevant condition here has been met yet.
Well, to be fair, I don't think many people realized how weak some of these benchmarks were. It is hard to tell without digging into the details, which I regrettably did not either.
I'm not sure. It depends greatly on the rate of general algorithmic progress, which I think is unknown at this time. I think it is not implausible (>10% chance) that we will see draconian controls that limit GPU production and usage, decreasing effective compute available to the largest actors by more than 99% from the trajectory under laissez faire. Such controls would be unprecedented in human history, but justified on the merits, if AI is both transformative and highly dangerous.
It should be noted that, to the extent that more hardware allows for more algorithmic experimentation, such controls would also slow down algorithmic progress.
What is your source for the claim that effective compute for AI is doubling more than once per year? And do you mean effective compute in the largest training runs, or effective compute available in the world more generally?
A retrospective on this bet:
Having thought about each of these milestones more carefully, and having already updated towards short timelines months ago, I think it was really bad in hindsight to make this bet, even on medium-to-long timeline views. Honestly, I'm surprised more people didn't want to bet us, since anyone familiar with the relevant benchmarks probably could have noticed that we were making quite poor predictions.
I'll explain what I mean by going through each of these milestones individually:
Having not read the detailed results yet, I would be quite surprised if [Gato] performed better on language-only tasks than a pretrained language model of the same size...
In general, from a "timelines to risky systems" perspective, I'm not that interested in these sorts of "generic agents" that can do all the things with one neural net; it seems like it will be far more economically useful to have separate neural nets doing each of the things and using each other as tools to accomplish particular tasks, and so that's what I expect to see.
Do you still believ...
Can you provide an example (without naming people)?
Baumol-effect jobs where it is essential (or strongly preferred) that the person performing the task is actually a human being. So: therapist, tutor, childcare, that sort of thing.
Huh. Therapists and tutors seem automatable within a few years. I expect some people will always prefer an in-person experience with a real human, but if the price is too high, people are just going to talk to a language model instead.
However, I agree that childcare does seem like it's the type of thing that will be hard to automate.
My list of hard to automate jobs would probably include things like: plumber, carpet installer, and construction work.
I'm happy that Scott Sumner commented. I think his analysis is reasonable, and I roughly agree with what he said. My only major complaint is that I think he might have misread the extent to which my article was intended as a criticism of his policy recommendations as opposed to Eliezer's specific commentary. I think it's plausible that the new monetary policy had a modest but positive counterfactual impact on RGDP over several years. I just don't think that's the impression Eliezer gave in the book when he provided the example.
I think your critiques are great since you're thinking clearly about how this approach is supposed to work. At a high level my reply to your comment is something like, "I basically agree, but don't think that anything you mentioned is devastating. I'm trying to build something that is better than Bio Anchors, and I think I probably succeeded even with all these flaws."
That said, I'll address your points more directly.
My understanding is that the irreducible part of the loss has nothing (necessarily) to do with "entropy of natural text" and even less with "
training on whichever distribution does give human-level reasoning might have substantially different scaling regularities.
I agree again. I talked a little bit about this at the end of my post, but overall I just don't have any data for scaling laws on better distributions than the one in the Chinchilla paper. I'd love to know the scaling properties of training on scientific tasks and incorporate that into the model, but I just don't have anything like that right now.
Also, this post is more about the method rather than any conclusions I may have drawn. I hope this model can be updated with better data some day.
In the notebook, the number of FLOP to train TAI is deduced a priori. I basically just estimated distributions over the relevant parameters by asking what I'd expect from TAI, rather than taking into consideration whether those values would imply a final distribution that predicts TAI arrived in the past. It may be worth noting that Bio Anchors also does this initially, but it performs an update by chopping off some probability from the distribution and then renormalizing. I didn't do that yet because I don't know how to best perform the update.
Personally, I don't think a 12% chance that TAI already arrived is that bad, given that the model is deduced a priori. Others could reasonably disagree though.
But most science requires actually looking at the world. The reason we spend so much money on scientific equipment is because we need to check if our ideas correspond to reality, and we can't do that just by reading text.
I agree. The primary thing I'm aiming to predict using this model is when LLMs will be capable of performing human-level reasoning/thinking reliably over long sequences. It could still be true that, even if we had models that did that, they wouldn't immediately have a large scientific/economic impact on the world, since science requires a ...
I'll just note that the NGDP growth from 2013 to 2017 (when Inadequate Equilibria was published) was about 2% per year whereas RGDP went up by about 1% per year. This definitely makes me sympathetic to "they didn't go far enough" but I'm still not sympathetic to "they never tested my theory" since you'd still expect some noticeable large effects from the new policy if the old monetary policy was responsible for a multi-trillion real-dollar problem.
It sounds like you are saying that he was making claims about (2).
No, I think he was also wrong about the Bank of Japan's relative competence. I didn't argue this next point directly in the post because it would have been harder to argue than the other points I made, but I think Eliezer is just straight up wrong that the Bank of Japan was pursuing a policy prior to 2013 that made Japan forgo trillions of dollars in lost economic growth.
To be clear, I don't think that the Bank of Japan was following the optimal monetary policy by any means, and I curren...
I think the claims of the book along the lines of the following quote were definitely undermined in light of this factual error,
We have a picture of the world where it is perfectly plausible for an econblogger to write up a good analysis of what the Bank of Japan is doing wrong, and for a sophisticated reader to reasonably agree that the analysis seems decisive, without a deep agonizing episode of Dunning-Kruger-inspired self-doubt playing any important role in the analysis.
In particular, I think this error highlights that even sophisticated observers can ...
It does seem plausible that the Bank of Japan thing was an error. However, I don't think that would undermine his thesis.
I agree that this error does not substantially undermine the entire book, much less prove its central thesis false. I still broadly agree with most of the main claims of the book, as I understand them.
I disagree. Elsewhere in the chapter he says,
How likely is it that an entire country—one of the world’s most advanced countries—would forego trillions of dollars of real economic growth because their monetary controllers—not politicians, but appointees from the professional elite—were doing something so wrong that even a non-professional could tell? How likely is it that a non-professional could not just suspect that the Bank of Japan was doing something badly wrong, but be confident in that assessment?
and later he says,
Roughly, the civilizational inadequa
That's fair. FWIW, I don't follow monetary policy very closely, but I usually see people talking about unemployment, price levels, and the general labor force participation rate in these discussions, not prime age labor force participation rate. The Bank of Japan's website has a page called "Outline of monetary policy" and it states,
The Bank of Japan, as the central bank of Japan, decides and implements monetary policy with the aim of maintaining price stability.
Price stability is important because it provides the foundation for the nation's economic acti
The Bank of Japan never carried out the policies that Eliezer favored
I regard this claim as unproven. I think it's clear the Bank of Japan (BOJ) began a new monetary policy in 2013 to greatly increase the money supply, with the intended effect to spur significant inflation. What's unclear to me is whether this policy matched the exact prescription that Eliezer would have suggested; it seems plausible that he would say the BOJ didn't go far enough. "They didn't go far enough" seems a bit different than "they never tested my theory" though.
Perhaps explain your story in more detail. Others might find it interesting.
Yes, monetary policy didn't become loose enough to create meaningful inflation. That doesn't by itself imply that monetary policy didn't become loose, because the theory of inflation here (monetarism) could be wrong. Nonetheless, I think your summary is only slightly misleading.
You could swap in an alternative phrasing that clarifies that I merely demonstrated that the rate of inflation was low, and then the summary would seem adequate to me.
I have one nitpick with your summary.
Now, at time3, you are looking back at Japan's economy and saying that it didn't actually do especially well at that time, and also that its monetary policy never actually became all that loose.
I'm not actually sure whether Japan's monetary policy became substantially looser after 2013, nor did I claim that this did not occur. I didn't look into this question deeply, mostly because when I started looking into it I quickly realized that it might take a lot of work to analyze thoroughly, and it didn't seem like an essential thesis to prove either way.
I previously thought you were saying something very different with (2), since the text in the OP seems pretty different.
FWIW I don't think you're getting things wrong here. I also have simply changed some of my views in the meantime.
That said, I think what I was trying to accomplish with (2) was not that alignment would be hard per se, but that it would be hard to get an AI to do very high-skill tasks in general, which included aligning the model, since otherwise it's not really "doing the task" (though as I said, I don't currently stand by what I wrote in the OP, as-is).
I think I understand my confusion, at least a bit better than before. Here's how I'd summarize what happened.
I had three arguments in this essay, which I thought of as roughly having the following form:
You said that (2) was already answere...
Sorry for replying to this comment 2 years late, but I wanted to discuss this part of your reasoning,
Fwiw, the problem I think is hard is "how to make models do stuff that is actually what we want, rather than only seeming like what we want, or only initially what we want until the model does something completely different like taking over the world".
I think that's what I meant when I said "I think it will be hard to figure out how to actually make models do stuff we want". But more importantly, I think that's how most people will in fact perceive what it ...
I didn't downvote, but I think the comment would have benefitted from specific commentary about which parts were uncivil. There's a lot of stuff in the post, and most of it has pretty neutral language.
Note: I deleted and re-posted this comment since I felt it was missing key context and I was misinterpreting you previously.
What I specifically said is that in isolation, the graph we have been discussing better fits the SMTM hypothesis than your hypothesis. Bringing in a separate graph that you think better supports your hypothesis than SMTM's has zero bearing on the claim that I made, which is exclusively and entirely about the one graph we have been discussing. This new comment with this new graph reads to me as changing the subject, not making a rebutt
ETA: I misinterpreted the above comment. I thought they were talking about the data, rather than the specific graph. See discussion below.
My visual inspection makes me think that, in isolation, the graph better fits the SMTM hypothesis than your hypothesis
And I'm quite confused by that, because of the chart below (and the other ones for different demographic groups). I am not saying that this single fact proves much in isolation. It doesn't disprove SMTM, for sure. But when I read your qualitative description of the shift that we're supposed to find in thi...
I agree that whatever happened in ~1980 could have been a minor part of a longer-term trend, but if it's not, if there was some contamination that put us on a very different trajectory into raging obesity
I agree that there's still some plausible thing that happened in 1980 that was different from the previous trend. There could be, and probably are, multiple causes of the trend of increasing weight over time. And as one trend loses steam, another could have taken over. To the extent that that's what you're saying, I agree.
But I'm still not sure I agree wit...
Conditional on accepting that there are two distinct linear regimes, the first linear regime from at least 1960 to 1976-80 is growing about 3x more slowly than the one from 1976-80 and on.
But we have data going back to the late 19th century, and it demonstrates that weight was increasing smoothly, and at a moderately fast rate, before 1960. That is a crucial piece of evidence. It shows that whatever happened in about 1980 could have simply been a minor part of a longer-term trend. I don't see why we would call that the "start" of the obesity epidemic.
I admit that the data is a bit fuzzy and hard to interpret. But ultimately, we've basically reached the point at which it's hard to tell whether the data supports an abrupt shift, which to me indicates that, even if we find such a shift, it's not going to be that large. The data could very well support a minor acceleration around 1980 (indeed I think this is fairly likely, from looking at the other data).
On the one hand, that means there are some highly interesting questions to explore about what happened around 1980! But on the other hand, I think the dat...
Accepting that obesity rates went up anywhere from 4x to 9x from 1900-1960 (i.e. from 1.5%-3% to 13.4%), I still think we have to explain the "elbow" in the obesity data starting in 1976-80. It really does look "steady around 10%" in the 1960-1976 era, with an abrupt change in 1976. If we'd continued to increase our obesity rates at the rate of 1960-74, we'd have less than 20% obesity today rather than the 43% obesity rate we actually experience. I think that is the phenomenon SMTM is talking about, and I think it's worth emphasizing.
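To make that extrapolation concrete, here is a minimal sketch with assumed round numbers (illustrative only, not the actual survey figures):

```python
# Illustrative extrapolation (assumed figures, not the actual NHANES data):
# suppose obesity was ~10% in 1960 and ~12% in 1974, and extrapolate that
# linear rate forward to 2020.
rate_per_year = (12.0 - 10.0) / (1974 - 1960)        # ~0.14 pp per year
projected_2020 = 12.0 + rate_per_year * (2020 - 1974)
# projected_2020 is under 20%, versus the ~43% actually observed
```

Under these assumptions the pre-1976 trend lands well below 20% today, which is the gap the "elbow" argument is pointing at.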
I think the r...
Let's get a clearer illustration of your point. Here's a graph of the fraction of a normally distributed population above an arbitrary threshold as the population mean varies, ending when the population mean equals the threshold. In obesity terms, we start with a normally distributed population of BMI, and increase the average BMI linearly over time, ending when the average person is obese, and determine at each timepoint what fraction of the population is obese (so 50% obesity at the end).
The graph above has 4 timepoints roughly evenly spaced by the decad...
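The calculation behind that graph can be sketched as follows, using an illustrative standard deviation and BMI range of my own choosing (the parameters are assumptions, not fitted to the data):

```python
from statistics import NormalDist

# Population BMI is modeled as Normal(mean, SIGMA); the mean rises linearly
# from 24 to the obesity threshold of 30. At each step, the obese fraction
# is the upper-tail mass above the threshold. SIGMA is an assumed value.
THRESHOLD = 30.0   # BMI cutoff for obesity
SIGMA = 5.0        # assumed population standard deviation (illustrative)

def obese_fraction(mean, sigma=SIGMA, threshold=THRESHOLD):
    """Fraction of a Normal(mean, sigma) population above the threshold."""
    return 1.0 - NormalDist(mean, sigma).cdf(threshold)

means = [24 + t for t in range(7)]          # mean BMI rising linearly 24 -> 30
fractions = [obese_fraction(m) for m in means]
# The obese fraction starts near 10% and ends at exactly 50%, and it grows
# by larger amounts each step even though the mean rises at a constant rate:
# an apparent "elbow" with no change in the underlying trend.
```

That is the key mechanism: a constant linear shift in the population mean produces an accelerating curve in the fraction past any fixed threshold.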
Some people seem to be hoping that nobody will ever make a misaligned human-level AGI thanks to some combination of regulation, monitoring, and enlightened self-interest. That story looks more plausible if we’re talking about an algorithm that can only run on a giant compute cluster containing thousands of high-end GPUs, and less plausible if we’re talking about an algorithm that can run on one 2023 gaming PC.
Isn't the relevant fact whether we could train an AGI with modest computational resources, not whether we could run one? If training runs are curtail...
That makes sense. However, Davinci-003 came out just a few days prior to ChatGPT. The relevant transition was from Davinci-002 to Davinci-003/ChatGPT.
[edit: this says the same thing as Quintin's sibling comment]
Important context for those who don't know it: the main difference between text-davinci-002 and text-davinci-003 is that the latter was trained with PPO against a reward model, i.e. RLHF as laid out in the InstructGPT paper. (Source: OpenAI model index.)
In more detail, text-davinci-002 seems to have been trained via supervised fine-tuning on the model outputs which were rated highest by human reviewers (this is what the model index calls FeedME). The model index only says that text-davinci-003 wa...
Yep, and text-davinci-002 was trained with supervised finetuning / written demos, while 003 was trained with RLHF via PPO. Hypothetically, the clearest illustration of RLHF's capabilities gains should be from comparing 002 to 003. However, OpenAI could have also used other methods to improve 003, such as with Transcending Scaling Laws with 0.1% Extra Compute.
This page also says that:
Our models generally used the best available datasets at the time of training, and so different engines using the same training methodology might be trained on different data.
I don't think this is right -- the main hype effect of chatGPT over previous models feels like it's just because it was in a convenient chat interface that was easy to use and free.
I don't have extensive relevant expertise, but as a personal datapoint: I used Davinci-002 multiple times to generate an interesting dialogue in order to test its capabilities. I ran several small-scale Turing tests, and the results were quite unimpressive in my opinion. When ChatGPT came out, I tried it out (on the day of its release) and very quickly felt that it was qualitati...
I think that even the pseudo-concrete "block progress in Y," for Y being compute, or data, or whatever, fails horribly at the concreteness criterion needed for actual decision making. [...] What the post does do is push for social condemnation for "collaboration with the enemy" without concrete criteria for when it is good or bad
There are quite specific things I would not endorse that I think follow from the post relatively smoothly. Funding the lobbying group mentioned in the introduction is one example.
I do agree though that I was a bit vague in my su...
But for the record, the workers do deserve to be paid for the value of the work that was taken.
I have complicated feelings about this issue. I agree that, in theory, we should compensate people harmed by beneficial economic restructuring, such as innovation or free trade. Doing so would ensure that these transformations leave no one strictly worse off, turning a mere Kaldor-Hicks improvement into a Pareto improvement.
On the other hand, I currently see no satisfying way of structuring our laws and norms to allow for such compensation fairly, or in a way th...
These numbers were based on the TAI timelines model I built, which produced a highly skewed distribution. I also added several years to the timeline due to anticipated delays and unrelated catastrophes, and some chance that the model is totally wrong. My inside view prediction given no delays is more like a median of 2037 with a mode of 2029.
I agree it appears the mode is much too near, but I encourage you to build a model yourself. I think you might be surprised at how much sooner the mode can be compared to the median.
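As a toy illustration of how far apart the mode and median can sit (this is not my actual timelines model, just a generic right-skewed distribution with made-up parameters):

```python
from math import exp

# For a lognormal distribution, the mode sits well before the median when
# the spread is large. Parameters below are illustrative only.
mu, sigma = 3.0, 1.0           # parameters of log(X)
median = exp(mu)               # e^mu        ~ 20 "years from now"
mode = exp(mu - sigma**2)      # e^(mu - s^2) ~ 7 "years from now"
# A heavily right-skewed distribution places its single most likely arrival
# year far earlier than its median arrival year.
```

With sigma = 1, the median is a factor of e (about 2.7x) later than the mode, which is roughly the kind of gap between a 2029 mode and a 2037 median.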
I played with davinci, text-davinci-002, and text-davinci-003, if I recall correctly. The last model had only been out for a few days at most, however, before ChatGPT was released.
Of course, I didn't play with any of these models in enough detail to become an expert prompt engineer. I mean, otherwise I would have made the update sooner.
Agreed. Taxing or imposing limits on GPU production and usage is also the main route through which I imagine we might regulate AI.
Congratulations. However, unless I'm mistaken, you simply said you'd be open to taking the bet. We didn't actually take it with you, did we?