Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is a thread for displaying your timeline until human-level AGI.

Every answer to this post should be a forecast. In this case, a forecast showing your AI timeline.

For example, here are Alex Irpan’s AGI timelines.

The green distribution is his prediction from 2015, and the orange distribution is his 2020 update (based on this post).

For extra credit, you can:

  • Say why you believe it (what factors are you tracking?)
  • Include someone else's distribution who you disagree with, and speculate as to the disagreement


How to make a distribution using Elicit

  1. Go to this page.
  2. Enter your beliefs in the bins.
    1. Specify an interval using the Min and Max bin, and put the probability you assign to that interval in the probability bin.
    2. For example, if you think there's a 50% probability of AGI before 2050, you can leave Min blank (it will default to the Min of the question range), enter 2050 in the Max bin, and enter 50% in the probability bin.
    3. The minimum of the range is January 1, 2021, and the maximum is January 1, 2100. You can assign probability above January 1, 2100 (which also includes 'never') or below January 1, 2021 using the Edit buttons next to the graph.
  3. Click 'Save snapshot' to save your distribution to a static URL.
    1. A timestamp will appear below the 'Save snapshot' button. This links to the URL of your snapshot.
    2. Make sure to copy it before refreshing the page; otherwise it will disappear.
  4. Copy the snapshot timestamp link and paste it into your LessWrong comment.
    1. You can also add a screenshot of your distribution using the instructions below.


How to overlay distributions on the same graph

  1. Copy your snapshot URL.
  2. Paste it into the Import snapshot via URL box on the snapshot you want to compare your prediction to (e.g. the snapshot of Alex's distributions).
  3. Rename your distribution to keep track.
  4. Take a new snapshot if you want to save or share the overlaid distributions.


How to add an image to your comment

  • Take a screenshot of your distribution
  • Then do one of two things:
    • If you have beta-features turned on in your account settings, drag-and-drop the image into your comment
    • If not, upload it to an image hosting service, then write the following markdown syntax for the image to appear, with the url appearing where it says ‘link’: ![](link)
  • If it worked, you will see the image in the comment before hitting submit.


If you have any bugs or technical issues, reply to Ben (here) in the comment section.


Top Forecast Comparisons

Here is a snapshot of the top voted forecasts from this thread, last updated 8/25/20. You can click the dropdown box near the bottom right of the graph to see the bins for each prediction.

Here is a comparison of the forecasts as a CDF:


Here is a mixture of the distributions on this thread, weighted by normalized votes (last updated 8/23/20). The median is January 17, 2049. You can click the Interpret tab on the snapshot to see more percentiles.
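
As a rough illustration of how a vote-weighted mixture like this can be computed, here is a minimal Python sketch. The two forecasts, their vote counts, and all function names are hypothetical; they are not taken from this thread or from Elicit's implementation. Each forecast is assumed, for simplicity, to be uniform over a range of years.

```python
def normalize(votes):
    """Turn raw vote counts into weights that sum to 1."""
    total = sum(votes)
    return [v / total for v in votes]


def uniform_cdf(start, end):
    """CDF of a forecast that is uniform between two years (an assumption)."""
    def cdf(year):
        if year <= start:
            return 0.0
        if year >= end:
            return 1.0
        return (year - start) / (end - start)
    return cdf


def mixture_cdf(cdfs, weights, year):
    """Mixture CDF: the weighted average of the individual CDFs."""
    return sum(w * cdf(year) for cdf, w in zip(cdfs, weights))


def mixture_median(cdfs, weights, lo=2020.0, hi=2100.0):
    """Find the year where the mixture CDF crosses 0.5, by bisection."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if mixture_cdf(cdfs, weights, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical forecasts and vote counts, for illustration only.
forecasts = [uniform_cdf(2030, 2050), uniform_cdf(2040, 2100)]
weights = normalize([30, 10])
median_year = mixture_median(forecasts, weights)
```

The higher-voted forecast dominates the mixture, so the median lands much closer to its range than to the lower-voted one.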



14 Answers

A week ago I recorded a prediction on AI timelines after reading a Vox article on GPT-3. In general I'm much more spread out in time than the LessWrong community. Also, I weigh outside view considerations more heavily than detailed inside view information. For example, a main consideration of my prediction is the heuristic "With 50% probability, things will last twice as long as they already have," with a starting time of 1956, the year of the Dartmouth College summer AI conference.
If AGI will definitely happen eventually, then the heuristic gives us [21.3, 64, 192] years at the [25th, 50th, 75th] percentiles for the time until AGI occurs. AGI may never happen, but the chance of that is small enough that adjusting for it here will not make a big difference (I put ~10% that AGI will not happen for 500 years or more, but it already matches that distribution quite well).
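
One reading of the heuristic that reproduces these numbers: if something has lasted `elapsed` years, the probability that it ends within the next `t` years is `t / (t + elapsed)`. The sketch below inverts that formula. The function name is hypothetical, and 64 years elapsed assumes the forecast was made in 2020.

```python
def remaining_years_quantile(elapsed_years, q):
    """Years remaining at quantile q, assuming P(remaining <= t) = t / (t + elapsed)."""
    return elapsed_years * q / (1.0 - q)


elapsed = 2020 - 1956  # years since the Dartmouth conference (assumes a 2020 forecast)
quantiles = [remaining_years_quantile(elapsed, q) for q in (0.25, 0.50, 0.75)]
rounded = [round(years, 1) for years in quantiles]
```

At the median, the remaining time equals the elapsed time, which is the "twice as long as they already have" statement.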

A more inside view consideration is: what happens if the current machine learning paradigm scales to AGI? Given that assumption, a 50% confidence interval might be [2028, 2045] (since the current burst of machine learning research began in 2012-2013), which is more in line with the LessWrong predictions and the Metaculus community prediction. Taking the super outside view consideration and the outside view-ish consideration together, I get the prediction I made a week ago.

I adapted my prediction to the timeline of this post [1], and compared it with some other commenters' predictions [2].

Here is my Elicit Snapshot.

I'll follow the definition of AGI given in this Metaculus challenge, which roughly amounts to a single model that can "see, talk, act, and reason." My predicted distribution is a weighted sum of two component distributions described below:

  1. Prosaic AGI (25% probability). Timeline: 2024-2037 (Median: 2029): We develop AGI by scaling and combining existing techniques. The most probable path I can foresee loosely involves 3 stages: (1) developing a language model with human-level language ability, then (2) giving it visual capabilities (e.g., talking about pictures and videos, solving SAT math problems with figures), and then (3) giving it capabilities to intelligently act in the world (e.g., trading stocks or navigating webpages). Below are my timelines for the above stages:
    1. Human-level Language Model: 1.5-4.5 years (Median: 2.5 years). We can predictably improve our language models by increasing model size (parameter count), which we can do in the following two ways:
      1. Scaling Language Model Size by 1000x relative to GPT3. 1000x is pretty feasible, but we'll hit difficult hardware/communication bandwidth constraints beyond 1000x as I understand.
      2. Increasing Effective Parameter Count by 100x using modeling tricks (Mixture of Experts, Sparse Transformers, etc.)
    2. +Visual Capabilities: 2-6 extra years (Median: 4 years). We'll need good representation learning techniques for learning from visual input (which I think we mostly have). We'll also need to combine vision and language models, but there are many existing techniques for combining vision and language models to try here, and they generally work pretty well. A main potential bottleneck time-wise is that the language+vision components will likely need to be pretrained together, which slows the iteration time and reduces the number of research groups that can contribute (especially for learning from video, which is expensive). For reference, Language+Image pretrained models like ViLBERT came out 10 months after BERT did.
    3. +Action Capabilities: 0-6 extra years (Median: 2 years). GPT3-style zero-shot or few-shot instruction following is the most feasible/promising approach to me here; this approach could work as soon as we have a strong, pretrained vision+language model. Alternatively, we could use that model within a larger system, e.g. a policy trained with reinforcement learning, but this approach could take a while to get to work.
  2. Breakthrough AGI (75% probability). Timeline: Uniform probability over the next century: We need several, fundamental breakthroughs to achieve AGI. Breakthroughs are hard to predict, so I'll assume a uniform distribution that we'll hit upon the necessary breakthroughs at any year <2100, with 15% total probability mass after 2100 (a rough estimate); I'm estimating 15% roughly based on a 5% probability that we won't find the right insights by 2100, 5% probability that we have the right insights but not enough compute by 2100, and 5% probability to account for planning fallacy, unknown unknowns, and the fact that a number of top AI researchers believe that we are very far from AGI.
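
As a quick consistency check on the Prosaic AGI timeline, the sketch below simply adds up the low, median, and high estimates of the three stages. The start date of roughly August 2020 is an assumption based on when this thread was active, and the names are hypothetical. Note that adding medians only approximates the median of the total, so this is a sanity check rather than a derivation.

```python
# (low, median, high) duration of each stage in years, as listed above.
STAGES = {
    "language": (1.5, 2.5, 4.5),
    "vision": (2.0, 4.0, 6.0),
    "action": (0.0, 2.0, 6.0),
}
START_YEAR = 2020.6  # assumption: forecast made around August 2020


def stage_totals(stages):
    """Sum the low, median, and high estimates across all stages."""
    lows, medians, highs = zip(*stages.values())
    return sum(lows), sum(medians), sum(highs)


totals = stage_totals(STAGES)
years = [round(START_YEAR + t) for t in totals]
```

Under these assumptions the totals line up with the stated 2024-2037 range and 2029 median.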

My probability for Prosaic AGI is based on an estimated probability of each of the 3 stages of development working (described above):

P(Prosaic AGI) = P(Stage 1) x P(Stage 2) x P(Stage 3) = 3/4 x 2/3 x 1/2 = 1/4
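
The product can be checked with exact fractions. This sketch treats each stage's probability as conditional on the previous stage working, which is an assumption about how the estimates are meant to combine; the variable names are hypothetical.

```python
from fractions import Fraction

# Estimated probability that each stage works, given the previous one did.
stage_success = [Fraction(3, 4), Fraction(2, 3), Fraction(1, 2)]

p_prosaic = Fraction(1)
for p in stage_success:
    p_prosaic *= p

# The remaining probability mass goes to the Breakthrough AGI scenario.
p_breakthrough = 1 - p_prosaic
```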


Updates/Clarification after some feedback from Adam Gleave:

  • Updated from 5% -> 15% probability that AGI won't happen by 2100 (see reasoning above). I've updated my Elicit snapshot appropriately.
  • There are other concrete paths to AGI, but I consider these fairly low probability to work first (<5%) and experimental enough that it's hard to predict when they will work. For example, I can't think of a good way to predict when we'll get AGI from training agents in a simulated, multi-agent environment (e.g., in the style of OpenAI's Emergent Tool Use paper). Thus, I think it's reasonable to group such other paths to AGI into the "Breakthrough AGI" category and model these paths with a uniform distribution.
  • I think you can do better than a uniform distribution for the "Breakthrough AGI" category, by incorporating the following information:
    • Breakthroughs will be less frequent as time goes on, as the low-hanging fruit/insights are picked first. Adam suggested an exponential decay over time / Laplacian prior, which sounds reasonable.
    • Growth of AI research community: Estimate the size of the AI research community at various points in time, and estimate the pace of research progress given that community size. It seems reasonable to assume that the pace of progress will increase logarithmically in the size of the research community, but I can also see arguments for why we'd benefit more or less from a larger community (or even have slower progress).
    • Growth of funding/compute for AI research: As AI becomes increasingly monetizable, there will be more incentives for companies and governments to support AI research, e.g., in terms of growing industry labs, offering grants to academic labs to support researchers, and funding compute resources - each of these will speed up AI development.

Here's my answer. I'm pretty uncertain compared to some of the others!

AI Forecast

First, I'm assuming that by AGI we mean an agent-like entity that can do the things associated with general intelligence, including things like planning towards a goal and carrying that out. If we end up in a CAIS-like world where there is some AI service or other that can do most economically useful tasks, but nothing with very broad competence, I count that as never developing AGI.

I've been impressed with GPT-3, and could imagine it or something like it scaling to produce near-human level responses to language prompts in a few years, especially with RL-based extensions.

But, following the list (below) of missing capabilities by Stuart Russell, I still think things like long-term planning would elude GPT-N, so it wouldn't be agentive general intelligence. Even though you might get those behaviours with trivial extensions of GPT-N, I don't think it's very likely.

That's why I think AGI before 2025 is very unlikely (not enough time for anything except scaling up of existing methods). This is also because I tend to expect progress to be continuous, though potentially quite fast, and going from current AI to AGI in less than 5 years requires a very sharp discontinuity.

AGI before 2035 or so happens if systems quite a lot like current deep learning can do the job, but which aren't just trivial extensions of them - this seems reasonable to me on the inside view - e.g. it takes us less than 15 years to take GPT-N and add layers on top of it that handle things like planning and discovering new actions. This is probably my 'inside view' answer.

I put a lot of weight on a tail peaking around 2050 because of how quickly we've advanced up this 'list of breakthroughs needed for general intelligence' -

There is this list of remaining capabilities needed for AGI in an older post I wrote, with the capabilities of 'GPT-6' as I see them underlined:

Stuart Russell’s List

human-like language comprehension

cumulative learning

discovering new action sets

managing its own mental activity

For reference, I’ve included two capabilities we already have that I imagine being on a similar list in 1960

perception and object recognition

efficient search over known facts

So we'd have discovering new action sets, and managing mental activity - effectively, the things that facilitate long-range complex planning, remaining.

So (very oversimplified) if around the 1980s we had efficient search algorithms, by 2015 we had image recognition (basic perception) and by 2025 we have language comprehension courtesy of GPT-8, that leaves cumulative learning (which could be obtained by advanced RL?), then discovering new action sets and managing mental activity (no idea). It feels a bit odd that we'd breeze past all the remaining milestones in one decade after it took ~6 to get to where we are now. Say progress has sped up to be twice as fast, then it's 3 more decades to go. Add to this the economic evidence from things like Modelling the Human Trajectory, which suggests a roughly similar time period of around 2050.

Finally, I think it's unlikely but not impossible that we never build AGI and instead go for tool AI or CAIS, most likely because we've misunderstood the incentives such that it isn't actually economical or agentive behaviour doesn't arise easily. Then there's the small (few percent) chance of catastrophic or existential disaster which wrecks our ability to invent things. This is the one I'm most unsure about - I put 15% for both but it may well be higher.

Here's my quick forecast, to get things going. Probably if anyone asks me questions about it I'll realise I'm embarrassed by it and change it.


It has three buckets:

10%: We get to AGI with the current paradigm relatively quickly without major bumps.

60%: We get to it eventually sometime in the next ~50 years.

30%: We manage to move into a stable state where nobody can unilaterally build an AGI, then we focus on alignment for as long as it takes before we build it.

2nd attempt

Adele Lopez is right that 30% is super optimistic. Also I accidentally put a bunch within '2080-2100', instead of 'after 2100'. And also I thought about it more. Here's my new one.

My distribution is the fat blue one.


It has four buckets:

20% Current work leads directly into AI in the next 15 years.

55% There are some major bottlenecks, new insights needed, and some engineering projects comparable in size to the Manhattan Project. This is 2035 to 2070.

10% This is to fill out 2070 to 2100.

15% We manage to move to a stable state, or alternatively civilizational collapse / non-AI x-risk stops AI research. This is beyond 2100.
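
To show how buckets like these translate into a cumulative distribution, here is a minimal Python sketch. It assumes probability is spread uniformly within each bucket and that the first bucket starts in 2020; neither assumption is stated above, and the function names are hypothetical.

```python
# (start year, end year, probability mass) for the first three buckets.
BUCKETS = [
    (2020, 2035, 0.20),
    (2035, 2070, 0.55),
    (2070, 2100, 0.10),
]
AFTER_2100 = 0.15  # stable state, collapse, or otherwise beyond 2100


def cdf(year):
    """Probability of AGI by `year`, assuming uniform mass within each bucket."""
    total = 0.0
    for start, end, mass in BUCKETS:
        if year >= end:
            total += mass
        elif year > start:
            total += mass * (year - start) / (end - start)
    return total


def quantile(q):
    """Year at which the CDF reaches q, or None if q falls after 2100."""
    cumulative = 0.0
    for start, end, mass in BUCKETS:
        if q <= cumulative + mass:
            return start + (end - start) * (q - cumulative) / mass
        cumulative += mass
    return None


median_year = quantile(0.5)
```

Under these assumptions the median falls in the mid-2050s, inside the second bucket.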

Here is my snapshot. My reasoning is basically similar to Ethan Perez's; it's just that I think that if transformative AI is achievable in the next five orders of magnitude of compute improvement (e.g. prosaic AGI?), it will likely be achieved in the next five years or so. I also am slightly more confident that it is, and slightly less confident that TAI will ever be achieved.

I am aware that my timelines are shorter than most... Either I'm wrong and I'll look foolish, or I'm right and we're doomed. Sucks to be me.
[Edited the snapshot slightly on 8/23/2020]
[Edited to add the following powerpoint slide that gets a bit more at my reasoning]

My rough take:

3 buckets, similar to Ben Pace's 

  1. 5% chance that current techniques just get us all the way there, e.g. something like GPT-6 is basically AGI
  2. 10% chance AGI doesn't happen this century, e.g. humanity sort of starts taking this seriously and decides we ought to hold off + the problem being technically difficult enough that small groups can't really make AGI themselves
  3. 50% chance that something like current techniques and some number of new insights gets us to AGI. 

If I thought about this for 5 additional hours, I can imagine assigning the following ranges to the scenarios:

  1. [1, 25]
  2. [1, 30]
  3. [20, 80]

Roughly my feelings:

Reasoning: I think lots of people have updated too much on GPT-3, and that the current ML paradigms are still missing key insights into general intelligence. But I also think enough research is going into the field that it won't take too long to reach those insights.

Here's my prediction:

To the extent that it differs from others' predictions, probably the most important factor is that I think even if AGI is hard, there are a number of ways in which human civilization could become capable of doing almost arbitrarily hard things, like through human intelligence enhancement or sufficiently transformative narrow AI. I think that means the question is less about how hard AGI is and more about general futurism than most people think. It's moderately hard for me to imagine how business as usual could go on for the rest of the century, but who knows.


I (a non-expert) heavily speculate the following scenario for an AGI based on Transformer architectures:

The scaling hypothesis is likely correct (and is the majority of the probability density for the estimate), and maybe only two major architectural breakthroughs are needed before AGI. The first is a functioning memory system capable of handling short and long term memories with lifelong learning without the problems of fine tuning.

The second architectural breakthrough needed would be allowing the system to function in an 'always on' kind of fashion. For example, current transformers get an input, then spit out an output, and are done, whereas a human can receive an input, output a response, and then keep running, seeing the result of their own output. I think an 'always on' functionality will allow for multi-step reasoning, and functional 'pruning' as opposed to 'babble'. As an example of what I mean, think of a human carefully writing a paragraph and iterating and fixing/rewriting past work as they go, rather than the output just being their stream of consciousness. Additionally, it could allow a system to not have to store all information within its own mind, but rather use tools to store information externally. Getting an output that has been vetted for release rather than a thought stream seems very important for high quality.

Additionally, I think functionality such as agent behavior and self-awareness only requires embedding an agent in a training environment simulating a virtual world and its interactions (See ). I think this may be the most difficult to implement, and there are uncertainties. For example, does all training need to take place within this environment? Or is only an additional training run necessary, after it has been trained like current systems?

I think such a system utilizing all the above may be able to introspectively analyse its own knowledge/model gaps and actively research to correct them. I think that could cause a discontinuous jump in capabilities.

I think that none of those capabilities/breakthroughs seem out of reach this decade, and that scaling will continue to quadrillions of parameters by the end of the decade (in addition to continued efficiency improvements).

I hope an effective control mechanism can be found by then. (Assuming any of this is correct. 5 months ago I would have laughed at this.)

Here is a link to my forecast

AGI Timeline

And here are the rough justifications for this distribution:

I don't have much else to add beyond what others have posted, though it's in part influenced by an AIRCS event I attended in the past.  Though I do remember being laughed at for suggesting GPT-2 represented a very big advance toward AGI.  

I've also never really understood the insistence that current models of AI are incapable of AGI. Sure, we don't have AGI with current models, but how do we know it isn't a question of scale? Our brains are quite efficient, but their total energy consumption is comparable to that of a light bulb. I find it very hard to believe that a server farm in an Amazon, Microsoft, or Google datacenter would be incapable of running the final AGI algorithm. And for all the talk of the complexity in the brain, each neuron is agonizingly slow (200-300Hz).

That's also to say nothing of the fact that the vast majority of brain matter is devoted to sensory processing.  Advances in autonomous vehicles are already proving that isn't an insurmountable challenge.  

Current AI models are performing very well at pattern recognition.  Isn't that most of what our brains do anyway?  

Self-attending recurrent transformer networks with some improvements to memory (attention context) access and recall look, to me, very similar to our own brain. What am I missing?

My snapshot:

Idk what we mean by "AGI", so I'm predicting when transformative AI will be developed instead. This is still a pretty fuzzy target: at what point do we say it's "transformative"? Does it have to be fully deployed and we already see the huge economic impact? Or is it just the point at which the model training is complete? I'm erring more on the side of "when the model training is complete", but also there may be lots of models contributing to TAI, in which case it's not clear which particular model we mean. Nonetheless, this feels a lot more concrete and specific than AGI.

Methodology: use a quantitative model, and then slightly change the prediction to account for important unmodeled factors. I expect to write about this model in a future newsletter.

My estimate is very different from what others suggest and this stems from my background and my definition of AGI. I see AGI as human-level intelligence. If we present a problem to an AGI system, we would expect that it does not make any "silly" mistakes, but that it makes reasonable responses like a competent human would do.

My background: I work in deep learning on very large language models. I worked on the parallelization of deep learning in the past. I also have in-depth knowledge of GPUs and accelerators in general. I developed the fastest algorithm for top-k sparse-sparse matrix multiplication on a GPU.

I wrote about this 5 years ago, but since then my opinion has not changed: I believe that AGI will be physically impossible with classical computers.

It is very clear that intelligence is all about compute capabilities. The intelligence of mammals is currently limited by energy intake, including the intelligence of humans. I believe that the same is true for AI algorithms, and these patterns seem to be very clear if you look at the trajectory of compute requirements over the past years.

The main issues are these: You cannot build structures smaller than atoms; you can only dissipate a certain amount of heat per unit area; the smaller the structures are that you print with lasers, the higher the probability of artifacts; light can only go a certain distance per second; the speed of SRAM scales sub-linearly with its size. These are all hard physical boundaries that we cannot alter. Yet all these physical boundaries will be hit within a couple of years, and we will fall very, very far short of human processing capabilities, and our models will not improve much further. Two orders of magnitude of additional capability are realistic, but anything beyond that is just wishful thinking.

You may say it is just a matter of scale. Our hardware will not be as fast as brains but we just build more of them. Well, the speed of light is fixed and networking scales abysmally. With that, you have a maximum cluster size that you cannot exceed without losing processing power. The thing is, even in current neural networks, doubling the number of GPUs sometimes doubles training time. We will design neural networks that scale better, such as mixtures of experts, but this will not be nearly enough (this will give us another 1-2 orders of magnitude).

We will switch to wafer-scale compute in the next few years and this will yield a huge boost in performance, but even wafer-scale chips will not yield the long-term performance that we need to get anywhere near AGI.

The only real possibility that I see is quantum computing. It is currently not clear what quantum AI algorithms would look like, but we seem to be able to scale quantum computers doubly exponentially over time (aka Neven's Law). The real question is whether quantum error correction also scales exponentially (the current data suggests this) or whether it can scale sub-exponentially. If it scales exponentially, quantum computers will be interesting for chemistry, but they would be useless for AI. If it scales sub-exponentially, we will hit quantum computers that are faster than classical computers in 2035. Due to doubly exponential scaling, the quantum computers in 2037 would be an unbelievable amount more powerful than all classical computers before. We might still not be able to reach AGI because we cannot feed such a computer data quickly enough to keep up with its processing power, but I am fairly confident we could figure something out to feed a single powerful quantum computer. Currently, the input requirements for pretraining large deep learning models are minimal for natural language but high for images and videos. As such, we might still not have AGI with quantum computers, but we would have computers with excellent natural language processing capabilities.
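
To give a feel for why doubly exponential scaling would make a machine from two years later "an unbelievable amount more powerful", the sketch below compares ordinary exponential growth with doubly exponential growth over a few steps. The base of 2 and the unit of one step are illustrative assumptions, not figures from Neven's Law or from the forecast above.

```python
def exponential(step):
    """Ordinary exponential growth: doubles at every step."""
    return 2 ** step


def doubly_exponential(step):
    """Doubly exponential growth: the exponent itself doubles at every step."""
    return 2 ** (2 ** step)


# Relative capability after each step, under both growth models.
table = [(step, exponential(step), doubly_exponential(step)) for step in range(6)]
```

After five steps the exponential curve has grown 32-fold, while the doubly exponential curve has grown by a factor of more than four billion.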

If quantum computers do not work, I would predict that we will never reach AGI, hence the probability is zero after the peak in 2035-2037.

If the definition of AGI is relaxed so that models are allowed to make "stupid mistakes" and the requirement is just that they on average solve problems better than humans, I would be pretty much in line with Ethan's predictions (Ethan and I chatted about this before).

Edit: A friend told me that it is better to put my probability after 2100 if I believe it is not happening after the peak 2037. Updated the graph.

Here is a quick approximation of mine. I want more powerful Elicit features to make it easier to translate from sub-problem beliefs to a final probability distribution. Without taking the time to write code, I had to intuit some translations.