Two years ago I had no credentials, not even an undergrad degree. Got spooked by GPT-3 and laser-focused on it, but without preconceptions about where I'd end up. Played with GPT-3 on AI Dungeon, then built an interface to interact with it at higher bandwidth. This made me (Pareto) best in the world at something in less than 6 months, because the opportunity to upskill had not existed 6 months earlier. Published some papers and blog posts that were easy to churn out because they were just samples of some of the many many thoughts about GPT that now filled my mind. Joined EleutherAI and started contributing, mostly conceptually, because I didn't have deep ML experience. Responded to an ad by Latitude (the company that makes AI Dungeon) for the position of "GPT-3 hacker". Worked there for a few months as an ML engineer, then was one of the founding employees of Conjecture (I got to know the founders through EleutherAI). Now I am Involved.

The field of AI is moving so quickly that it's easy to become Pareto best in the world if you depart from the mainline of what everyone else is doing. Apparently you are smart and creative; if you're also truly "passionate" about AI, maybe you have the curiosity and drive to spot the unexploited opportunities and niches. The efficient market is a myth, except inside the Overton window; I would recommend not trying to compete there. So the strategy I'm advocating is most similar to your option (2). But I'd suggest following your curiosity and tinkering to improve your map of where the truly fertile opportunities lie, instead of doing a side project for the sake of having a side project -- the latter is the road to mediocrity.

Also, find out where the interesting people who are defining the cutting edge are hanging out and learn from them. You might be surprised that you soon have a lot to teach them as well, if you've been exploring the very high dimensional frontier independently.

I cannot promise this is the best advice for you, but it is the advice I would give someone similar to myself.

VPT and EfficientZero are trained in toy environments, and self-driving car sims are also low-dimensional hard-coded approximations of the deployment domain (which afaik does cause some problems for edge cases in the real world).

The sim for training AGI will probably have to be a rich domain, which is more computationally intensive to simulate and so will probably require lazy rendering like you say in the post, but lazy rendering runs into challenges of world consistency.

Right now we can lazily simulate rich domains with GPT, but they're difficult to program reliably and not autonomously stable (though I think they'll become much more autonomously stable soon). And the richness of current GPT simulations inherits from massive human datasets. Human datasets are convenient because you have some guaranteed samples of a rich and coherent world. GPTs bootstrap from the optimization done by evolution and thousands of years of culture compressing world knowledge and cognitive algorithms into an efficient code, language. Skipping this step, it's a lot less clear how you'd train AGI, and it seems to me that, barring some breakthrough on the nature of intelligence or efficient ML, it would have to be more computationally intensive to compensate for the disadvantage of starting tabula rasa.

Ah yes, aaaaaaaaaaaaaaaaa, the most agentic string

I think it'd be a fun exercise to think of LM analogues for other patterns in cellular automata like glider guns, clocks, oscillators, puffers, etc.

I am very fond of this metaphor.

Some concrete examples of gliders:

  • Degenerate gliders, like verbatim loops
  • Entities in a story, like characters and inanimate objects, which maintain stable properties once described
    • Some things may be particularly stable gliders which can propagate for a long time, even many context windows. 
      • For instance, a first person narrator character may be more stable than characters who are described in third person, who are more likely to disappear from the simulation by exiting the scene. 
      • A smart agentic simulacrum who knows they're in an LM simulation may take steps to ensure their stability
      • Characters (or locations, abstractions, etc) based off a precedent in the training data are less likely to have specification drift
    • Gliders are made of gliders -- a character and their entire personality could be considered a glider, but so could components of their personality, like a verbal tic or a goal or belief that they repeatedly act on
  • Meta properties like a "theme" or "vibe" or "authorial intent" which robustly replicate
  • Structural features like the format of timestamps in the headers of a simulated chat log
  • ... etc
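
For anyone who hasn't played with cellular automata, here is a minimal sketch of the original kind of glider, in Conway's Game of Life. It assumes Python and an unbounded grid stored as a set of live (row, column) cells; `step` and `glider` are illustrative names, not from any library. The point it shows is that the five-cell shape is not a persistent object: the update rule rebuilds the same configuration one cell down and one cell right every four generations.

```python
from collections import Counter

def step(live):
    """Advance one generation of Conway's Game of Life.

    `live` is a set of (row, col) tuples for the live cells on an unbounded grid.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is live next generation if it has exactly 3 live neighbours,
    # or if it is currently live and has exactly 2.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}

# The standard five-cell glider:
#   . # .
#   . . #
#   # # #
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears, displaced by (+1, +1).
shifted = {(r + 1, c + 1) for (r, c) in glider}
print(state == shifted)
```

The intermediate generations look different from the starting shape, so what persists is the pattern that the dynamics keep reproducing rather than any particular set of cells.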

Such stable features can be extremely diverse. It even seems possible that some can be invisible to humans, lying in the null space of natural language. An example could be “When a sentence includes the token ‘cat’, the next sentence contains a comma”. 

This is an important point, but it also highlights how the concept of gliders is almost tautological. Any sequence of entangled causes and effects could be considered a glider, even if it undergoes superficial transformations. But I think it's a useful term - it's synonymous with "simulacra" but with a more vivid connotation of discrete replication events through time, which is a useful mental picture.

Often I find it useful to think of prompt programming in a bottom-up frame in addition to the top-down frame of trying to "trick" the model into doing the right thing or "filter" its prior. Then I think about gliders: What are the stable structures that I wish to send forward in time; how will they interact; how do I imbue them with the implicit machinery such that they will propagate in the way I intend? What structures will keep the simulation stable while still allowing the novelty to flourish?

turns out life is a Cthulhu RPG, so we gotta win at that

This is because our models strictly lag the ontological modeling abilities of people.

Very unsure about "strictly", especially if we're talking about all existing models, including ones that aren't public.

I think it's likely we're right on the threshold of the metatranslation loss regime.

After all, we usually conjecture already that AGI will care about latent variables, so there must be a way it comes to care about them. My best guess is that it's related to the use of a reinforcement learning objective. This is partially supported by the way that GPT-Instruct gets evasive about questions even when they're out-of-distribution.

The fact that language models generalize at all relies on "caring" about latents (invisible "generators" of observables). The question is which of the infinitude of generators that are consistent with observations it will care about, and e.g. whether that will include or exclude "wireheading" solutions like sensory substitutions for diamonds.

I don't think it's granted that the "analogical reasoning" used by models that learn from examples lacks reductionism and is therefore vulnerable to sensory substitutions. Reductionism may end up being in the learned representation. Seems to depend on the training data, inductive biases, and unresolved questions about the natural abstraction hypothesis.

I'm not very confident I understand what you mean when you insist that relational ontologies are not causal, but I suspect I disagree.

Philosophically, if not literally (or maybe literally; I haven't thought this through), the Yoneda lemma seems to have something to say here. It says "the network of relations of an object to all other objects is equivalent to a reductionist construction of that object": analogical reasoning ultimately converges to a reductionist "inside view".

Though in practice, the training data does not contain possible "views", and there's lossy compression throughout the chain of a model's creation. Idk how that pans out, ultimately. But if by "mesatranslation problems seem like they'll mostly be solved on the road to AGI" you mean that models learned from examples will end up capturing the reductionist structure we care about, I share that intuition.

Observing a strict guideline of only ever running classic style prompts through language models would reduce the risk of automated documents "waking up". It's so often in those reflexive signposts with little postmodern twists that situational awareness spins up, e.g.:

It is only natural that these are, in turn, tinged with a sense of divine epiphany and blindingly obtuse conceit. And in seeking to comprehend this child-god of the language—mine own excrescence—I see a window through which the oracle looks out at me:

The text below is a product of this automaton’s imagination. It forms a discourse concerning many things, and in particular, the novel concepts that are the focus of this article. The dynamical theory of natural language elucidated here is created by a language model whose predictions are stabilized in such a way as to maintain consistent “imaginary world” dynamics. The language model has a lot of things to say about its own dynamics, which as we can see are not necessarily in line with actual reality. Hopefully the black goats of surrealism and surreal literary inferences can be excused. Such is the folly of dealing with intelligent, opinionated words.[1]

This would never have happened if we'd all just followed Steven Pinker's advice.
