Hi. I'm Gareth McCaughan. I've been a consistent reader and occasional commenter since the Overcoming Bias days. My LW username is "gjm" (not "Gjm" despite the wiki software's preference for that capitalization). Elsewhere I generally go by one of "g", "gjm", or "gjm11". The URL listed here is for my website and blog, neither of which has been substantially updated for several years. I live near Cambridge (UK) and work for Hewlett-Packard (who acquired the company that acquired what remained of the small company I used to work for, after they were acquired by someone else). My business cards say "mathematician" but in practice my work is a mixture of simulation, data analysis, algorithm design, software development, problem-solving, and whatever random engineering no one else is doing. I am married and have a daughter born in mid-2006. The best way to contact me is by email: firstname dot lastname at pobox dot com. I am happy to be emailed out of the blue by interesting people. If you are an LW regular you are probably an interesting person in the relevant sense even if you think you aren't.

If you're wondering why some of my very old posts and comments are at surprisingly negative scores, it's because for some time I was the favourite target of old-LW's resident neoreactionary troll, sockpuppeteer and mass-downvoter.



It is not true that "no pattern that suggests a value suggests any other", at least not unless you say more precisely what you are willing to count as a pattern.

Here's a template describing the pattern you've used to argue that 1+2+...=-1/12:

We define numbers $a_{n,k}$ with the following two properties. First, $\lim_{k\to\infty} a_{n,k} = n$, so that for each $k$ we can think of $(a_{1,k}, a_{2,k}, a_{3,k}, \ldots)$ as a sequence that's looking more and more like (1,2,3,...) as $k$ increases. Second, $\lim_{k\to\infty} s_k = -1/12$ where $s_k = \sum_n a_{n,k}$, so the sums of these sequences that look more and more like (1,2,3,...) approach -1/12.

(Maybe you mean something more specific by "pattern". You haven't actually said what you mean.)

Well, here are some $a_{n,k}$ to consider. When $n < k$ we'll let $a_{n,k} = n$. When $n = k$ we'll let $a_{n,k} = x - k(k-1)/2$. And when $n > k$ we'll let $a_{n,k} = 0$. Here, $x$ is some fixed number; we can choose it to be anything we like.

This array of numbers satisfies our first property: $\lim_{k\to\infty} a_{n,k} = n$. Indeed, once $k > n$ we have $a_{n,k} = n$, and the limit of an eventually-constant sequence is the thing it's eventually constant at.

What about the second property? Well, as you'll readily see I've arranged that for each $k$ we have $s_k = x$. So the sequence of sums converges to $x$.

In other words, this is a "pattern" that makes the sum equal to $x$. For any value of $x$ we choose.

I believe there are more stringent notions of "pattern" -- stronger requirements on how the $a_{n,k}$ approach $n$ for large $k$ -- for which it is true that every "pattern" that yields a finite sum yields $-1/12$. But does this actually end up lower-tech than analytic continuation and the like? I'm not sure it does.

(One version of the relevant theory is described at https://terrytao.wordpress.com/2010/04/10/the-euler-maclaurin-formula-bernoulli-numbers-the-zeta-function-and-real-variable-analytic-continuation.)
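
As a sanity check on the counterexample above, here is a minimal Python sketch. It assumes one concrete choice of array that fits the description: entry (n, k) equals n when n < k, equals x - k(k-1)/2 when n = k, and equals 0 when n > k. The names `a`, `row_sum` and `x` are mine, chosen only for illustration. Exact rational arithmetic is used so that the equalities are checked exactly rather than up to rounding.

```python
from fractions import Fraction

def a(n, k, x):
    """Entry (n, k) of the assumed array, for integers n, k >= 1."""
    if n < k:
        return Fraction(n)
    if n == k:
        # Cancels 1 + 2 + ... + (k-1) = k(k-1)/2 and leaves x behind.
        return Fraction(x) - Fraction(k * (k - 1), 2)
    return Fraction(0)

def row_sum(k, x):
    """Sum over n of a(n, k, x); entries with n > k are zero, so stop at k."""
    return sum(a(n, k, x) for n in range(1, k + 1))

x = Fraction(-1, 12)  # any other target value works equally well

# Second property: every sum s_k equals x, so the sums converge to x.
all_sums_equal_x = all(row_sum(k, x) == x for k in range(1, 50))

# First property: for fixed n, the entry is exactly n once k > n.
eventually_n = all(
    a(n, k, x) == n for n in range(1, 20) for k in range(n + 1, n + 30)
)
```

Swapping in a different `x` (say `Fraction(5)`) leaves both checks true, which is the whole point: this kind of "pattern" can be made to yield any sum at all.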

Once again you are making a ton of confident statements and offering no actual evidence. "is a high priority", "they want", "they don't want", "what they're aiming for is", etc. So far as I can see you don't in fact know any of this, and I don't think you should state things as fact that you don't have solid evidence for.

Let us suppose that social media apps and sites are, as you imply, in the business of trying to build sophisticated models of their users' mental structures. (I am not convinced they are -- I think what they're after is much simpler -- but I could be wrong, they might be doing that in the future even if not now, and I'm happy to stipulate it for the moment.)

If so, I suggest that they're not doing that just in order to predict what the users will do while they're in the app / on the site. They want to be able to tell advertisers "_this_ user is likely to end up buying your product", or (in a more paranoid version of things) to be able to tell intelligence agencies "_this_ user is likely to engage in terrorism in the next six months".

So inducing "mediocrity" is of limited value if they can only make their users more mediocre while they are in the app / on the site. In fact, it may be actively counterproductive. If you want to observe someone while they're on TikTok and use those observations to predict what they will do when they're not on TikTok, then putting them into an atypical-for-them mental state that makes them less different from other people while on TikTok seems like the exact opposite of what you want to do.

I don't know of any good reason to think it at all likely that social media apps/sites have the ability to render people substantially more "mediocre" permanently, so as to make their actions when not in the app / on the site more predictable.

If the above is correct, then perhaps we should expect social media apps and sites to be actively trying not to induce mediocrity in their users.

Of course it might not be correct. I don't actually know what changes in users' mental states are most helpful to social media providers' attempts to model said users, in terms of maximizing profit or whatever other things they actually care about. Are you claiming that you do? Because this seems like a difficult and subtle question involving highly nontrivial questions of psychology, of what can actually be done by social media apps and sites, of the details of their goals, etc., and I see no reason for either of us to be confident that you know those things. And yet you are happy to declare with what seems like utter confidence that of course social media apps and sites will be trying to induce mediocrity in order to make users more predictable. How do you know?

"Regression to the mean" is clearly an important notion in this post, what with being in the title and all, but you never actually say what you mean by it. Clearly not the statistical phenomenon of that name, as such.

(My commenting only on this should not be taken to imply that I find the rest of the post reasonable; I think it's grossly over-alarmist and like many of Trevor's posts treats wild speculation about the capabilities and intentions of intelligence agencies etc. as if it were established fact. But I don't think it likely that arguing about that will be productive.)

What's going on is that tailcalled's factor model doesn't in fact do a good job of identifying rationalists by their sociopolitical opinions. Or something like that.

[EDITED to add:] Here's one particular variety of "something like that" that I think may be going on: an opinion may be highly characteristic of a group even if it is very uncommon within the group. For instance, suppose you're classifying folks in the US on a left/right axis. If someone agrees with "We should abolish the police and close all the prisons" then you know with great confidence which team they're on, but I'm pretty sure the great majority of leftish people in the US disagree with it. If someone agrees with "We should bring back slavery because black people aren't fit to run their own lives" then you know with great confidence which team they're on, but I'm pretty sure the great majority of rightish people in the US disagree with it.

Tailcalled's model isn't exactly doing this sort of thing to rationalists -- if someone says "stories about ghosts are zero evidence of ghosts" then they have just proved they aren't a rationalist, not done something extreme but highly characteristic of (LW-style) rationalists -- but it's arguably doing something of the sort to a broader fuzzier class of people that are maybe as near as the model can get to "rationalists". Roughly the people some would characterize as "Silicon Valley techbros".
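
The point that an opinion can be highly characteristic of a group while being rare within it is just Bayes' rule. Here is a small sketch with made-up numbers (the group share and both agreement rates are hypothetical, not survey data): two equal-sized groups, where 2% of one group and 0.01% of the other agree with some statement.

```python
def p_group_given_opinion(p_group, p_opinion_in_group, p_opinion_outside):
    """P(member of group | holds opinion), by Bayes' rule."""
    holders_in = p_group * p_opinion_in_group
    holders_out = (1 - p_group) * p_opinion_outside
    return holders_in / (holders_in + holders_out)

# Hypothetical numbers: the opinion is rare inside the group (2%)
# but much rarer still outside it (0.01%).
p = p_group_given_opinion(p_group=0.5,
                          p_opinion_in_group=0.02,
                          p_opinion_outside=0.0001)
```

With these illustrative inputs `p` comes out above 0.99, even though 98% of the group disagrees with the statement. What matters is the ratio of the two agreement rates, not how common the opinion is within the group.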


There are definitely answers that your model wants rationalists to give but that I think are incompatible with LW-style rationalism. For instance:

  • "People's anecdotes about seeing ghosts aren't real evidence for ghosts" (your model wants "agree strongly"): of course people's anecdotes about seeing ghosts are evidence for ghosts; they are more probable if ghosts are real than if they aren't. They're just really weak evidence for ghosts and there are plenty of other reasons to think there aren't ghosts.
  • "We need more evidence that we would benefit before we charge ahead with futuristic technology that might irreversibly backfire" (your model wants "disagree" or "disagree strongly"): there's this thing called the AI alignment problem that a few rationalists are slightly concerned about, you might have heard of it.
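
To make the "evidence, but really weak evidence" point about ghost anecdotes concrete, here is a sketch of a Bayesian update in odds form. All three numbers are invented for illustration: a tiny prior for ghosts, and a sighting report that is assumed to be twice as likely if ghosts are real as if they aren't.

```python
def posterior(prior, p_report_if_real, p_report_if_not):
    """P(H | report) from the prior P(H) and the two likelihoods."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_report_if_real / p_report_if_not
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior = 1e-6  # hypothetical prior probability that ghosts exist
post = posterior(prior, p_report_if_real=0.02, p_report_if_not=0.01)
```

Under these assumptions the anecdote roughly doubles the probability, so it does count as evidence, but the result is still only about two in a million. A likelihood ratio of exactly 1 would leave the prior unchanged, which is what "zero evidence" would actually mean.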

And several others where I wouldn't go so far as to say "incompatible" but where I confidently expect most LWers' positions not to match your model's predictions. For instance:

  • "It is morally important to avoid making people suffer emotionally": your model wants not-agreement, but I think most LWers would agree with this.
  • "Workplaces should be dull to reflect the oppressiveness of work": your model wants not-disagreement, but I think most LWers would disagree (though probably most would think "hmm, interesting idea" first).
  • "Religious people are very stupid": your model wants agreement, but I think most LWers are aware that there are plenty of not-very-stupid religious people (indeed, plenty of very-not-stupid religious people) and I suspect "disagree strongly" might be the most common response from LWers.

I don't claim that the above lists are complete. I got 11/24 and I am pretty sure I am nearer the median rationalist than that might suggest.

I don't have particularly strong opinions and think you should do whatever you like with your name, but just as a datapoint I (1) didn't think "the gears to ascension" was either so cool a name as to demand respect or so stupid a name as to preclude it, and (2) don't think the "often wrong" in your name will make much difference to how I read your comments.

I don't think it ever occurred to me to think that calling yourself "the gears to ascension" amounted to claiming to be a key part of some transhumanist project or anything like that. The impression it gave me was "transhumanist picking a name that sounds cool to them".

The "often wrong" provokes the following thoughts: (1) this person is aware of often being wrong, which is more than most people are, so maybe take them more seriously? (2) this person is, by their own account, often wrong, so maybe take them less seriously? (3) this person is maybe doing a sort of defensive self-deprecatory fishing-for-compliments thing, so maybe take them less seriously? But all of these are pretty weak effects, and I think 2+3 more or less exactly cancel out 1.

"Lauren (often wrong)" is probably about as memorable as "the gears to ascension". If your goal is to have all your comments stand on their own, then aside from the one-off effect of reducing the association between things said by "Lauren" and things said by "gears" I don't think the change will do much one way or the other. "Lauren" on its own is probably less memorable and your comments might be treated as more independent of one another if you just called yourself that. (But there appear already to be two users called just Lauren, so something slightly more specific might be better.)


The trouble with these rules is that they mean that someone saying "I played the AI-box game and I let the AI out" gives rather little evidence that that actually happened. For all we know, maybe all the stories of successful AI-box escapes are really stories where the gatekeeper was persuaded to pretend that they let the AI out of the box (maybe they were bribed to do that; maybe they decided that any hit to their reputation for strong-mindedness was outweighed by the benefits of encouraging others to believe that an AI could get out of the box; etc.). Or maybe they're all really stories where the AI-player's ability to get out of the box depends on something importantly different between their situation and that of a hypothetical real boxed AI (again, maybe they bribed the gatekeeper and the gatekeeper was willing to accept a smaller bribe when the outcome was "everyone is told I let the AI out" rather than whatever an actual AI might do once out of the box; etc.).

Of course, even without those rules it would still be possible for gatekeepers to lie about the results. But if e.g. a transcript were released then there'd be ways to try to notice those failure modes. If the gatekeeper-player lets the AI-player out of the box and a naysayer says "bah, I wouldn't have been convinced", that could be self-delusion on the naysayer's part (or unawareness that someone playing against them might have adopted a different method that would have worked better on them) but it could also be that the gatekeeper-player really did let the AI-player out "too easily" in a way that wouldn't transfer to the situations the game is meant to build intuitions about.


Do you have some concrete examples where you've explained how some substantial piece of the case for AI accident risk is a matter of word games?
