Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

xAI has now launched. It seems worthwhile to discuss both what it means for AI safety and whether people interested in AI safety should consider applying to the company.

Some thoughts:

It's notable that Dan Hendrycks is listed as an advisor (the only advisor listed).

The team is also listed on the page.

I haven't taken the time to do so, but it might be informative for someone to Google the individuals listed to see where their interests fall on the spectrum between capabilities and safety.

The only team member whose name is on the CAIS extinction risk statement is Tony (Yuhuai) Wu.

(Though not everyone who signed the statement is listed under it, especially the less famous signatories. And I know one person on the xAI team who privately expressed concern about AGI safety around 2017.)

Igor Babuschkin has also signed it.


This has no direct connection to AI safety, but when I saw that the supposed goal is "to understand the true nature of the universe", I took it personally, because I've spent so much effort on precisely that goal: taught myself everything up to string theory, studied philosophers both famous and obscure, developed my own hypotheses... Maybe this is how human artists feel about the advent of AI art. 

But if I put those first feelings aside, and ask myself how this might turn out: 

One possibility is that it becomes nothing more than a summarizer of known opinions, a mix of Google Scholar, Wikipedia, and the Stanford Encyclopedia of Philosophy, brought to life in typical LLM fashion. 

Another is that it becomes a mouthpiece for whatever philosophy Elon Musk and his crew of AI technicians already implicitly prefer. 

Still another possibility is that a few, or even many, recognized human thinkers sign on as consultants regarding the philosophical or scientific issues that X.AI is ultimately meant to answer. 

An obvious issue is that Musk wants his AI to engage in genius-level thought, and no one knows how to automate that. You can discuss philosophy, or anything you like, with existing AIs, and the results can even be interesting, but we don't really understand the process that produces those results. (Though perhaps we have a better chance of understanding it, than of understanding human philosophical cognition.) 

Musk has also tweeted "It’s not AGI until it can solve at least one fundamental physics problem". Whether he means a solution to a quantitative problem or an ontological problem, he doesn't say. 

So a further possibility is that X.AI will eventually function as intended, as an oracle regarding the very deepest questions we have. There are multiple possible outcomes here: it finds the right answers and it is obvious that they are right, it finds the wrong answers but they are treated as the right answers, it reveals hard limits to what this form of computation can accomplish... 

Again depending on what kind of artificial thinker it is, X.AI could have an immediate impact on the highest form of AI safety, the "superalignment" problem to which CEV was proposed as an answer. Superalignment, as I see it, is about producing an autonomous superintelligence that is nonetheless human-friendly. That means, among other things, giving it an ethical and metaethical philosophy; a notion of good and evil. If X.AI can tell us the nature of reality, that means it can also tell us the nature of good and evil. 

Of course, the problem is that if X.AI is just a language model of the kind we already know about, it may become nothing more than a glib and consistent mouthpiece for a predetermined philosophy, rather than an independent thinker with its own criteria for resolving intellectual questions.