Recent Discussion

There's one thing history seems to have been trying to teach us: that the contents of the future are determined by power, economics, politics, and other conflict-theoretic matters.

Turns out, nope!

Almost all of what the future contains is determined by which of the two following engineering problems is solved first:

  • How to build a superintelligent AI (if solved first, everyone dies forever)
  • How to build an aligned superintelligent AI (if solved first, everyone gets utopia)

…and almost all of the reasons that the former is currently a lot more likely are mistake theory reasons.

The people currently taking actions that increase the probability that {the former is solved first} are not evil people trying to kill everyone, they're confused people who think that their actions are actually increasing the probability that {the...

TekhneMakre (17m)
I appreciate these views being stated clearly, and at once feel a positive feeling toward the author, and also am shaking my head No. As others have pointed out, the mistake theory here is confused. I think it's not exactly wrong. The way in which it's right is this: In other words, most people are amenable to reason on this point, in the sense that they'd respond to reasons to not do something that they've been convinced of. This is not without exception; some players, e.g. Larry Page (according to Elon Musk), want AGI to take the world from humanity. The way in which the mistake theory is wrong is this: So it's not just a mistake. It's a choice, that choice has motivations, and those motivations are in conflict with our motivations, insofar as they shelter themselves from reason.
Gerald Monroe (42m)
I was focusing on what the [containable, uncontainable] continuum of possibilities means. But ok, looking at this table:

  • Scenario 4 (A & B: rush maximally): P(A wins) = 0.01 if rogue utility is large, 0.5 if rogue utility is small; P(B wins) likewise. Equilibrium? Yes. Note this is the historical outcome for most prior weapons technologies (chemical and biological weapons being exceptions).
  • Scenario 8 (A: toothy ban, B: join the ban): P(A wins) = 0.5, P(B wins) = 0.5. Equilibrium? Yes, but unstable. It's less and less stable the more parties there are. If it's A....J, then at any moment, at least one party may be "line toeing". This is unstable if there is a marginal gain from line toeing - the party with a slightly stronger, barely legal AGI has more GDP, which eventually means they begin to break away from the pack. This breaks down to 2, then 3 in your model, then settles on 4.

The historical example I can think of would be the treaties on weapons post-WW1. It began to fail with line toeing. https://en.wikipedia.org/wiki/Arms_control

"The United States developed better technology to get better performance from their ships while still working within the weight limits, the United Kingdom exploited a loop-hole in the terms, the Italians misrepresented the weight of their vessels, and when up against the limits, Japan left the treaty. The nations which violated the terms of the treaty did not suffer great consequences for their actions. Within little more than a decade, the treaty was abandoned."

It seems reasonable to assume this is a likely outcome for AI treaties. For this not to be the actual outcome, something has to have changed from the historical examples - which included plenty of nuclear blackmail threats - to the present day. What has changed? Do we have a rational reason to think it will go any differently? Note also this line toeing behavior is happening right now from China and Nvidia.

Rogue utility is the other parameter we need to add to thi
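A minimal numerical sketch of the best-response check behind that table (the payoff function, the 0.95 "sole rusher" win probability, and the rogue_risk / rogue_loss values are all illustrative assumptions, not numbers from the thread):

```python
from itertools import product

ACTIONS = ("rush", "ban")

def payoff(me, other, rogue_risk, rogue_loss):
    """Toy expected payoff for one party: +1 if its AGI wins, 0 if the other's wins,
    -rogue_loss if a rogue AGI wins. Assumes any party rushing creates rogue risk."""
    g = rogue_risk if "rush" in (me, other) else 0.0   # P(rogue AGI wins)
    if me == "rush" and other == "rush":
        w = (1 - g) / 2            # symmetric race for whatever isn't lost to the rogue
    elif me == "rush":
        w = 0.95 * (1 - g)         # sole rusher almost certainly wins the remainder
    else:
        w = 0.0                    # unilateral ban: you don't win the race
    return w - g * rogue_loss

def equilibria(rogue_risk, rogue_loss):
    """Return the pure-strategy Nash equilibria of the two-party game."""
    eqs = []
    for a, b in product(ACTIONS, repeat=2):
        a_best = all(payoff(a, b, rogue_risk, rogue_loss) >= payoff(x, b, rogue_risk, rogue_loss)
                     for x in ACTIONS)
        b_best = all(payoff(b, a, rogue_risk, rogue_loss) >= payoff(x, a, rogue_risk, rogue_loss)
                     for x in ACTIONS)
        if a_best and b_best:
            eqs.append((a, b))
    return eqs

# Small rogue loss: only mutual rushing is an equilibrium.
print(equilibria(rogue_risk=0.02, rogue_loss=1))
# Large rogue risk/loss: each rusher wins with probability ~0.01, yet mutual rushing
# is still an equilibrium, and the mutual ban becomes one as well.
print(equilibria(rogue_risk=0.98, rogue_loss=10))
```

Under these toy assumptions, mutual rushing stays an equilibrium even when rogue losses dominate (a unilateral ban only forfeits your win probability), while whether the mutual ban is also an equilibrium depends on how much a lone defector stands to gain - the "line toeing" margin.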

I was focusing on what the [containable, uncontainable] continuum of possibilities means.

Ahh, you mean, what's the expected utility of having a controlled AGI of power X vs. the expected loss of having a rogue AGI of the same power? And how does the expected payoff of different international strategies change as X gets larger?

Hm. Let's consider just A's viewpoint, and strategies of {steady progress, accelerate, ban only domestically, ban internationally}.

  • Steady progress is always viable up to the capability level where AGI becomes geopolitically relevant; let's ca
... (read more)
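One way to make that comparison concrete - a toy sketch in which the capability scale x, the risk curve, and the utopia / rogue_loss payoffs are all invented for illustration, not taken from the comment:

```python
import math

def p_rogue_given_development(x):
    """Assumed risk curve: probability that development at capability x goes rogue."""
    return 1 - math.exp(-0.5 * x)

def expected_payoff(strategy, x, utopia=100.0, rogue_loss=1000.0):
    """Toy expected utility for party A of pursuing `strategy` at capability level x."""
    if strategy == "ban internationally":
        return 0.0                                    # status quo preserved (by assumption)
    if strategy == "ban only domestically":
        # Others still develop; A forgoes the upside but keeps the downside risk.
        return -p_rogue_given_development(x) * rogue_loss
    speed = {"steady progress": 1.0, "accelerate": 1.5}[strategy]
    p_rogue = min(1.0, speed * p_rogue_given_development(x))
    return (1 - p_rogue) * utopia - p_rogue * rogue_loss

STRATEGIES = ["steady progress", "accelerate", "ban only domestically", "ban internationally"]
for x in (0.1, 0.5, 1.0, 2.0):
    best = max(STRATEGIES, key=lambda s: expected_payoff(s, x))
    print(f"capability {x}: best strategy under these assumptions -> {best}")
```

With these made-up numbers, pushing ahead wins at low capability and an international ban dominates once the rogue risk at that capability level gets large; where that crossover sits for more realistic curves is the substantive question.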
Thane Ruthenis (44m)
I think "you get about five words" is mostly right... It's just that it's "five words per message", not "five words on the issue, total". You have to explain in short bursts, but you can keep building on your previous explanations.

I have published a paper in the Journal of Artificial Intelligence and Consciousness about how to take into account the interests of non-human animals and digital minds in A(S)I value alignment.

For the published version of the paper see: https://www.worldscientific.com/doi/10.1142/S2705078523500042 

For a pdf of the final draft of the paper see: https://philpapers.org/rec/MORTIA-17 

Below I have copy-pasted the body of the paper, for those of you who are interested, though please cite the published version at: https://www.worldscientific.com/doi/10.1142/S2705078523500042 

Cross-Posted at the EA Forum: https://forum.effectivealtruism.org/posts/pNHH953sgSConBmzF/taking-into-account-sentient-non-humans-in-ai-ambitious 

Summary

Abstract: Ambitious value learning proposals to solve the AI alignment problem and avoid catastrophic outcomes from a possible future misaligned artificial superintelligence (such as Coherent Extrapolated Volition [CEV]) have focused on ensuring that an artificial superintelligence (ASI) would try to do what humans would want it to do. However, present...

Given the importance of the word "sentient" in this Sentientist Coherent Extrapolated Volition proposal, it would have been helpful if you had clearly defined it. You make it clear that your definition includes non-human animals, so evidently you don't mean the same thing as "sapient". In a context including animals, "sentient" is most often used to mean something like "capable of feeling pain, having sensory impressions, etc." That doesn't have a very clear lower cutoff (is an amoeba sentient?), but would presumably include, for example, ants, wh... (read more)

RogerDearnaley (1h)
An interesting synchronicity: I just posted a sequence AI, Alignment, and Ethics on some rather similar ideas (which I've been thinking about for roughly a decade). See in particular Parts 3. Uploading, 4. A Moral Case for Evolved-Sapience-Chauvinism and 5. The Mutable Values Problem in Value Learning and CEV for some alternative suggestions on this subject.

Hello! This is jacobjacob from the LessWrong / Lightcone team. 

This is a meta thread for you to share any thoughts, feelings, feedback or other stuff about LessWrong, that's been on your mind. 

Examples of things you might share: 

  • "I really like agree/disagree voting!"
  • "What's up with all this Dialogues stuff? It's confusing... 
  • "Hm... it seems like recently the vibe on the site has changed somehow... in particular [insert 10 paragraphs]"

...or anything else! 

The point of this thread is to give you an affordance to share anything that's been on your mind, in a place where you know that a team member will be listening. 

(We're a small team and have to prioritise what we work on, so I of course don't promise to action everything mentioned here. But I will at least listen...

Thanks for your feedback. I certainly appreciate your articles and I share many of your views. Reading what you had to say, along with Quentin, Jacob Cannell, and Nora, was a very welcome alternative take that expanded my thinking and changed my mind. I have changed my mind a lot over the last year, from thinking AI was a long way off and Yud/Bostrom were basically right, to seeing that it's a lot closer and that theories without data are almost always wrong in many ways - e.g. SUSY was expected to be true for decades by most of the world's smartest physicists. Many ali... (read more)

NicholasKees (20m)
What about leaning into the word-of-mouth sharing instead, and supporting that with features? For example, making it as effortless as possible to recommend posts to people you know from within LW?
habryka (18m)
Not crazy. I also think doing things that are a bit more social, where you have ways to recommend (or disrecommend) a post with less anonymity attached, allowing us to propagate that information further, is not crazy - though I am worried about that incentivizing more groupthink and weird social dynamics.
Charlie Steiner (25m)
Yeah, fair enough.

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

MondSemmel (37m)
If the Github issue tracker is indeed not in use, then I find that very disappointing. Intercom may be a more reliable channel for reporting bugs than the alternatives (though even on Intercom, I've still had things slip through the cracks), but it can't replace an issue tracker. Besides, not all feedback constitutes bug reports; other items require a back-and-forth, or input from multiple people, or follow-up to ask "hey, what's the status here?", and all of that works much better when it's asynchronous and in public, not in a private chat interface.

And this very comment thread is a good illustration of why open thread comments also don't work for this purpose: they might not get noticed; the threads are kind of ephemeral; feedback is mixed with non-feedback; the original poster has no way to keep track of their feedback (I had to skim through all my recent comments to find the ones that were feedback); not everyone related to an issue gets notified when someone comments on the issue; if issues are discussed in disparate threads, there's no bi-directional crosslinking (if Github issue A links to B, then B displays the link, too); etc.

Ultimately, whatever tools the LW team uses to manage the website development may work well for them. But when I want to help as an outsider, I feel like the tools I'm given are not up to snuff. It seems to me like a public issue tracker is an obvious solution to this problem, so I'm kind of incredulous that there isn't really one. What gives?
kave (34m)
It's (as a descriptive fact) not a priority to support external contributions to the codebase. My guess is that it's also correct not to prioritise that.

I understand that that's obviously the counter-perspective; it just seems so wild to me. I'd love to see or do a dialogue on this, with anyone on the team for whom it would matter if they changed their mind about deprioritising this topic.

habryka (1h)
I think the EA Forum uses it actively. But the LW team doesn't at all.

Yes, it happened before for me as well. I think it would be good to have profile pictures to make it easier to recognize users.

ChristianKl (31m)
It's good to use prediction markets in practice, but most people who read the post likely don't get that much value from reading it. Larry McEnerney is good at explaining that good writing isn't writing that's cool or interesting, but simply writing that provides value to the reader. As far as the actual execution goes, it might have been better to create fewer markets and focus on fewer experiments, so that each one gets more attention.
Viliam (9h)
At this moment, the post has 25 karma, which is not bad. From my perspective, positive karma is good, negative karma is bad, but 4x higher karma doesn't necessarily mean 4x better -- it could also mean that more people noticed it, more people were interested, it was short to read so more people voted, etc. So I think that partially you are overthinking it, and partially you could have made the introduction shorter (basically to reduce the number of lines someone must read before they decide that they like it).
niplav (9h)
Yeah, when I posted the first comment in here, I think it had 14? I was maybe just overly optimistic about the amount of trading that'd happen on the markets.

The Less Wrong General Census is unofficially here! You can take it at this link.

It's that time again.

If you are reading this post and identify as a LessWronger, then you are the target audience. I'd appreciate it if you took the survey. If you post, if you comment, if you lurk, if you don't actually read the site that much but you do read a bunch of the other rationalist blogs or you're really into HPMOR, if you hung out on rationalist tumblr back in the day, or if none of those exactly fit you but I'm maybe getting close, I think you count and I'd appreciate it if you took the survey.

Don't feel like you have to answer all of the questions just because you started...

Completed the survey, skipping only a few questions.

Vanessa Kosoy (10h)
Imagine that, for every question, you will have to pay ε·ln(1/p) dollars if the event you assigned a probability p occurs. Here, ε > 0 is some sufficiently small constant (this assumes your strategy doesn't fluctuate as ε approaches 0). Answer in the optimal way for that game, according to whatever decision theory you follow. (But choosing which questions to answer is not part of the game.)
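A quick numerical illustration of why this game rewards reporting your actual credence, reading the rule as charging you for whichever outcome occurs according to the probability you assigned to it (the belief value 0.7 and ε = 0.01 are arbitrary examples):

```python
import math

eps = 0.01       # the small constant from the comment
belief = 0.7     # your actual credence that the event occurs (arbitrary example)

def expected_payment(report):
    """Expected cost of reporting `report` when your credence is `belief`:
    pay eps*ln(1/report) if the event occurs, eps*ln(1/(1-report)) if its negation does."""
    return eps * (belief * math.log(1 / report)
                  + (1 - belief) * math.log(1 / (1 - report)))

reports = [i / 100 for i in range(1, 100)]
best = min(reports, key=expected_payment)
print(best)  # -> 0.7: the logarithmic penalty is a proper scoring rule, so honesty is optimal
```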
red75prime (14h)
What if all I can assign is a probability distribution over probabilities? Like in the extraterrestrial life question. All that can be said is that extraterrestrial life is sufficiently rare that we haven't found evidence of it yet. Our observation of our existence is conditioned on our existence, so it doesn't provide much evidence one way or another. Should I sample the distribution to give an answer, or maybe take the mode, or mean, or median? I've chosen a value that is far from both extremes, but I might have done something else, with no clear justification for any of the choices.
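For what it's worth, a quick numerical check of the mode/mean/median question under the payment game above (Beta(2, 6) is just an arbitrary stand-in for a distribution over probabilities):

```python
import math
import random

random.seed(0)
samples = [random.betavariate(2, 6) for _ in range(100_000)]   # draws of the uncertain p

mean = sum(samples) / len(samples)
median = sorted(samples)[len(samples) // 2]
mode = (2 - 1) / (2 + 6 - 2)                                   # Beta(a, b) mode = (a-1)/(a+b-2)

def expected_penalty(report):
    """Average log payment over the distribution of p (the constant eps only rescales this)."""
    return sum(p * math.log(1 / report) + (1 - p) * math.log(1 / (1 - report))
               for p in samples) / len(samples)

for name, value in [("mode", mode), ("median", median), ("mean", mean)]:
    print(f"{name}: report {value:.3f}, expected penalty {expected_penalty(value):.4f}")
# The mean comes out lowest: the expected penalty is linear in p, so it depends only on E[p].
```

Because the expected payment depends only on the distribution's average, under that particular game the mean is the report it rewards; that doesn't settle how to answer questions scored some other way.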

When people talk about prosaic alignment proposals, there’s a common pattern: they’ll be outlining some overcomplicated scheme, and then they’ll say “oh, and assume we have great interpretability tools, this whole thing just works way better the better the interpretability tools are”, and then they’ll go back to the overcomplicated scheme. (Credit to Evan for pointing out this pattern to me.) And then usually there’s a whole discussion about the specific problems with the overcomplicated scheme.

In this post I want to argue from a different direction: if we had great interpretability tools, we could just use those to align an AI directly, and skip the overcomplicated schemes. I’ll call the strategy “Just Retarget the Search”.

We’ll need to make two assumptions:

...
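For concreteness, a deliberately toy sketch of the idea - the SearchingAgent class, the find_target_slot helper, and the goal strings are all hypothetical stand-ins, since actually locating a retargetable internal search and a trustworthy goal representation is exactly what the post's interpretability assumptions are about:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class SearchingAgent:
    """Toy stand-in for a trained model that internally runs a general-purpose search."""
    search: Callable[[Any], Any]   # general-purpose problem solver: target -> plan
    target: Any                    # internal representation of what the search optimizes for

    def act(self) -> Any:
        return self.search(self.target)

def find_target_slot(agent: SearchingAgent) -> str:
    """Stand-in for the hard interpretability step: locating the retargetable
    'target' representation inside the model."""
    return "target"

def just_retarget_the_search(agent: SearchingAgent, desired_goal_repr: Any) -> None:
    """Once the target slot is identified, overwrite it with the goal we actually want."""
    setattr(agent, find_target_slot(agent), desired_goal_repr)

# Usage: a toy "misaligned" agent gets retargeted in place.
agent = SearchingAgent(search=lambda goal: f"plan for: {goal}", target="maximize paperclips")
just_retarget_the_search(agent, desired_goal_repr="whatever goal representation we trust")
print(agent.act())  # -> plan for: whatever goal representation we trust
```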

This post expresses an important idea in AI alignment that I have essentially believed for a long time, and which I have not seen expressed elsewhere. (I think a substantially better treatment of the idea is possible, but this post is fine, and you get a lot of points for being the only place where the idea is shared.)