There's one thing history seems to have been trying to teach us: that the contents of the future are determined by power, economics, politics, and other conflict-theoretic matters.
Turns out, nope!
Almost all of what the future contains is determined by which of the two following engineering problems is solved first:
…and almost all of the reasons that the former is currently a lot more likely are mistake theory reasons.
The people currently taking actions that increase the probability that {the former is solved first} are not evil people trying to kill everyone, they're confused people who think that their actions are actually increasing the probability that {the...
I was focusing on what the [containable, uncontainable] continuum of possibilities means.
Ahh, you mean, what's the expected utility of having a controlled AGI of power X vs. the expected loss of having a rogue AGI of the same power? And how does the expected payoff of different international strategies change as X gets larger?
Hm. Let's consider just A's viewpoint, and strategies of {steady progress, accelerate, ban only domestically, ban internationally}.
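To make that concrete, here's a minimal toy sketch of the kind of comparison being gestured at, from A's viewpoint, across those four strategies. None of it comes from the original discussion: every number, probability curve, and scaling assumption below is an invented placeholder purely to illustrate how the best strategy can flip as X grows.

```python
# Toy sketch (not from the original discussion): a hypothetical payoff model for
# country A, comparing strategies as AGI capability X grows. All numbers and
# functional forms here are made-up assumptions purely for illustration.

import math

def p_rogue(x: float, acceleration: float) -> float:
    """Assumed probability that an AGI of power x goes rogue; rises with x
    and with how much A accelerates (less time for safety work)."""
    return 1 - math.exp(-0.1 * x * (1 + acceleration))

def expected_payoff(x: float, strategy: str) -> float:
    """Expected utility to A: gain from a controlled AGI of power x,
    minus loss from a rogue AGI of the same power, weighted by p_rogue."""
    accel = {"steady progress": 0.0, "accelerate": 1.0,
             "ban domestically": -0.5, "ban internationally": -1.0}[strategy]
    gain = x                      # assumed: controlled-AGI value scales ~linearly
    loss = x ** 2                 # assumed: rogue-AGI damage scales faster
    p = p_rogue(x, accel)
    # Bans reduce A's own upside (others may still build); modeled crudely here.
    capture = {"steady progress": 1.0, "accelerate": 1.0,
               "ban domestically": 0.3, "ban internationally": 0.6}[strategy]
    return capture * (1 - p) * gain - p * loss

for x in (1, 5, 10):
    best = max(["steady progress", "accelerate", "ban domestically", "ban internationally"],
               key=lambda s: expected_payoff(x, s))
    print(f"X={x}: best strategy for A under these toy assumptions -> {best}")
```

Under these made-up assumptions, "steady progress" wins at small X and "ban internationally" wins once the rogue-AGI loss term dominates; the point is only the shape of the argument, not the numbers.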
I have published a paper in the Journal of Artificial Intelligence and Consciousness about how to take into account the interests of non-human animals and digital minds in A(S)I value alignment.
For the published version of the paper see: https://www.worldscientific.com/doi/10.1142/S2705078523500042
For a pdf of the final draft of the paper see: https://philpapers.org/rec/MORTIA-17
Below I have copy-pasted the body of the paper, for those of you who are interested, though please cite the published version at: https://www.worldscientific.com/doi/10.1142/S2705078523500042
Cross-Posted at the EA Forum: https://forum.effectivealtruism.org/posts/pNHH953sgSConBmzF/taking-into-account-sentient-non-humans-in-ai-ambitious
Abstract: Ambitious value learning proposals to solve the AI alignment problem and avoid catastrophic outcomes from a possible future misaligned artificial superintelligence (such as Coherent Extrapolated Volition [CEV]) have focused on ensuring that an artificial superintelligence (ASI) would try to do what humans would want it to do. However, present...
Given the importance of the word "sentient" in this Sentientist Coherent Extrapolated Volition proposal, it would have been helpful if you had clearly defined it. You make it clear that your definition includes non-human animals, so evidently you don't mean the same thing as "sapient". In a context including animals, "sentient" is most often used to mean something like "capable of feeling pain, having sensory impressions, etc." That doesn't have a very clear lower cutoff (is an amoeba sentient?), but would presumably include, for example, ants, wh...
Hello! This is jacobjacob from the LessWrong / Lightcone team.
This is a meta thread for you to share any thoughts, feelings, feedback or other stuff about LessWrong, that's been on your mind.
Examples of things you might share:
...or anything else!
The point of this thread is to give you an affordance to share anything that's been on your mind, in a place where you know that a team member will be listening.
(We're a small team and have to prioritise what we work on, so I of course don't promise to action everything mentioned here. But I will at least listen...
Thanks for your feedback. I certainly appreciate your articles and I share many of your views. Reading what you had to say, along with Quentin, Jacob Cannell, and Nora, was a very welcome alternative take that expanded my thinking and changed my mind. I have changed my mind a lot over the last year, from thinking AI was a long way off and Yud/Bostrom were basically right, to seeing that it's a lot closer and that theories without data are almost always wrong in many ways - e.g. SUSY was expected to be true for decades by most of the world's smartest physicists. Many ali...
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
I understand that that's obviously the counter-perspective; it just seems so wild to me. I'd love to see or do a dialogue on this, with anyone on the team for whom it would matter if they changed their mind on deprioritising this topic.
Yes, it has happened to me before as well. I think it would be good to have profile pictures to make it easier to recognize users.
The Less Wrong General Census is unofficially here! You can take it at this link.
It's that time again.
If you are reading this post and identify as a LessWronger, then you are the target audience. I'd appreciate it if you took the survey. If you post, if you comment, if you lurk, if you don't actually read the site that much but you do read a bunch of the other rationalist blogs or you're really into HPMOR, if you hung out on rationalist tumblr back in the day, or if none of those exactly fit you but I'm maybe getting close, I think you count and I'd appreciate it if you took the survey.
Don't feel like you have to answer all of the questions just because you started...
When people talk about prosaic alignment proposals, there’s a common pattern: they’ll be outlining some overcomplicated scheme, and then they’ll say “oh, and assume we have great interpretability tools, this whole thing just works way better the better the interpretability tools are”, and then they’ll go back to the overcomplicated scheme. (Credit to Evan for pointing out this pattern to me.) And then usually there’s a whole discussion about the specific problems with the overcomplicated scheme.
In this post I want to argue from a different direction: if we had great interpretability tools, we could just use those to align an AI directly, and skip the overcomplicated schemes. I’ll call the strategy “Just Retarget the Search”.
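To give a rough sense of the shape of the strategy (this sketch is not from the original post), here is a hypothetical outline in code, assuming interpretability tools good enough to locate a general-purpose search process inside a trained model and the internal representation of the target it optimizes for. Every name here (find_search_module, encode_target, etc.) is an invented placeholder, not a real API.

```python
# Hypothetical sketch of "Just Retarget the Search" (placeholder names, not a real API).
# Assumes interpretability tools can (1) locate a general-purpose search process inside
# the trained model and (2) locate the internal representation of the target that the
# search is optimizing for.

from dataclasses import dataclass

@dataclass
class SearchModule:
    """Stand-in for an internal general-purpose search found by interpretability tools."""
    target_representation: object  # whatever internal encoding the search optimizes for

def find_search_module(model) -> SearchModule:
    """Assumed capability: interpretability tools identify the model's internal search."""
    raise NotImplementedError("this is the hard interpretability part")

def encode_target(model, spec: str) -> object:
    """Assumed capability: translate an alignment target into the model's own
    internal representation, so the search can point at it."""
    raise NotImplementedError("also hard")

def just_retarget_the_search(model, alignment_target: str):
    search = find_search_module(model)
    # The whole alignment scheme: overwrite the search's target in place.
    search.target_representation = encode_target(model, alignment_target)
    return model
```

The point of the sketch is that, conditional on the interpretability assumptions, the alignment step itself is a single overwrite rather than an elaborate training scheme.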
We’ll need to make two assumptions:
This post expresses an important idea in AI alignment that I have essentially believed for a long time and have not seen expressed elsewhere. (I think a substantially better treatment of the idea is possible, but this post is fine, and you get a lot of points for being the only place the idea is written up.)