Wiki Contributions


I think I'm at least close to agreeing, but even if it's like this now, it doesn't mean that a complex-positive-value optimizer can produce more value mass than a simple-negative-value optimizer.

On the definition question, in addition to what localdeity wrote:

  1. I assume that on the first axis we'd consider the position "interested in two people" already pretty non-monogamous, while the position "my partner can have sexual or romantic relationships with anyone except one particular person" is still very poly. If that's the case, your position along the "interested in one"/"interested in many" axis can easily change if the set of people you know changes, even slightly. This position isn't a fact about you; it's more a fact about you and your options in your current environment. In contrast, your position along the "restricts partner"/"doesn't restrict partner" axis can't change much if your partner's environment changes slightly. So if we want a definition of something stable and identity-related using one axis, the second axis is better suited for this purpose.
  2. In practice (your experience may vary), the definition that uses the "interested in one"/"interested in many" axis tends to provoke missing-the-point arguments from monogamous people, like "I barely have time for one partner". I think that if the restriction-based definition were generally accepted, the discourse would be better.

This totally makes sense! But "proteins are held together by van der Waals forces that are much weaker than covalent bonds" is still bad communication.

As for 4: even just remembering anything is a self-modification of memory.

That's for humans, not abstract agents? I don't think it matters; we're talking about other self-modifications anyway.

From your problem description

Not mine :)

utility on other branches

Maybe this interpretation is what repels you? Here are two others:

  • You choose whether to behave like an EDT-agent or like an FDT-agent in the situations where it matters in advance, before you get into (1) or (3). And you can't, legibly for predictors like the one in this game, decide to behave like an FDT-agent and then, in the future, when you end up in (1) because you're unlucky, just change your mind. It's just not an option. And between the options "legibly choose to behave like an EDT-agent" and "legibly choose to behave like an FDT-agent", the second one is clearly better in expectation. You don't make another choice in (1) or (2); it's already decided.
  • If you find yourself in (1) or (2), you can't differentiate between the cases "I am the real me" and "I am the model of myself inside the predictor" (because if you could, you could behave differently in these two cases, and it would be a bad model and a bad predictor). So you decide for both at once. (This interpretation doesn't work well for agents with explicitly self-indicated values (or however that's called? I hope it's clear what I mean).)

The earlier decision to precommit (whether actually made or later simulated/hallucinated) sacrifices utility of some future selves in exchange for greater utility to other future selves.

Yes. It's like choosing to win on 1-5 on a die roll rather than to win on 6. You sacrifice utility of some future selves (in the worlds where the die rolls 6) in exchange for greater utility for other future selves, and it's perfectly rational.

We can also construct more specific variants of 5 where FDT loses - such as environments where the message at step B is from an anti-Omega which punishes FDT like agents.

Ok, yes. You can do it with all other types of agents too.

But naturally a powerful EDT agent will simply adopt that universal precommitment when it believes it is in a universe distribution where doing so is optimal!

I think the ability to legibly adopt such a precommitment, and the willingness to do so, kinda turns an EDT-agent into an FDT-agent.

Well, yes, it loses in (1), but that's fine, because it wins in (4) and (5) and is on par with the EDT-agent in (3). (1) is not the full situation in this game; it's always a consequence of (3), (4), or (5), depending on the interpretation. The rules don't make sense otherwise.

PS. If an FDT-agent is suddenly teleported into situation (1) in place of some other agent by some powerful entity that can deceive the predictor, and the predictor predicted the behaviour of the other agent who was in the game before, and the FDT-agent knows all this, it obviously takes the $1. Why not?

An FDT-agent obviously never decides "I will never ever take the $1 from the box". It decides "I will not take the $1 in the box if the rules of the situation I'm in are like <rules of this game>".

Only it's more general, something like: "When I realise that it would have been better if I had made some precommitment earlier, I act as I would have acted if I had actually made it" (not sure this phrasing is fully correct in all cases).

An EDT-agent in (5) goes into (1) with 99% probability and into scenario (2) with 1% probability. It wins 99%·$1 + 1%·$100 = $1.99 in expectation.

An FDT-agent in (5) goes into (1) with 1% probability and into scenario (2) with 99% probability. It wins 1%·$0 + 99%·$100 = $99 in expectation.
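To make the comparison concrete, here's a minimal sketch that checks the two expectations above. The 99%/1% predictor-accuracy split and the $1/$100/$0 payoffs are taken straight from the numbers in this comment; the function name is just for illustration.

```python
def expected_value(p_branch_1: float, payoff_1: float,
                   p_branch_2: float, payoff_2: float) -> float:
    """Probability-weighted payoff over the two branches of scenario (5)."""
    return p_branch_1 * payoff_1 + p_branch_2 * payoff_2

# EDT-agent: ends up in (1) with 99% probability and takes the $1 there,
# and in (2) with 1% probability, winning $100.
edt = expected_value(0.99, 1, 0.01, 100)   # ≈ 1.99

# FDT-agent: ends up in (1) with only 1% probability and takes $0 there,
# and in (2) with 99% probability, winning $100.
fdt = expected_value(0.01, 0, 0.99, 100)   # ≈ 99.0

print(f"EDT: ${edt:.2f}, FDT: ${fdt:.2f}")
```

The whole disagreement is about which branch probabilities the agent's decision procedure induces, not about the payoff in any single branch.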

IMO, saying that the FDT-agent loses in (1) and is therefore inferior to the EDT-agent is like saying that it's better to choose to roll a die that wins on 6 than a die that wins on 1-5, because that option is better in the case where the die rolls 6.

Under what exact set of alternate rules does the EDT-agent win more in expectation?

FDT outperforms EDT on

4. You are about to observe one of [$1, $100] in a transparent box, but your action set doesn't include any self-modifications or precommitments.

5. You are about to observe one of [$1, $100] in a transparent box, but you don't know about it, and you will learn the rules of this game only when you already see the box.

If the best an EDT-agent can do is precommit to behave like an FDT-agent, or self-modify into an FDT-agent, it's weird to say that EDT is better :)
