Newcomb’s Centipede

The following puzzle is a cross between the Newcomb puzzle and the centipede game.

You have to pick a number between 1 and 50; call that number u. A demon, who is exceptionally good at predictions, will try to predict what you pick, and will pick a number d, between 1 and 50, that is 1 less than her prediction of u. If she predicts u is 1, then she can’t do this, so she’ll pick 1 as well. The demon’s choice is made before your choice is made, but only revealed after your choice is made. (If the demon predicts that you’ll use a mixed strategy to choose u, she’ll set d equal to 1 less than the lowest number that you have some probability of choosing.)

Depending on what numbers the two of you pick, you’ll get a reward by the formula below.

If u is less than or equal to d, your reward will be 2u.
If u is greater than d, your reward will be 2d – 1.
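The payoff rule is simple enough to state as a one-line function; here is a sketch (the function name is mine):

```python
def payoff(u, d):
    """Your reward, given your pick u and the demon's pick d."""
    return 2 * u if u <= d else 2 * d - 1

# Matching the demon pays best: payoff(49, 49) == 98,
# while overshooting by one gets payoff(50, 49) == 97.
```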

For an evidential decision theorist, it’s clear enough what you should do. Almost certainly, your payout will be 2u – 3, so you should maximise u: pick 50, for a payout of 97.

For a causal decision theorist, it’s clear enough what you should not do. We know that the demon won’t pick 50. If the demon won’t pick 50, then picking 49 has a return that’s better than picking 50 if d = 49, and as good as picking 50 in all other circumstances. So picking 49 weakly dominates picking 50, so 50 shouldn’t be picked.

Now the interesting question: what should you pick if you’re a causal decision theorist? I know of three arguments that you should pick 1, but none of them seems completely convincing.

Backwards Induction
The demon knows you’re a causal decision theorist. So the demon knows that you won’t pick 50. So the demon won’t pick 49; she’ll pick at most 48. If it is given that the demon will pick at most 48, then picking 48 dominates picking 49. So you should pick at most 48. But the demon knows this, so she’ll pick at most 47, and given that, picking 47 dominates picking 48. Repeating this pattern several times gives us an argument for picking 1.
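The induction can be mechanised. This sketch (the setup is mine) eliminates the highest surviving option at each round, checking that the next option down weakly dominates it once the demon is known to pick below the current top:

```python
def payoff(u, d):
    return 2 * u if u <= d else 2 * d - 1

def backward_induction(n=50):
    options = list(range(1, n + 1))
    while len(options) > 1:
        top = max(options)
        d_max = top - 1  # the demon won't pick the current top
        # top - 1 does at least as well as top against every remaining d...
        assert all(payoff(top - 1, d) >= payoff(top, d)
                   for d in range(1, d_max + 1))
        # ...and strictly better when d = top - 1
        assert payoff(top - 1, d_max) > payoff(top, d_max)
        options.remove(top)
    return options[0]
```

Run on the full game, the elimination bottoms out at 1.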

I’m suspicious of this because it’s similar to the bad backwards induction arguments that have been criticised effectively by Stalnaker, and by Pettit & Sugden. But it’s not quite the same as the arguments that they criticised, and perhaps it is successful.

Two Kinds of Conditionals
In his very interesting The Ethics of Morphing, Caspar Hare appears to suggest that causal decision theorists should be sympathetic to something like the following principle. (Caspar stays neutral between evidential and causal decision theory, so it isn’t his principle. And the principle might be slightly stronger than even what he attributes to the causal decision theorist, since I’m not sure the translation from his lingo to mine is entirely accurate. Be that as it may, this idea was inspired by what he said, so I wanted to note the credit.)

Say an option is unhappy if, supposing you’ll take it, there is another option that would have been better to take, and an option is happy if, supposing you take it, it would have been worse to have taken other options. Then if one option is happy, and the others all unhappy, you should take the happy option.

Every option but picking 1 is unhappy. Supposing you pick n, for n greater than 1, the demon will pick n – 1, and given that, you would have been better off picking n – 1. But picking 1 is happy. Supposing you pick 1, the demon will pick 1, and you would have been worse off picking anything else.

There’s something to the pick-happy-options principle, so this argument is somewhat attractive. But this does seem like a bad consequence of the principle.

Stable Probability
In Lewis’s version of causal decision theory, we have to look at the probability of various counterfactuals of the form ‘If I were to pick n, I would get k dollars’. But we aren’t really told where those probabilities come from. In the Newcomb problem that doesn’t matter; whatever probabilities we assign, two-boxing comes out best. But the probabilities matter a lot here.

Now it isn’t clear what constrains the probabilities in question, but I think the following sounds like a sensible constraint. If you pick n, the probability the demon picks n-1 (or n if n=1) should be very high. That’s relevant, because the counterfactuals in question (what would I have got had I picked something else) are determined by what the demon picks.

Here’s a constraint that seems plausible. Say an option is Lewis-stable if, conditional on your picking it, it has the highest “causally expected utility”. (“Causally expected utility” is my term for the value that Lewis thinks we should try to maximise.) Then the constraint is that if there’s exactly one Lewis-stable option, you should pick it.

Again, it isn’t too hard to see that only 1 is Lewis-stable. So you should pick it.
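To check the claim, we can model ‘conditional on picking n’ as near-certainty that the demon picks max(n − 1, 1); that modelling choice is mine, but it follows the constraint above:

```python
def payoff(u, d):
    return 2 * u if u <= d else 2 * d - 1

def lewis_stable(n, top=50):
    d = max(n - 1, 1)  # the demon's pick, conditional on your picking n
    # n is Lewis-stable iff no alternative does better against that pick
    return all(payoff(n, d) >= payoff(m, d) for m in range(1, top + 1))

# Only 1 survives: for any n > 1, picking n - 1 beats n by a dollar.
```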

It seems intuitively wrong to me to pick 1. It doesn’t dominate the other options. Indeed, unless the demon picks 1, it is the worst option of all. And I like causal decision theory. So I’d like a good argument that the causal decision theorist should pick something other than 1. But I’m worried (a) that causal decision theory recommends taking 1, and (b) that if that isn’t true, it makes no recommendation at all. I’m not sure either is a particularly happy result.

10 Replies to “Newcomb’s Centipede”

  1. I think the latter two arguments fall victim to a case like the one that Anil Gupta raised against similar proposals by Andy Egan. Consider the following game:

    You have three choices: 1, 48, and 49. The demon has three choices: 1, 48, and 49. If you and the demon choose the same number, you get the sum of your choices; if you choose numbers that are one different from each other, you get the lower number; if you choose numbers that are not within one of each other, you get nothing. If the numbers you choose are one different from each other, the demon gets 100; if they’re the same, the demon gets the sum; if they’re not within 1, the demon gets nothing.

    That is to say, if the demon predicts you’ll pick 48 or 49, it’ll pick the other, and you’ll get 48. If the demon predicts you’ll pick 1, it’ll pick 1, and you’ll get 2. If you picked something else while the demon picked 1, you’d get 0. If you picked the same number as the demon in the high 40s, you get something in the 90s. If you pick 1 while the demon picks in the 40s, you get 0.

    Now, it seems to me (though I may have misunderstood the setup) that 48 and 49 are unhappy and Lewis-unstable. If you pick 48, you think that the demon will pick 49, and you’d have been better off picking 49; and vice versa. Picking 1 is happy and Lewis-stable; you think the demon will pick 1, and that means you get 2, while picking in the 40s would get you 0. But even a causal decision theorist shouldn’t pick 1. So causal decision theorists should reject the principles about happiness and Lewis-stability.

    For a slightly more rigorous version of the argument that the causal decision theorist shouldn’t pick 1, consider the version of the game in which the demon’s payouts are unchanged, but if both picks are in the 40s then you just get 40. Now picking in the 40s is happy and Lewis-stable, and I think (?) that the causal decision theorist is OK saying that you should pick in the 40s. (It doesn’t matter which number in the 40s you pick.) But the only difference between this game and the original one is that in the original one we increase the payouts for picking in the 40s. So if picking the 40s is rational in the revised game, it should be rational in the original game.
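    A quick script makes the claims about this game checkable (the demon model — prefer the off-by-one pick, which pays her 100, else match — is my reading of the setup):

    ```python
    def gupta_payoff(u, d):
        # same number -> the sum; one apart -> the lower number; else nothing
        if u == d:
            return u + d
        if abs(u - d) == 1:
            return min(u, d)
        return 0

    def demon_pick(predicted_u):
        # the demon prefers being one off (she gets 100), else matching (the sum)
        neighbours = [d for d in (1, 48, 49) if abs(d - predicted_u) == 1]
        return neighbours[0] if neighbours else predicted_u

    def stable(u):
        d = demon_pick(u)
        return all(gupta_payoff(u, d) >= gupta_payoff(m, d) for m in (1, 48, 49))

    # Only 1 is stable: pick 48 and the demon picks 49, so 49 would have paid 98.
    ```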

  2. …if I’m reading Caspar Hare correctly, he doesn’t propose your happy/unhappy principle for the causal decision theorist. What he proposes is that if A would be better than B if you actually pick B, and B would be worse than A if you actually pick A, then you shouldn’t pick B.

    Your principle is a weaker version of this: if there is some C that would be better than A if you actually pick A, and some B such that there is no D that would be better than B if you actually pick B, then don’t pick A. The difference is that in your principle, there’s no guarantee that it’s B that’s better than A if you actually pick A; there could be some other witness.

    On Hare’s principle (if I’ve read it correctly), my example doesn’t go through; 1 isn’t better than 48 if you pick 48 (it’s 49 that’s better if you pick 48).

    You could try to turn Hare’s principle into an argument for 1 like this: Compare 50 and 49 and knock out 50; compare 49 and 48 and knock out 49; etc. But it won’t work as written, I think, because if you actually choose 49 then 50 won’t yield a worse result than 49, it’ll yield the same result (demon picks 48, you get 95 whether you pick 49 or 50). You could fix this by turning Hare’s principle into a sort of weak dominance thing; I’m not sure whether the causal decision theorist would find that new principle attractive.

  3. …actually (apologies for the triple post, I’m thinking as I post), the principle I proposed isn’t Hare’s either. In the relationship I described between A and B in my first paragraph, A is pairwise superior to B. He says (roughly) that if there’s a Z such that nothing is pairwise superior to Z, and if moving repeatedly to pairwise superior options always gets you to Z no matter what your starting point is and no matter how you do it, pick Z. That last part is necessary to avoid path dependence.

    But this, combined with the weak dominance modification of pairwise superiority, doesn’t give an argument for picking 1. Because once you get down to 48, 50 turns out to be weakly pairwise superior to 48:

    If you actually pick 48 then picking 50 would’ve been just as good (demon picks 47, either way you get 93).

    If you actually pick 50, then picking 48 would be inferior (demon picks 49, if you pick 50 you get 97, if you pick 48 you get 96).

    The point being that just because B is [weakly] pairwise superior to A doesn’t mean that you should eliminate A from consideration completely, because [weak] pairwise superiority isn’t transitive, so there’s the risk of path dependence.

    OK, my final verdict for now:
    Hare’s actual principles don’t provide an argument for picking 1;
    the principle in the first paragraph of my second comment is bad (at least in the weak dominance version), because it can eliminate all options;
    the happy/unhappy principle you formulated seems to fall victim to the Gupta case, as in the first comment.

  4. Hi Matt

    I think I agree that the happy/unhappy principle does fall victim to the Gupta example, so I’m glad that argument is off the table!

    You’re also right that I’m using weak, not strong, dominance in the Hare-inspired arguments. I don’t think that’s a bad thing actually – I don’t really see any motivation for accepting any strong dominance principle in the neighbourhood that wouldn’t carry across to weak dominance.

    But the more interesting thing is the path dependence argument. I think there’s still a way to motivate picking 1.

    By weak dominance, picking 49 is preferable to picking 50. Again by weak dominance, picking 48 is preferable to picking 49. Now there isn’t a weak dominance argument for picking 48 over picking 50. But there doesn’t have to be. Because given what we’ve already derived, we can just use the transitivity of preferability to say that picking 48 is preferable to picking 50. So weak dominance alone won’t get us to 1, but weak dominance plus transitivity might.

  5. “Now there isn’t a weak dominance argument for picking 48 over picking 50. But there doesn’t have to be. Because given what we’ve already derived, we can just use the transitivity of preferability to say that picking 48 is preferable to picking 50.”

    But there’s actually a cycle, isn’t there? 50 weakly dominates 48 (in the sense we’re talking about). I was very unclear about this in comment 3 — what I meant was not that weak pairwise superiority isn’t transitive, but that it leads to cycles; the ancestral of the relation isn’t irreflexive. Which means we can’t move straight from weak pairwise superiority to preferability.

  6. Hi, great puzzle!

    Wlodek Rabinowicz has also argued against ratifiability and stability constraints such as the ones you use in arguments 2 and 3. At any rate, I think we shouldn’t take them to be part of Causal Decision Theory. As far as I remember, having maximal causally expected utility and being ratifiable (or stable) are independent properties: all four combinations are possible. Hence adding these constraints to CDT would actually mean revising CDT.

    What CDT without any added constraints recommends in the puzzle depends on your credence in what you were predicted to do. For instance, if you give credence 0.5 to both d=49 and d=48, then your best choice is u=49 with an expected utility of 96.5, compared to 96.0 for both u=48 and u=50, and even less for all other choices.

    Any argument that you should not pick 49 must therefore be an argument that you should not give credence 0.5 to d=49 and d=48. And any argument that you should pick 1 must be an argument that you should assign credence > 2/3 to d=1.

    We get such an argument from the dynamics of rational deliberation (which is probably just a more elaborate form of your argument 1). Suppose you begin your deliberation in a certain state of indecision — i.e. with a certain probability distribution over your options. This determines a probability distribution over the predictor’s choice of d. From that, you can calculate the expected utility of each option. The result should cause a redistribution in your probabilities. In particular, if you realise that an option you gave rather low credence has greater expected utility than options you gave high credence, you should become more certain that you will choose the option with the greater expected utility. The new probabilities will determine new utilities, causing further revisions, etc. Under plausible assumptions for the redistribution rule (if the rule does what Skyrms calls ‘seeking the good’), it turns out that no matter what distribution you start with, you end up being certain that you will pick u=1.

    (I didn’t fully prove this, but I ran a little script to try out a few initial distributions. Interestingly, the predictor’s accuracy has to be very high; 99.9% is not enough.)
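    The commenter’s script isn’t shown, but something in its spirit can be sketched as follows (the demon model, the ‘seek the good’ update rule, and all parameters are my guesses, not the original code):

    ```python
    def payoff(u, d):
        return 2 * u if u <= d else 2 * d - 1

    OPTIONS = range(1, 51)

    def causal_eu(u, cred):
        # cred maps the demon's possible pick d to your credence in it
        return sum(p * payoff(u, d) for d, p in cred.items())

    def demon_credence(p, accuracy):
        # With probability `accuracy` the demon tracks your current mixed
        # state and picks max(u - 1, 1); otherwise she is modelled as
        # picking uniformly at random.
        cred = {d: (1 - accuracy) / 50 for d in OPTIONS}
        for u in OPTIONS:
            cred[max(u - 1, 1)] += accuracy * p[u]
        return cred

    def deliberate(accuracy=0.99999, step=0.1, rounds=1000):
        p = {u: 1 / 50 for u in OPTIONS}  # initial state of indecision
        for _ in range(rounds):
            cred = demon_credence(p, accuracy)
            eu = {u: causal_eu(u, cred) for u in OPTIONS}
            avg = sum(p[u] * eu[u] for u in OPTIONS)
            # 'seek the good': shift weight towards above-average options
            gain = {u: max(eu[u] - avg, 0) for u in OPTIONS}
            total = sum(gain.values())
            p = {u: (p[u] + step * gain[u]) / (1 + step * total)
                 for u in OPTIONS}
        return p
    ```

    With credence 0.5 on each of d = 48 and d = 49, causal_eu reproduces the figures above: 96.5 for u = 49 and 96.0 for both u = 48 and u = 50.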

    So it looks like 1) certain principles of rational deliberation together with CDT recommend that you assign credence > 2/3 to d=1; and 2) given these credences, CDT recommends choosing u=1.

    With respect to (2), I think CDT has it right: if you’re certain the predictor has chosen d=1, you really should choose u=1. If we think there’s something wrong about u=1, we’ll have to block step (1).

    As far as I can see, the only way to do this is to reject the constraint on deliberations: sometimes it is okay to remain confident that you will do something even when you realize that something else would have higher expected utility.
    For instance, it is okay to remain confident that you will choose 49 even if you realize (based on this confidence) that 48 has a higher expected payoff (96 vs. 95). If you can rationally retain this confidence until the end of deliberation and you follow CDT, you will then choose 48.

    An alternative diagnosis might be that your setup punishes reflective rationality. The norms of rational deliberation are epistemic norms, so maybe we should expect there to be cases where following these norms will have bad practical outcomes.

    BTW, I thought Lewis introduced dependency hypotheses in part to avoid Gibbard and Harper’s assignment of probabilities to counterfactuals.

  7. @Matt,

    You’re right of course that weak dominance, as I was using it, leads to violations of transitivity. That’s a reason not to use weak dominance as grounds for preference, obviously.

    I wonder if tweaking the example a little might make the case for 1 a little stronger (while at the same time making the intuition that you shouldn’t pick 1 a little weaker).

    Say that the payouts are in dollars and cents. The dollar payouts are described as above. But if you pick n, you have to pay n cents. So if you pick 47 and the demon picks 45, you get $89 (as described above) less 47 cents, for a total of $88.53.

    Now if A is one less than B, then picking A strongly dominates picking B. That is, supposing you pick A, it would be better to pick A than B. (Either way the demon will pick A – 1, and you’ll start with 2A – 3, but picking A will cost you a penny less in penalty.) And supposing you pick B, it would have been better to pick A, because now you’ll get an extra dollar.

    If A is two (or more) less than B, then neither option dominates the other. Supposing you pick A, then picking B would have just led to a higher penalty.

    So now strong dominance plus transitivity gives you that 1 is the preferable option. But the example is now rather complicated, and it might not be that this is the best test of a theory of decision.
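    The dominance claim is easy to verify directly (payoffs in dollars; the helper names are mine):

    ```python
    def payoff(u, d):
        base = 2 * u if u <= d else 2 * d - 1
        return base - u / 100  # the n-cent penalty for picking n

    def strongly_dominates(a, b):
        # a beats b both supposing you pick a and supposing you pick b,
        # with the demon picking one below whichever you actually pick
        da, db = max(a - 1, 1), max(b - 1, 1)
        return payoff(a, da) > payoff(b, da) and payoff(a, db) > payoff(b, db)

    # strongly_dominates(a, a + 1) holds for every a, but once the gap is
    # two or more, neither option dominates the other.
    ```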


    I agree entirely that CDT has it right about step (2) in your argument. If you’re confident that you’ll pick 1, then you should pick 1, because anything else would leave you worse off. But I’ll need to sit down with a notepad (or a good script!) before I have any firm opinions about (1). I think that’s a very helpful way to think about the problem though…

  8. Brian,
    I’m pretty sure that strong pairwise dominance alone won’t be enough for a decision criterion — there are enough payouts to play with that we ought to be able to create a game where it produces cycles too. But I’m also pretty sure that in your new game, 1 is a stable attractor in Hare’s sense, so we do have a Hare-y argument that CDT leads to picking 1 in this game.

    I don’t really find picking 1 any more appealing in this game, but I’m mostly a one-boxer anyway, so I might not be the best person to ask.
