Newcomb and Mixed Strategies

In a nice paper in a recent Philosophical Review Alan Hajek argued that Pascal’s argument in the Wager fails because he doesn’t take account of mixed strategies. I’ve been spending too much of today wondering whether the same thing is true in other fields. (Not that I’m entirely convinced by Hajek’s argument, but the response would take another post, and historical research, and that’s for another week.)

For a while I thought mixed strategies could solve some of the problems Andy Egan discusses in his paper on causal decision theory. Maybe they can, but I’m not so sure. For now I just want to discuss what they do to Nick Bostrom’s Meta-Newcomb Problem.

The first thing to say is that it’s hard to say what they’d do, because Bostrom doesn’t say what his predictors do if they predict you’ll use a mixed strategy. I’ll follow Nozick and say that if they predict a mixed strategy, that’s the same as predicting a 2-box choice. Importantly I make this assumption both for Bostrom’s predictor and his Meta-Predictor. But if the “Predictor” is not predicting, but is in fact reacting to your choice (as is a possibility in Bostrom’s game) then I’ll assume that what matters is what choice you make, not how you make it. So choosing 1 box by a mixed strategy will be the same as choosing 1 box by a pure strategy for purposes of what causal consequences it has.

Given those assumptions, it sort of seems that the “best” thing to do in Bostrom’s case is to adopt a mixed strategy with probability e of choosing 2 boxes, for vanishingly small e. That will mean that if the meta-predictor is “right” your choice will cause the predictor to wait until you’ve made your decision, and with probability 1 less a vanishingly small amount, you’ll get the million. (Scare quotes because I’ve had to put an odd interpretation on the MetaPredictor’s prediction to make it make sense as a prediction. But this is just in keeping with the Nozickian assumptions with which I started.)
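The arithmetic behind that claim can be checked with a rough sketch under the assumptions above: the meta-predictor counts any mixed strategy as a two-box prediction, so the Predictor waits and reacts to the actual choice. The dollar amounts ($1,000 in the small box, $1,000,000 in the big one) are the standard Newcomb figures and are assumed here, not stated in the post; the function name is made up for illustration.

```python
# Sketch only: payoffs are the standard Newcomb amounts, assumed for illustration.
SMALL = 1_000        # assumed contents of the always-full box
MILLION = 1_000_000  # "the million" the post refers to


def expected_payoff_mixed(eps: float) -> float:
    """Expected payoff of two-boxing with probability eps.

    Assumes the Predictor moves after the choice and responds to what is
    actually chosen, which is the case the post considers once the
    meta-predictor has treated the mixed strategy as a two-box prediction.
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must be in (0, 1]; eps = 0 is a pure strategy")
    payoff_one_box = MILLION  # Predictor sees one box taken and fills it
    payoff_two_box = SMALL    # Predictor sees both taken; big box stays empty
    return (1 - eps) * payoff_one_box + eps * payoff_two_box


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.0001):
        print(eps, expected_payoff_mixed(eps))
```

As eps shrinks, the expected payoff approaches the million without quite reaching it, which is why the post speaks of a vanishingly small e and not of e = 0.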

Problem solved, at least under one set of assumptions.

Now I had to set up the assumptions about how to deal with mixed strategies in just the right way for this to work. Presumably there are other ways that would be interesting. I’m not interested in games where predictors are assumed to know the outputs of randomising devices used in mixed strategies. That seems too much like backwards causation. But there could be many other assumptions that lead to interesting puzzles.

UPDATE: Be sure to read the many interesting comments below, especially Bob Stalnaker’s very helpful remarks.

11 Replies to “Newcomb and Mixed Strategies”

  1. I don’t buy the assumption. If the Meta-Predictor has an excellent track record at predicting your choices, and can predict that you will adopt a strategy of two-boxing with tiny probability, then the MP should predict that you will one-box. Or should say that you will adopt a mixed strategy; the Predictor has to choose one option or the other by the terms of the game. But the MP can say anything (or not speak at all), and so there’s no reason to think that the MP is forced to say “You will two-box” if in fact you’re carrying out a mixed strategy.

    Put another way: Why is Nozick’s assumption necessary for the Predictor? Suppose that the problem is set up so that the Predictor accurately predicts your actual choice. If you adopt a mixed strategy, the Predictor has to be able to accurately predict the outcome of your randomizer; and that would be metaphysically impossible if it’s a true randomizer. So mixed strategies make the problem’s setup incoherent, unless we adopt Nozick’s assumption for the Predictor.

    But there’s no incoherence built into the setup if we don’t adopt Nozick’s assumption for the MetaPredictor—though we could ask, in the case Bostrom describes, how the MetaPredictor knows you won’t adopt a half-and-half mixed strategy.

  2. I agree there’s a difference between the two cases, but for the MP to have the knowledge Bostrom ascribes, she must do something funky in the case of mixed strategies. Just what it is will determine how hard a puzzle we face.

  3. Of course you’re right that Nick’s original version of his “Meta-Newcomb” problem doesn’t say enough about mixed strategies. (I must ask Nick about this if I see him around later this week.) But presumably, what we should try to do is to make the puzzle as hard as possible for the causal decision theorist.

    So how about this: let’s just regard an outright choice to be a one-boxer as the limiting case of a mixed strategy that assigns a greater probability to one-boxing than to two-boxing (a “one-boxing-favouring strategy” as I’ll call it). In the first case, the Predictor will put a million dollars in box B iff he predicts that you will adopt such a one-boxing-favouring strategy; in the second case, the Predictor will put a million dollars in box B iff he observes you adopting such a one-boxing-favouring strategy. The Meta-Predictor lets you know the following: either you will not adopt any such one-boxing-favouring strategy and the Predictor will make his move after you make your choice; or else you will adopt such a one-boxing-favouring strategy and the Predictor has already made his choice.

    Then if you’re a causal decision theorist, you still seem to be caught in Bostrom’s dilemma. If you think you will not adopt a one-boxing-favouring strategy, then you have reason to think that your choice will causally influence what’s in the boxes, and hence that you ought to adopt a one-boxing-favouring strategy. But if you think you will adopt a one-boxing-favouring strategy, then you should think that your choice will not affect the contents of the boxes, and thus you would be led back to rejecting a one-boxing-favouring strategy; and so on ad infinitum.

  4. I think part of what I’m worried about may depend on who Bostrom’s setup is supposed to apply to. In the straight Newcomb paradox, there’s a game with rules that apply to everyone. That is, for all X (who the Predictor is confronted with), the Predictor will put $1 mill in the box iff it predicts that X will one-box, modulo whatever assumption we make to deal with mixed strategies.

    Now, in Bostrom’s setup, I don’t think it’s part of the game that for all X the MetaPredictor will say anything at all. So, just this once, the MetaPredictor predicts that my choices will conform to its prediction; it may be able to do this because it knows that I won’t choose the mixed strategy. Which in fact is true; I’ll always one-box.

    Of course this doesn’t solve the problem if we’re talking about someone else, who’s considering a mixed strategy. So probably we need to specify some way of treating mixed strategies if we want to make the problem coherent. Ralph seems right that we can specify the problem in such a way that it causes trouble for CDT with mixed strategies.

    (The reason I’m going on in this way is that I have a suspicion that the Newcomb paradox depends on failure to specify certain details, and that specifying the details will make it less problematic. For instance, 2-boxers often tell me that evidential decision theory says you should 1-box even if the Predictor is right only 75% of the time. But this raises the question of how the Predictor can be right so often.

    What if the Predictor tells you: “I’m really good at predicting whether someone is going to hesitate about whether to 1-box or 2-box, or whether they’re just going to go for one choice or the other without even considering the other alternative. If I think they’re going to go straight for one choice, I act appropriately. If I think they’re going to hesitate, I never put the million dollars in.” This is compatible with the Predictor being right 75% of the time; but in this case, EDT says you should 2-box if you’re wondering what to do, since you’ve lost the million no matter what. And even if all I know is that the Predictor is right 75% of the time, I’d think something like this would be the most likely explanation of what’s going on.

    Anyway, I should talk about this on my own blog.)

  5. I agree this doesn’t solve all the problems and the CDT theorist needs to say more. I have a half-baked thought about what the source of the problem is here.

    CDT folks treat active propositions (props about the part of the world we control) very differently from passive propositions (props about parts of the world we don’t). But they don’t have a great story about what to do with mixtures of these kinds, e.g. truth-functional compounds of active and passive. That’s what we need to solve Nick’s puzzle in all its variants, and Andy’s puzzle, and DBM’s puzzle that Andy quotes, and many more besides. And I don’t have much idea about how to do this.

  6. Three small points, and a parable:

    1. It is normally assumed, in Newcomb-like problems, that the predictor is right a large percentage of the time, but not always. In Bostrom’s meta-Newcomb, as I read it, and compute it, there won’t be a problem so long as the meta-predictor is right no more than 99% of the time (and assuming that utility is proportional to money). Both the causal and the evidential decision theorist will prescribe the one-box choice, since even on the hypothesis that one box is chosen, the small probability that the choice will influence the putting of the money in the box will be enough to recommend that choice. Of course one could play with the numbers to change this.

    2. On mixed strategies: the standard way around letting mixed strategies muddy the waters in Newcomb-like choices is to stipulate that if the predictor predicts a mixed strategy choice, she puts no money in the box (or more generally, in variations on the problem, arranges things so that the outcome is bad for the agent).

    3. I am puzzled by the new flurry of interest in variations on the Newcomb problem that result in paradox (for causal decision theory), since such variations were discussed in Nozick’s original paper, and in Gibbard-Harper (the death in Damascus case). Paul Horwich, in Probability and Evidence, used such cases to argue against causal decision theory. Is there something new in the examples currently being discussed that I am missing? (Perhaps what is new is the misconception that, in cases such as Andy Egan’s murder lesion case, causal decision theory gives definite unconditional advice about what to do. In fact, what CDT says, about such cases, is that the agent acts irrationally, given the beliefs she would then have, whatever she does. CDT does not advise one to ignore the evidence that derives from one’s knowledge of what one is intending to choose, or has decided to choose. It would be irrational, in applying any version of decision theory, to ignore relevant evidence.)

    Paradoxical cases are puzzling, but I don’t see that they count against causal decision theory. That theory (like evidential decision theory) tells you what to do, given your utilities and degrees of belief. In the case of causal decision theory, the relevant beliefs are restricted to beliefs about aspects of the state of the world that are beyond your control (to dependency hypotheses, in Lewis’s terminology), but there is no restriction on the evidence that is relevant to beliefs of that kind. If your beliefs about such matters are unstable, oscillating back and forth as you take account of your beliefs about what you intend to do, then the prescriptions of the decision theory may also be unstable. The result may be that whatever you do, it will be irrational, relative to the beliefs about the state of the world beyond your control that you will then have. It is hard to know what to do in such a situation; I think it is a point in favor of a theory that yields no stable prescription in such cases.

    4. An abstract story: You have an advisor – an oracle – who is very wise, and who has an excellent track record, to whom you often go for help with your difficult decision problems. But the oracle worries that you are relying too much on her, and not thinking enough for yourself. And she has a bit of a sense of humor. So one day, when you come to her with a particularly complicated and difficult problem, she says, “I can’t help you with this one, except to give you the following warning. I am 95% sure that you will, in the end, make the wrong choice.” You go away, discouraged, but think hard about all the relevant considerations, and eventually conclude that A, rather than B, is clearly the better of the two available options. But, you think, “I’ve got to trust my oracle, and so if I choose A, I should be 95% sure that it is the wrong choice. So I had better choose B. But wait – that will all the more clearly confirm the oracle’s prediction, since if I choose B, then the oracle’s warning will imply that I was probably right the first time. Still, if I go back to A, I am still stuck with the nagging belief that it is the wrong choice, and isn’t it indeed the wrong choice if I believe, as I would if I made that choice, that it will probably lead to a worse outcome than the alternative?” Mixed strategies to the rescue! You decide to flip a coin – Heads, A is the choice, tails it is B. Now the story has two alternative endings, but let us go with the happy ending: the coin lands heads, action A is taken, and all works out well. “You lucked out,” says the oracle. “You are like the person who drove home from the party drunk, without bothering to buckle his seatbelt, but didn’t get arrested, or crack up the car. But as I expected, you made a stupid choice. What is the point of randomizing between the better and the worse option?”

    So what should you have done? The obvious: think hard, do your best to take account of all the relevant considerations you can think of, and hope that this is one of the 5% of cases where the oracle is wrong. But expect the worst. I think the same kind of boring advice applies in the case of paradoxical Newcomb-like problems. Decide what you believe about things beyond your control, and then maximize expected utility, relative to those beliefs. In the paradoxical cases, you will regret your decision, since on reflection on what you have chosen, your beliefs about the state of the world beyond your control will change, and if your mind is quicker than your hand, they will change even before you have acted. Too bad, but it won’t make things better to adopt a version of decision theory that gives a definite prescription, particularly if the prescriptions it gives are clearly wrong in other cases.
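The threshold mentioned in point 1 of this comment can be checked with a small calculation. This is only a sketch: the comment does not give its numbers, so the code assumes the standard payoffs ($1,000 and $1,000,000), utility linear in money (as the comment does), and a Predictor who, when moving after the choice, responds to it without error. `p_already_moved` stands for the agent's credence, on the hypothesis that she one-boxes, that the Predictor has already moved; all names are illustrative.

```python
# Sketch only: payoffs and the error-free reacting Predictor are assumptions.
SMALL = 1_000
MILLION = 1_000_000


def cdt_advantage_of_one_boxing(p_already_moved: float) -> float:
    """Causal expected gain of one-boxing over two-boxing.

    If the Predictor has already moved, the contents are fixed and taking
    both boxes gains SMALL whatever is in the big box. If the Predictor
    has yet to move, one-boxing yields MILLION and two-boxing yields SMALL.
    """
    if not 0.0 <= p_already_moved <= 1.0:
        raise ValueError("p_already_moved must be a probability")
    gain_if_reacting = MILLION - SMALL
    loss_if_already_moved = SMALL
    return ((1 - p_already_moved) * gain_if_reacting
            - p_already_moved * loss_if_already_moved)


def break_even_credence() -> float:
    """Credence at which the two options tie: (1 - p)(M - S) = p * S."""
    return (MILLION - SMALL) / MILLION


if __name__ == "__main__":
    print(break_even_credence())
    print(cdt_advantage_of_one_boxing(0.99))
```

Under these assumptions the break-even credence is 0.999, so a meta-predictor who is right 99% of the time still leaves one-boxing ahead by roughly $9,000 in causal expected value. That fits the claim in point 1, and, as the comment says, different numbers would move the threshold.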

  7. The reason I think that these cases are trouble for CDT is that CDT does give definite, unconditional advice about what to do, given a certain pattern of credences and preferences, and the advice it gives seems to me to be wrong. Of course it’s true that what CDT tells you to do will change as your credences and preferences change, but I suspect that this is not enough of a defense, for two reasons:

    For one thing, if it’s an acceptable defense of CDT that it will deliver the intuitively correct advice (or an appropriately ambivalent verdict) later, then it looks like evidential decision theory can be rescued from the apparent counterexamples that motivate abandoning it in favor of CDT in the same way.

    For another, it still leaves CDTers acting in ways that seem irrational in cases where the hand is quicker than the mind – where the later changes in belief are unable to avert the action that was chosen, on CDT’s endorsement, under the original pattern of credences and desires.

  8. A couple of things about the parable.

    1. It seems characteristically oracular for the oracle to say, “As I expected, you made a stupid choice”—in that her prediction is fulfilled, but not in the way you might have expected. You don’t make the wrong choice between A and B; you make the wrong choice between how to choose A and B. But if you were smart enough to realize that that counted as fulfilling her prophecy, you would never have been tempted to reason as you did. Also, if you have the suspicion that the oracle is pranking you in an effort to get you to think for yourself, you may be most rational not to believe in her prophecies at all….

    It seems to me that the case that really causes trouble is one in which the oracle’s prophecy is to be interpreted as, “I am 95% sure that if the right choice is A, you will do B, and vice versa.” And for that to have a chance, the oracle had better be 90% sure that you won’t follow the coin-flipping strategy. It seems to me that it would and should be hard to incorporate that information into your decision.

    2. It seems to me that, if you’re in a situation where you have good reason to believe that you’re going to make the wrong decision, you should maximin. Try not to do anything that you won’t be able to fix later. Of course, as Ford Prefect said about the Garden of Eden, if you’re dealing with the sort of person who leaves hats on the pavement with bricks under them waiting for people to kick them, you’re going to be doomed no matter what you do, but in more realistic cases you may be able to hunker down until your reasoning faculties return or you can get constructive advice or something. (I’m not sure if I have a way of putting that in terms of CDT vs. EDT.)

  9. I have been wondering about “unstable” decision situations like the Meta-Newcomb problem and favor the response that Stalnaker gives above (Decide what you believe about things beyond your control, and then maximize expected utility, relative to those beliefs.) But I wonder why he says that CDT calls this an irrational choice. It is true that you would bet that you probably made the wrong decision in cases like the oracle one, but I don’t see a belief like that necessarily meaning that you did anything irrational.

    Imagine a different scenario. There are two boxes, green and red, and you get to choose one of them and keep its contents. A predictor who doesn’t like you says that he will put $100 in the box that he thinks you won’t pick. No matter his decision, you get an additional $10 just for choosing the green box. Now imagine that you believe this predictor to be highly reliable (say 99% no matter your choice). It seems clear to me that the rational thing to do is to take the green box and expect to get a total of $10. After I made my choice but before I looked in the box, if you asked me to bet on where the $100 is, I would bet that it is in the red box. But I don’t see how that makes my choice of the green box irrational. Given the beliefs that I have about the predictor, I would expect to make even less ($0) if I had picked the red box, so why would that be the rational decision?
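The expected values in this example can be worked out directly. A minimal sketch, taking the stated figures ($100 prize, $10 bonus for green, 99% reliability whichever box is chosen) at face value and conditioning on the choice made; the function name is illustrative.

```python
# Sketch of the green/red example; figures are the ones stated in the comment.
PRIZE = 100
GREEN_BONUS = 10


def expected_winnings(choice: str, reliability: float = 0.99) -> float:
    """Expected winnings conditional on choosing the given box."""
    if choice not in ("green", "red"):
        raise ValueError("choice must be 'green' or 'red'")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be a probability")
    # The predictor puts the prize in the box he thinks you won't pick,
    # so you find it only when he has mispredicted your choice.
    p_prize = 1 - reliability
    bonus = GREEN_BONUS if choice == "green" else 0
    return p_prize * PRIZE + bonus


if __name__ == "__main__":
    for box in ("green", "red"):
        print(box, expected_winnings(box))
```

At 99% reliability this comes to roughly $11 for green and $1 for red; the $10 and $0 quoted in the example are the limiting case of a perfectly reliable predictor. Either way green comes out $10 ahead.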

    To me, mixed decisions (like flip a coin and choose green iff heads comes up) don’t really seem to solve the problem. Believing that the predictor wouldn’t be accurate in this scenario simply seems to deny the premise that the predictor was reliable in the first place. Maybe the problem is just underspecified, but I think to get to the heart of the theory, we can’t rely on mixed strategies.

    An additional problem with mixed strategies is that they bring up the problem of whether or not you should follow your randomizing strategy once you flip the coin. It seems to me that after flipping the coin you still have the choice of deciding to take the green box or the red box. Perhaps you think this is now simply a matter of deciding to follow your previous randomizing strategy once you see the results or not. But you still have this choice. Of course then maybe the right thing to do is to have somebody else choosing for you, etc.

    One thing that does bother me about my own answer is that we can be in real-life scenarios where we might think we make the wrong choice but where there is no reason to hold fixed our belief about the reliability of the source. Rock, scissors, paper where you seem to be losing more often might provide some instances of this. Is there ever a reason to alter your first instinct? Is it always best to simply attempt to randomize your choice? I think these kinds of situations might call for some revising of plans given what your previous plans were, etc.
