I was reading Andrew Bacon’s paper A Paradox for Infinite Decision Makers, and while I agreed with a lot of the conclusions, I didn’t agree with one of the arguments. The argument concerns this case,

For each n ∈ ω, at 1/n hours past 12 pm Alice and Bob will play a round of the game. A round involves two moves: firstly Alice chooses either 1 or 0, and then Bob makes a similar choice. The moves are made in that order, and both players hear each choice. Alice wins the round if Bob’s choice is the same as hers, and Bob wins if his choice is different. The game finishes at 1 pm, Alice wins the game if she wins at least one round, Bob wins the game if he wins every round.

As Andrew notes, it seems that Bob should win. At every stage, he waits for Alice’s move, and then makes a different move. But he claims Alice has a winning strategy, as follows.

There are various ways that Bob could play throughout a whole game, but any way he plays can be encoded as an ω-sequence of 1’s and 0’s, where the nth term in the sequence represents how he responds in the nth round. Before the game starts, Alice chooses her strategy as follows. Alice divides these sequences into equivalence classes according to whether they differ by finitely many moves at most. With the help of the Axiom of Choice, Alice then picks a representative from each equivalence class and memorises it. At any point after the game has started, Alice will know what moves Bob has made at infinitely many of the rounds, and will only be ignorant of the moves Bob is yet to play, of which there are only finitely many. Thus, at any point after 12 pm, Alice will know to which equivalence class the sequence of moves Bob will eventually make belongs. Her strategy at each round, then, is to play how the representative sequence for this equivalence class predicts Bob will play at that round. If the representative sequence is right about Bob’s move at that round, Alice will win that round. However, the representative sequence and the sequence that represents how Bob actually played, must be in the same equivalence class: they must be the same at all but finitely many rounds. If Alice played according to the representative sequence at every round, then she will have won all but finitely many of the rounds, meaning that she has won the game.

I think this can’t be right for two reasons.

First, I think we can make the intuitive argument that Bob will win slightly more rigorous. (This matters because intuitive arguments are more-or-less counter-indicative of truth when it comes to reasoning about infinities.) Assume that Bob plays the strategy “Wait and see what move Alice makes, then make the opposite move.” Let F(n) be the proposition that Bob wins the round after which there are n more rounds to play. Then F(0) is clearly true – Bob will win the last round. And for arbitrary k, we can prove F(k) → F(k+1). That’s because we can prove F(k+1), and then derive F(k) → F(k+1) by ∨-introduction! Then by mathematical induction, we can infer ∀x F(x).

Second, assuming that Alice’s strategy here works seems to lead to a reductio. Assume that Bob doesn’t play the “Wait and see what Alice does” strategy. Instead he plays a mirror image of Alice’s strategy. That is, he finds a representative strategy of each member of the set of equivalence classes as Andrew defines them. He then notes, as he can do at each stage, which equivalence class Alice’s play is in. He then plays the opposite move to the representative strategy. If Andrew’s reasoning is correct, then an exactly parallel argument should prove that Bob wins all but finitely many of the rounds. But it’s impossible for each of Alice and Bob to win all but finitely many of the rounds.

So something has gone wrong in the reasoning here. I think, though I’m not sure about this, that the strategy Andrew suggests for Alice will not have any distinctive advantages. Here’s the way I picture the situation. At any move, Alice will already have lost infinitely many games, or she will not have. If she has, the strategy can’t stop her losing infinitely many games, which is its main supposed advantage. If she has not, then it doesn’t matter what strategy she plays, she still won’t end up losing infinitely many games. So playing this strategy doesn’t help either way.

But I’m not at all sure about this diagnosis – it’s definitely a good case to think about, as are all the other points that are raised in the paper.

## 24 Replies to “Andrew Bacon on Supertasks”

1. Hi Brian. Thanks for taking an interest in my paper!

I think I agree with both your arguments here – in fact I endorsed them in the paper – but I don’t see yet why this means the reasoning must have gone wrong.

Call the “make the opposite move” strategy, strategy 1, and the other one (Alice’s strategy) strategy 2. What you’ve shown is that for any play of the game:

(1) if Bob follows strategy 1 (or indeed the strategy 2 as you point out), Bob will win.

(2) If Alice follows strategy 2, Bob won’t win.

But the conclusion I drew from (1) and (2) in the paper was that in any play of the game either Bob fails to follow strategy 1 at some point, or Alice fails to follow strategy 2 at some point. This situation is strange but not inconsistent as far as I can see. (That said, I think the strangeness is present in the simpler second puzzle as well, if that helps at all.)

2. What I was resisting was the argument for (2). I think the following two claims stand and fall together.

(2a) If Alice follows strategy 2, she’ll lose at most finitely many rounds.
(2b) If Bob follows strategy 2, he’ll lose at most finitely many rounds.

I think (3) and (4) are also both true.

(3) It can’t be the case that Alice and Bob both lose at most finitely many rounds.
(4) If Alice can follow strategy 2, so can Bob.

But from (2a), (3), (4) and (2a) → (2b), we can conclude that Alice can’t follow strategy 2. So either (2a) is false, or (I think) it is true in a way that doesn’t support the larger conclusions of the paper.

Do you agree?

3. Ah I see. So what I was trying to say is that 2a, b, 3 and 4 don’t entail that Alice can’t follow strategy 2. There are worlds where Alice follows strategy 2 and worlds where Bob follows strategy 2, so (4) is certainly true. However since Alice (Bob) wins all but finitely many rounds in worlds where they follow strategy 2 none of these worlds are worlds in which they both follow strategy 2.

I’m guessing the sense in which you think (2a) is false doesn’t have to do with failures of the axiom of choice but with my translating the existence of that function into English as ‘Alice has a strategy that guarantees she’ll win’?

4. I was thinking the sense in which (2a) is false is that by the time Alice can play the strategy, she’s already lost infinitely many games. I think the proof shows that at any time, if Alice has lost finitely many games at that time, there’s a strategy that guarantees she’ll only lose finitely many games. I don’t think it shows she’ll only lose finitely many games full stop.

5. “I was thinking the sense in which (2a) is false is that by the time Alice can play the strategy, she’s already lost infinitely many games.”

Can you actually give me a case where the antecedent of (2a) is true and the consequent false?

The antecedent is just “Alice follows strategy 2 at every round”: as far as I can see it doesn’t matter if, for whatever reason, Alice only starts following the strategy towards the end of the game. The antecedent of (2a) is then false and, thus, (2a) is true. That said, I don’t see any reason to think Alice couldn’t have followed the strategy at every round: I would have thought that for any sequence of 1’s and 0’s there’s a possible play of the game where that sequence corresponds to Alice and Bob’s moves.

“I think the proof shows that at any time, if Alice has lost finitely many games at that time, there’s a strategy that guarantees she’ll only lose finitely many games.”

Well that claim is trivial. The strategy I described is chosen before 12pm and only works if it is played at every round, not just at a final segment.

“I don’t think it shows she’ll only lose finitely many games full stop.”

Maybe you could point out which step of the argument you disagree with then:

Suppose Alice follows strategy 2 at every round.

So at every round, n, Alice chose the nth move from the representative sequence of the equivalence class containing Bob’s actual play (which can be worked out only from knowing Bob’s play so far: at every round she already knows infinitely many of Bob’s plays and is only ignorant about finitely many of them.)

So (by induction if you like) Alice’s sequence is in the same equivalence class as Bob’s.

Therefore Alice has won all but finitely many rounds of the game.

(2a) follows by conditional proof.

6. I’m not sure there’s a possible world in which the antecedent of (2a) is true, so I’m not sure that there’s a possible world in which (2a)‘s antecedent is true and consequent false.

Here’s one reason for thinking the antecedent of (2a) is impossible, and so (2a) is at best trivially true.

Assume that Bob chooses 1 by saying ‘0’, and chooses 0 by saying ‘1’. (That is, assume Bob speaking a variant of English.) Now assume that Bob is not a person, but an echo, and Alice is playing the game in a valley. Then for Alice to play strategy 2 consistently, at any time she’ll have to have played at most finitely many moves that are different to Bob’s moves, i.e., that are the same as her own moves. That’s impossible, so strategy 2 is inconsistent.

Then (2a) would be true, but not in any interesting way I think. Or have I missed something about the broader dialectic here?

7. If you somehow build it into the story that Bob follows the echo strategy then, yes, following Alice’s strategy is impossible (conditional on that story.) If I somehow built it into the story that Alice follows the representative sequence strategy then it’s conditionally impossible that Bob follows the echo strategy (in those worlds the cliff must echo weird things back.) For example, what if I built it into the physics of my story that cliff’s always echo back the representative number rather than repeating the number? I don’t see an asymmetry here other than that one description is more complicated than the other.

Here’s a mirror argument that Alice can follow the strategy. Consider the sequence of all 1’s and its equivalence class. Suppose Alice chooses as its representative the all 1’s sequence. Alice now faces the valley and says 1 at every round, and gets a 1 echo. Assuming “1” means 1, Alice will have successfully followed the strategy and won every round. It seems to me that if your world is ok so is mine, and my world is a world where Alice has followed the strategy.

8. Ok maybe that was a bit fast: for Alice to count as following the strategy she also has to satisfy the various counterfactuals: “Alice would have followed the strategy if sequence t had been played.”

To get a model for that just order those worlds/round pairs in which Alice has follows the representative sequence at the next round as closer than those at which Bob follows the echo strategy.

9. I should stress, though, that the truth of (1) and (2) are sufficient for the larger conclusions of the paper, even if (2) is trivially true (which was the only point I was trying to make earlier.) The local conclusion being: that in any play of the game either Bob fails to follow strategy 1 at some point, or Alice fails to follow strategy 2 at some point.

If I concede that it’s not the first disjunct we still get the following: there is some round, r, at which Alice fails to follow the second strategy at r. Clearly Alice could have followed the strategy at round r: she could have said 1 instead of 0 (or 0 instead of 1 dependingly). We may also assume she wanted to follow the strategy at round r (we may suppose she is rewarded for following the strategy.) So it follows that Alice acted irrationally at round r, even though it’s presumably not rationally required that she follow the strategy at every round since that’s impossible (at least, if I concede that Bob’s an infallible echo machine.)

10. Yarrow says:

It seems to me that the second puzzle at least doesn’t quite match this: “A perfectly rational agent must also be disposed to act rationally in the sense that, if he were offered a given decision problem, he would respond in a way that maximised his expected utility.”

If Alice wants to maximize the cardinality of the chocolates she receives, any strategy of responding with 0 infinitely often and 1 infinitely often will work give her a countably infinite number of chocolates.

If on the other hand she is interested in the relative frequency of chocolate rounds versus no-chocolate rounds, there is no best strategy: choosing 1 every other round is inferior to choosing 1 only every fourth round, which is inferior to choosing 1 only every ten thousand rounds, which is inferior to …

The fact that for any finite n and any of the strategies above we could add the modification “and pick 0 from n down to 1” doesn’t seem to change anything: Alice will always get countably infinitely many chocolates, and the relative frequency of chocolate versus no-chocolate rounds will remain unchanged.

11. I think I see the argument a bit better now after post 9 – though Yarrow’s comment is making me worry a bit.

But I’m not sure that the second paragraph of post 9 follows from the first. We need to do more to set up why it is locally rational for Alice to follow strategy 2 at each round. And I think this will be hard. If you make it that Alice is rewarded for always following the strategy, then there’s no incentive for her to follow it at any point, since she knows she won’t get the reward. If she’s rewarded for each round for which she follows it, then it’s not clear that she won’t get infinitely many rewards whatever she does at a given round.

By the way, here’s another argument that Alice’s strategy is impossible. Emma is playing the following variant of the Alice/Bob game. At each time when Alice says a number, Emma also says a number. She wins the round if she says a different number to what she, i.e. Emma actually says, and loses if she says the same number as she actually says. So she always loses. But it looks like we can give the same argument, using strategy 2, for the possibility of Emma winning as we can for the possibility of Alice winning. Since Emma can’t win, that suggests something is wrong with that argument.

12. Hi Yarrow,

The set up was: at each round Alice can either follow the rule or not. If she follows she gets a chocolate which she eats straight away (and begins the next round with no chocolates again.) At no point does she end up with infinitely many chocolates. (This is why I used chocolate instead of money.)

With that in mind at the nth round (from the end) she is faced with a simple decision problem: get a chocolate or get nothing. It seems obvious that she should to follow the rule at round n. Since n was arbitrary, at each n she ought to follow the rule. But then, by Yablo style reasoning, there is some round where she didn’t do what she should have.

(BTW I don’t think it’s right to think of this as a single decision problem – where you choose a single strategy that covers all eventualities at the beginning and then sit back and watch it play out. See in particular the Elga, Arntzenious, Hawthorne paper on similar issues: http://philsci-archive.pitt.edu/archive/00001595/ )

13. Hi Brian,

“If she’s rewarded for each round for which she follows it, then it’s not clear that she won’t get infinitely many rewards whatever she does at a given round.”

At the time I didn’t think the problem of accumulating infinite utilities was a problem the way I set it up (see my reply to Yarrow. Actually I notice I should have said Alice get’s a reward for following the strategy puzzle 1 as well as 2.)

You could run a variant of the puzzle where you have countably many people. The nth person has exactly one chance to follow the rule and win a pound at 1/n hrs past 12. There’s an infinite amount of money being thrown about, but each person can only make 1 pound. Of course it would then be like the Alice and Bob case where one of them is going to have to fail to follow the rule, but logic doesn’t dictate which.

14. Regarding the Emma thing. Here’s how I’m thinking about it: there are a bunch of possible worlds representing different plays of the game. In fact, I’m taking it that for any pair of sequences of 1’s and 0’s there’s a world where that sequence represents the moves that Alice and Bob makes.

Now if we restrict ourselves to a certain class of worlds – say, worlds where Bob reply is always the opposite of Alices – then Alice “can’t” follow her strategy (with my modality restricted to that class of worlds.) You’ve listed a few subtle ways of restricting our attention to that class of worlds, e.g. by having Bob be an echo.

But there are also worlds, i.e. pairs of sequences of 1’s and 0’s, where Alice follows her strategy (I described one in post 7.)

By the way I don’t think this follows:

“But it looks like we can give the same argument, using strategy 2, for the possibility of Emma winning as we can for the possibility of Alice winning.”

The possibility of being able to follow the strategy doesn’t follow from the existence of the strategy (and by strategy I mean an abstract function.) For example, the Yablo strategy in puzzle 2 exists, although it’s not possible to follow it. I don’t yet see a reason to doubt the existence of that strategy. And you haven’t yet addressed the proof I gave of it’s existence. (It would be surprising if it didn’t exist as it’s the basis for some well known counterexamples in topology!)

15. First, nice paper Andrew and congratulations on getting it published! I remember when we were having the discussion on our blogs.

Anyway, I’m worried about one thing. At two points you use some linguistic expression that seems to suggest definiteness about Alice’s strategy:

“It is thus impossible for both players to follow their strategy, meaning that there must have been a point at which at least one player was able to play according to his strategy, should have done, but didn’t.

“at each point in the game rationality requires Alice to follow her strategy, yet to require that Alice follow it at every point would be to require the impossible.”

However, an important thing about Alice’s strategy is that it’s not unique. For any strategy of the sort you describe, and any particular round, replacing the recommended move on that round with its opposite also results in a strategy of the sort you describe. So in some sense, no matter what she does, on every round Alice is always making a play that is compatible with following a winning strategy.

Of course, this doesn’t apply to the second situation. In some ways that second example seems to focus my thought more on the real problem – at any point in time she might always do the right thing from then on, but she has always already done the wrong thing at some earlier stage.

16. Hi Kenny,

Yes, actually, that discussion was very helpful for me!

You’re absolutely right of course. I was assuming that a particular strategy had been chosen, and more importantly, that Alice wanted to follow it at every round. (I mentioned the second assumption at the end of the first section but didn’t make it explicit in the same way I did in the second puzzle where Alice received a spendable reward at each round if she followed the strategy.)

With that modification, though, I think it begins to look like the second puzzle in the respect you mentioned. Does that sound right?

17. Actually, here’s what I said about that in the paper anyway (section 3): “To avoid irrelevant complications we may go one further and stipulate that both players have a stake in playing a given winning strategy at each round. […] So as not to get mixed up with accumulating infinite utilities, we may think of each round as a single decision problem, having a reward which gets spent before the next round.”

18. Yarrow says:

Let me see if I’ve grasped the outlines of the argument (for your second puzzle):

Given a sequence of past moves M, and some kind of reward function, a rational agent is described by a next-move function nm such that nm(M) gives a move that maximizes the reward function in some way.

For the finite situation, it’s unproblematic to iterate this: we start with the empty sequence [], the agent’s first move gives us M1 = [nm([])], the second move gives us M2 = M1.[nm(M1)] (using the . for concatenation), the third move gives M3 = M2.[nm(M2)], and so forth.

The Elga, Arntzenious and Hawthorne paper wrestles with the fact that extending this to infinite sequences can yield unintuitive results (the reward for each finite prefix of an infinite sequence is positive, but the reward for the infinite sequence itself is negative, etc.) But at least all these sequences can be constructed (by induction), because the is-a-prefix-of relation on sequences is a well-founded partial order.

Your paper reverses the direction in which the sequences are infinite — rather than starting at the present and contemplating an infinite sequence of future choices, the agent finds herself at or near the end of an infinite series of (mostly) past choices, with no beginning point. Because is-a-suffix-of is not well-founded for infinite sequences, there exists (and you exhibit) a function nm for which (1) nm(M) is the rational choice for every M, but (2) there is no infinite M for which

M_n = nm(M after n) for every n

(where M_n is the nth element of the sequence M, and M after n is M with the nth element and all previous elements removed).

The conclusion is “It is impossible for an ideally rational agent to find themselves in a situation where they will undergo such a procedure.”

I think I agree — or at least with the slightly more modest “it is impossible for an ideally rational agent to find themselves in a situation where they have already undergone all but a finite part of such a procedure.” Does this say something about rationality, or about induction (or the lack thereof), or about the relationship between rationality and induction?

19. This is a response to comments 13 and 14, though I guess the thread has moved on a bit.

I don’t really think there’s as much of a difference between the Alice case and the Emma case as this suggests. Emma can’t play strategy 2. Alice can’t play strategy 2 if Bob plays rationally. And Alice knows that Bob will play rationally – at least we should assume that if this is a standard game. So in effect we should say that Alice can’t play strategy 2.

It’s true I guess that if Alice plays strategy 2 she’ll win. But it’s also true that if Emma plays strategy 2, she’ll win. It doesn’t follow that Alice should play strategy 2, any more than that Emma should.

Most importantly, it doesn’t follow that there’s anything particularly rational about following strategy 2 for Alice. There isn’t anything special about following strategy 2 for Emma, and I don’t think there’s much special about following it for Alice. She’ll lose whether she tries to follow it or not.

20. Yarrow: yes, that seems to capture how I’m thinking of it. You mentioned my conclusion was too strong: you suggest you could have a rational agent who’ll undergo one of these puzzles in the future. My argument was that no rational agent who will undergo one of these supertask procedures could have the right dispositional properties to be rational, and that argument only relied on modus ponens for the counterfactual conditional. Suppose Alice is rational, so, I claim, the following schema for M holds: “if Alice learnt that sequence M had been played, she’d play nm(M) on her next move”. But given that Alice will learn each initial sequence of some complete sequence M+ occurs, we could deduce she plays nm(M) at every round, by MP, which is impossible. So she doesn’t have the disposition to maximise utility (i.e. one of those counterfactuals fails.) Or have I missed something?

Brian: “I don’t really think there’s as much of a difference between the Alice case and the Emma case as this suggests. Emma can’t play strategy 2. Alice can’t play strategy 2 if Bob plays rationally. And Alice knows that Bob will play rationally – at least we should assume that if this is a standard game. So in effect we should say that Alice can’t play strategy 2.”

Right, but by symmetrical reasoning, Bob “can’t” play strategy 2 relative to the knowledge that Alice plays rationally (i.e. maximises utility at every round.) And worse, if we assume common knowledge of rationality nothing is possible. (Or do you still think there’s something peculiar about Alice’s strategy that doesn’t apply to Bob’s? Maybe this symmetrical puzzle is better to think about: Alice get’s rewarded at a round if she says 1 if Bob has said 0 at every round up until now and says 0 otherwise. Bob get’s rewarded if he says 1 when Alice has said 0 up until now and 0 otherwise. An version of Yablo reasoning show’s the set of worlds compatible with mutual knowledge of rationality – i.e. maximising utility at each round – is empty. It would be helpful to me to see how you find the two puzzles disanalogous, if you do.)

“Most importantly, it doesn’t follow that there’s anything particularly rational about following strategy 2 for Alice. There isn’t anything special about following strategy 2 for Emma, and I don’t think there’s much special about following it for Alice. She’ll lose whether she tries to follow it or not.”

Given that she is rewarded for choosing according to an (already fixed) representative strategy at each round, so that at each round following strategies recommendation at that round maximises utility, I don’t see why we shouldn’t say it’s rational. (As I’ve already stressed, it’s not like Alice chooses a strategy that covers all eventualities in advance and watches it play out, she’s free to do what she deems best at each round.)

21. Incidentally the “Emmafication” of the modified symmetrical puzzle, where Alice=Bob, is just puzzle 2. And like your Emma example, there is no world where the strategy is followed.

This doesn’t show, however, that neither Alice nor Bob in the two player game can play their strategy. If Alice says 0 and Bob say 1 at every round Bob’s followed his strategy, and symmetrically if Alice says 1 and Bob says 0 at every round Alice has followed her strategy at every round.

With that in mind, I don’t really see how your one player Emma game relates to the two player game in puzzle 1.

22. The worry was that there’s nothing in puzzle 1 that gives us a reason to think that playing rationally = playing strategy 2. Indeed, I don’t think that’s true. As stated, any strategy is as rational for her as any other, since whatever she plays, she’ll lose if Bob is rational.

If you add to puzzle 1 that Alice is rewarded for playing the representative strategy, then things might be different. Though I think it would be good to see the precise version where each of the infinitely many rewards are valuable for Alice without the game turning into a puzzle about infinite utility, or about the difficulty of choosing rationally in the game “Pick as big an integer as you can”.

23. “If you add to puzzle 1 that Alice is rewarded for playing the representative strategy, then things might be different. Though I think it would be good to see the precise version where each of the infinitely many rewards are valuable for Alice without the game turning into a puzzle about infinite utility, or about the difficulty of choosing rationally in the game “Pick as big an integer as you can”.”

I addressed the objection concerning infinite utility in the paper (see quote from post 17, and footnote 2 in the paper too) and in my reply to Yarrow above.

It also supposed to be part of the set up in puzzle 1 that Alice is rewarded at each turn for following a fixed strategy (see post 17), although I guess I didn’t stress that enough in the paper.

24. Yarrow says:

Andrew: After puzzling through this I now think that your conclusion, far from being too strong, was too weak! The paradox seems to involve something more basic than rationality.

Suppose that from 11:59 am forward Alice has the disposition, rational or otherwise, to follow the rule “choose 1, if you have chosen 0 at every previous round, and chose 0 otherwise”. Then the game can’t be played: it cannot be the case that for each n in \omega, at 1/n hours past 12 pm, Alice will be asked to choose either 1 or 0 (and be free to answer according to her disposition).

The three interpretations of this that I can see are (1) that Alice’s disposition makes playing the game impossible — had she some other disposition (e.g., to follow the strategy “choose 1 at each round”) then the game could be played; (2) that the possible future occurrence of the game makes it impossible for Alice to have the given disposition, or (3) that the situation is impossible tout court — no matter what disposition Alice has, she cannot be asked to act on it at 1/n hours past 12 pm for each n in \omega.

Interpretations (1) and (2) require accounts of causality that hard to wrap my head around. I think (3) can be whittled down to a statement that it’s inappropriate to consider a state machine to have generated an infinite sequence of values this way (by considering at each step an infinite past to generate the next piece of a finite future), because there is no first instant at which the machine can be said to have started processing the sequence. (At 12 pm, Alice has made no move in the game. At every instant after 12 pm, she has made all but finitely many of an infinite number of moves.)

In other words, we need mathematical induction to start talking about the result of applying state machines to sequences, let alone whether the results of such applications are rational.