15 August, 2011

Wolverine!

Starting January 1, Ishani Maitra and I will be starting new positions in the University of Michigan philosophy department. I’m going to be the inaugural Marshall M. Weinberg Professor of Philosophy, which is an incredible honour.

I’m really looking forward to being part of (another) great philosophy program. I’ve been incredibly impressed with the way Michigan has gone about its hiring in recent years, and you don’t need me to tell you how amazing the longer serving faculty there are. Indeed, both the newer and the older faculty there are so good that I’ve repeatedly tried to hire several of them away at my previous jobs!

Living in Ann Arbor will be great for the three of us. I’m looking forward to being able to walk to work, and to the markets, and to great public schools. And I’m really looking forward to having these folks as colleagues and neighbours. I’d make a list of which things I’m most looking forward to professionally, but it would be too long, and I’m sure to inadvertently leave something or someone out. And in any case, I suspect that most readers of this blog don’t need to be told how fantastic the philosophers at Michigan are, to put it mildly.

While I will miss many things about the Rutgers philosophy department (and linguistics and cog sci departments), one thing I won’t miss is having to worry about what might happen to my job thanks to changes in state government policy. Twice, proposals that would have made it impossible to work at Rutgers and live in New York passed a house of the state legislature. In both cases the bills that resulted didn’t directly affect academics, but that this kind of thing would even be proposed was worrying. What passed was a huge rise in health premiums (effectively a 5% pay cut for most faculty), and an unknown extra rise in health care if you have any interest in seeing a doctor outside New Jersey (UPDATE: I made a couple of mistakes in making this calculation, see below for laborious details). The budget cuts from Trenton have also seen cuts in carried forward research accounts, and non-payment of contractually agreed pay raises. And who knows what they’ll think up next? All this made the choice much easier.

That said, there has been a lot I’ve loved about being part of the Rutgers philosophy department. I’m particularly fond of the current crop of grad students we have, who are truly great philosophers, and great people. I bet in a couple of decades time, people will look back and say, “There was a seminar that had her and him and her and … in it? That’s like having a philosophy all-stars conference every week.” And I’ll be like, “Yep, and I was, at least nominally, teaching them.” Those of you who are on search committees over the next few years will hear much much more from me about these students, so I won’t go on too much more here. My colleagues-to-be in particular will be hearing about them a lot. (I suspect at some point I’ll be replaced on a search committee by a talking dummy saying, “I think we should hire the Rutgers student”, and no one will spot the difference.)

Building a grad student body this good takes a lot of work, but I think Jeff King (as DGS) and Ruth Chang and Jason Stanley (as admissions directors in recent years) deserve a lot of credit for it. Impressively, the students aren’t just thriving philosophically, but seem to be happier than one could reasonably expect graduate students to be. I think that wouldn’t have happened without the hard work several faculty members have put in to making the grad program work so well.

I don’t know the current Michigan students nearly as well, so it’s impossible to make any comparisons. I have been impressed by the people (and work) I have seen, so I have high hopes for what things will be like. I’m looking forward to more seminars with a different batch of philosophy all-stars to be!

UPDATE: I oversimplified and overstated the recent changes in NJ health care law, so I should correct this. What happened is that the costs of being part of the NJ health plan went from a fixed percentage of salary (roughly 1.5%), to a sliding percentage of premium costs, dependent on one’s income. The effect of this will be complicated, because it is being phased in over time, but I think the following is all true. (Some of this is taken from this calculator, which models what will happen if there’s no change in wages or health insurance premiums over the course of implementing the plan.)

I still think the changes were a very bad idea, and I’m glad to not have to worry about them more. And I’m especially glad I don’t have to worry about what effect splitting the insured pool like this will have on premiums for those in the smaller group, which I think is very hard to model. (One data point for the model: at University of Michigan, if we moved out of state, our health care contributions would more than double.) I have an aversion to this kind of uncertainty, so this bothered me more than it might bother other people; we’ll see in ten years’ time how worried I should have been.

But the changes weren’t as bad, or as simple, as I suggested in the post, hence this correction. And I apologise for the errors.

Posted by Brian Weatherson at 12:42 am


15 July, 2011

Lecture Notes on Game Theory

Over the summer I did a short seminar series on game theory at Arché. I thought it would be worthwhile to post my notes from that seminar. The notes need some revising, particularly to take account of the points made in the previous post, but I hope that even in this form they’ll be of some interest.

Posted by Brian Weatherson at 4:46 pm


Game Theory as Epistemology

I taught a series of classes on game theory over the last few weeks at Arché. And one of the things that has always puzzled me about game theory is that it seems so hard to reduce orthodox views in game theory to orthodox views in decision theory. The puzzle is easy enough to state. A fairly standard game theoretic treatment of Matching Pennies and Prisoners’ Dilemma involves the following two claims.

(1) In Matching Pennies, the uniquely rational solution involves each player playing a mixed strategy.
(2) In Prisoners’ Dilemma, the uniquely rational solution is for each player to defect.

Causal decision theory denies that mixed strategies can ever be better than all of the pure strategies of which they are mixtures, at least for strategies that are mixtures of finitely many pure strategies. So a causal decision theorist wouldn’t accept (1). And evidential decision theory says that sometimes, for example when one is playing with someone who is likely to do what you do, it is rational to cooperate in Prisoners’ Dilemma. So it seems that orthodox game theorists are neither causal decision theorists nor evidential decision theorists.
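The point about mixtures can be checked numerically. What follows is a minimal sketch, assuming a standard Matching Pennies payoff table for one player (win 1 on a match, lose 1 otherwise) and a fixed credence about the opponent's move; the names `PAYOFF`, `pure_eu` and `mixed_eu` are illustrative, not taken from any source. Because the expected utility of a mixture is a weighted average of the expected utilities of the pure strategies, it can never exceed the better of them.

```python
# Hypothetical payoff table for the "matching" player in Matching Pennies:
# +1 if the two coins match, -1 if they differ.
PAYOFF = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}


def pure_eu(move, q_heads):
    """Expected utility of a pure move, given credence q_heads that the
    opponent plays Heads."""
    return q_heads * PAYOFF[(move, "H")] + (1 - q_heads) * PAYOFF[(move, "T")]


def mixed_eu(p_heads, q_heads):
    """Expected utility of playing Heads with probability p_heads.
    This is just a weighted average of the two pure expected utilities."""
    return p_heads * pure_eu("H", q_heads) + (1 - p_heads) * pure_eu("T", q_heads)


def mixture_never_beats_best_pure(steps=10):
    """Check on a grid that no mixture does strictly better than the best
    pure strategy, whatever the credence about the opponent."""
    grid = [i / steps for i in range(steps + 1)]
    return all(
        mixed_eu(p, q) <= max(pure_eu("H", q), pure_eu("T", q)) + 1e-12
        for p in grid
        for q in grid
    )
```

On this toy model a mixture ties with the pure strategies only when the credence about the opponent is exactly 0.5, which is where the orthodox mixed-strategy solution sits.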

So what are they then? For a while, I thought they were essentially ratificationists. And all the worse for them, I thought, since I think ratificationism is a bad idea. But now I think I was asking the wrong question. Or, more precisely, I was thinking of game theoretic views as being answers to the wrong question.

The first thing to note is that problems in decision theory have a very different structure to problems in game theory. In decision theory, we state what options are available to the agent, what states are epistemically possible and, and this is crucial, what the probabilities are of those states. Standard approaches to decision theory don’t get off the ground until we have the last of those in place.

In game theory, we typically state things differently. Unless nature is to make a move, we simply state what options are available to the players, and what plays are available to each of the actors, and of course what will happen given each combination of moves. We are told that the players are rational, and that this is common knowledge, but we aren’t given the probabilities of each move. Now it is true that you could regard each of the moves available to the other players as a possible state of the world. Indeed, I think it should be at least consistent to do that. But in general if you do that, you won’t be left with a solvable decision puzzle, since you need to say something about the probabilities of those states/decisions.

So what game theory really offers is a model for simultaneously solving for the probability of different choices being made, and for the rational action given those choices. Indeed, given a game between two players, A and B, we typically have to solve for six distinct ‘variables’.

  1. A’s probability that A will make various different choices.
  2. A’s probability that B will make various different choices.
  3. A’s choice.
  4. B’s probability that A will make various different choices.
  5. B’s probability that B will make various different choices.
  6. B’s choice.

The game theorist’s method for solving for these six variables is typically some form of reflective equilibrium. A solution is acceptable iff it meets a number of equilibrium constraints. We could ask about whether there should be quite so much focus on equilibrium analysis as we actually find in game theory textbooks (and journal articles), but it is clear that solving a complicated puzzle like this using reflective equilibrium analysis is hardly outside the realm of familiar philosophical approaches.

Looked at this way, it seems that we should think of game theory really not as part of decision theory, but as much a part of epistemology. After all, what we’re trying to do here is solve for what rationality requires the players’ credences to be, given some relatively weak looking constraints. We also try to solve for their decisions given these credences, but it turns out that is an easy part of the analysis; all the work is in the epistemology. So it isn’t wrong to call this part of game theory ‘interactive epistemology’, as is often done.

What are the constraints on an equilibrium solution to a game? At least the following constraints seem plausible. All but the first are really equilibrium constraints; the first is somewhat of a foundational constraint. (Though note that since ‘rational’ here is analysed in terms of equilibria, even that constraint is something of an equilibrium constraint.)

That much seems relatively uncontroversial, assuming that we want to go along with the project of finding equilibria of the game. But those criteria alone are much too weak to get us near to game theoretic orthodoxy. After all, in Matching Pennies they are consistent with the following solution of the game.

Every player maximises expected utility given the other player’s expected move. Each player is correct about their own move. And each player treats the other player as being rational. So we have many aspects of an equilibrium solution. Yet we are a long way short of a Nash equilibrium of the game, since the outcome is one where one player deeply regrets their play. What could we do to strengthen the equilibrium conditions? Here are four proposals.

First, we could add a truth rule.

This is a worthwhile enough constraint, albeit one considerably more externalist friendly than the constraints we usually use in decision theory. But it doesn’t rule out the ‘solution’ I described here, since everything the players believe is true.

Second, we could add a converse truth rule.

This would rule out our ‘solution’. After all, neither player believes the other player will play Heads, but both players will in fact play Heads. But in a slightly different case, the converse truth rule won’t help.

Now nothing is guaranteed by the players’ beliefs about their own play. But we still don’t have a Nash equilibrium. We might wonder if this is really consistent with converse truth. I think this depends on how we interpret the first clause. If we think that the first clause must mean that each player will use a randomising device to make their choice, one that has a 0.9 chance of coming up heads, then converse truth would say that each player should believe that they will use such a device. And then the Principal Principle would say that each player should have credence 0.9 that the other player will play Heads, so this isn’t an equilibrium. But I think this is an overly metaphysical interpretation of the first clause. The players might just be uncertain about what they will play, not certain that they will use some particular chance device. So we need a stronger constraint.

Third, then, we could try a symmetry rule.

This will get us to Nash equilibrium. That is, the only solutions that are consistent with the above constraints, plus symmetry, are Nash equilibria of the original game. But what could possibly justify symmetry? Consider the following simple cooperative game.

Each player must pick either Heads or Tails. Each player gets a payoff of 1 if the picks are the same, and 0 if the picks are different.

What could justify the claim that each player should have the same credence that A will pick Heads? Surely A could have better insight into this! So symmetry seems like too strong a constraint, but without symmetry, I don’t see how solving for our six ‘variables’ will inevitably point to a Nash equilibrium of the original game.

Perhaps we could motivate symmetry by deriving it from something even stronger. This is our fourth and final constraint, called uniqueness.

Assume also that players aren’t allowed, for whatever reason, to use knowledge not written in the game table. Assume further that there is common knowledge of rationality, as we usually assume. Now uniqueness will entail symmetry. And uniqueness, while controversial, is a well known philosophical theory. Moreover, symmetry plus the idea that we are simultaneously solving for the players’ beliefs and actions gets us the result that players always believe that a Nash equilibrium is being played. And the correctness condition on player beliefs means that rational players will always play Nash equilibria.

So we sort of have it, an argument from well-known (if not that widely endorsed) philosophical premises to the conclusion that when there is common knowledge of rationality, any game ends up in a Nash equilibrium.

Of course, we’ve used a premise that entails something way stronger. Uniqueness entails that any game has a unique rational equilibrium. That’s not, to put it mildly, something game theorists usually accept. The little coordination game I presented a few paragraphs back is a game that sure looks like it has multiple equilibria! So I haven’t succeeded in deriving orthodox game theoretic conclusions from orthodox philosophical premises. But I think this epistemological tack is a better way to make game theory and philosophy look a little closer than they look if one starts thinking of game theorists as working on special (and specially important) cases of decision problems.
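The claim that the coordination game has multiple equilibria can be checked by brute force. This is a minimal sketch, assuming the payoffs stated above (1 to each player if the picks match, 0 otherwise) and looking only at pure strategies; the function names are hypothetical.

```python
from itertools import product

MOVES = ("Heads", "Tails")


def coordination_payoffs(a_move, b_move):
    """Payoffs (to A, to B) in the simple cooperative game: 1 each if the
    picks are the same, 0 each if they differ."""
    same = 1 if a_move == b_move else 0
    return (same, same)


def pure_nash_equilibria(payoffs, moves=MOVES):
    """Return every pure-strategy profile from which neither player can
    gain by unilaterally switching moves."""
    equilibria = []
    for a, b in product(moves, repeat=2):
        u_a, u_b = payoffs(a, b)
        a_cannot_improve = all(payoffs(alt, b)[0] <= u_a for alt in moves)
        b_cannot_improve = all(payoffs(a, alt)[1] <= u_b for alt in moves)
        if a_cannot_improve and b_cannot_improve:
            equilibria.append((a, b))
    return equilibria
```

Calling `pure_nash_equilibria(coordination_payoffs)` finds both (Heads, Heads) and (Tails, Tails). The game also has a mixed equilibrium in which each player picks Heads with probability 0.5, which this pure-strategy search does not look for.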

Posted by Brian Weatherson at 4:12 pm


11 July, 2011

Knowing How, Regresses and Frames

I’m just back from my annual trip to St Andrews to work at Arché. It was lots of fun, as always. The highlight of the trip was taking the baby overseas for the first time, and letting her meet so many great people, especially the other babies. And there was lots of other fun besides. I taught a 9-seminar class on game theory. I have to revise my notes a bit to correct some of the mistakes that became clear in discussion there, but hopefully soon I’ll post them.

Over the last two weekends I was there, there were two very interesting conferences. The first was on the interface between the study of language and the study of philosophy. The second was on knowing how. I didn’t get to attend all of it, so it’s possible that the things I’ll be saying here were addressed in talks I couldn’t make. And this isn’t really my field, so I suspect much of what I’m saying here will be old news to cognoscenti. But I thought that at times some of the anti-Ryleans understated, or at least misstated, the force of Ryle’s arguments.

Regress Arguments

Jason Stanley briefly touched on the regress argument Ryle gives in favour of a distinction between knowing how and knowing that. Or, at least, he briefly touched on a regress argument that Ryle gives, though I think this isn’t Ryle’s only regress argument. Here’s a rough version of the argument Jason attributes to Ryle.

This is a pretty terrible argument I think, and Jason did a fine job demolishing it. For one thing, whatever it means to say that knowing that is static, knowing how might be just as static. And given a functionalist/dispositionalist account of content, it just won’t be true that knowing that is static in the relevant sense. If an agent never has the disposition to go to the fridge even though they have a strong desire for beer, and no conflicting dispositions/impediments, then they don’t really believe there is beer in the fridge, so don’t know that there is beer in the fridge.

This way of presenting Ryle makes it sound like knowing how is some kind of ‘vital force’, and Ryle himself is a vitalist, looking for the magical force that is behind self-locomotion. I don’t think that’s a particularly fair way, though, of looking at Ryle. A better approach, I think, starts with consideration of the following kind of creature.

The creature is very good, indeed effectively perfect, at drawing conclusions of the form I should φ. But they do not always follow this up by doing φ. If you think it is possible to form beliefs of the form I should φ without ever going on to φ, or even forming a disposition to φ, imagine the creature is like that. If you think that’s impossible, perhaps on functionalist grounds, imagine that the creature moves from knowledge she expresses with I should φ to actually doing φ as rarely as is conceptually possible. (I set aside as absurd the idea that the functionalist characterisation of mental content rules out there being large differences in how frequently creatures move from I should φ to actually doing φ.)

I think such a creature is missing something. If they frequently don’t do φ in cases where it would be particularly hard, what they might be missing is willpower. But let’s not assume that. They frequently just don’t do what they think they should do, given their interests, and often instead do something harder, or less enjoyable. But what they are missing doesn’t seem to be propositional knowledge, since by hypothesis they are very good at figuring out what they should do, and if they were missing propositional knowledge, that’s what they would be missing.

What they might be missing is a skill, such as the skill of acting on one’s normative judgments. But I think Ryle has a useful objection to that characterisation. It is natural to characterise the person’s actions as stupid, or more generally unintelligent, when they don’t do what they can quite plainly see they should do. A person who lacks a skill at digesting hot dogs quickly, or playing the saxophone, or sleeping on an airplane, isn’t thereby stupid or even unintelligent. (Though they might be stupid if they know they lack these skills and nevertheless try to do things that call for such a skill.) Indeed, we typically criticise cognitive failings as being unintelligent. So our imagined creature must have a cognitive failing. And that failing must not be an absence of knowledge that, since by hypothesis that isn’t lacking. So we call what is lacking knowledge how.

Note that I really haven’t given an argument that this is the kind of thing that natural language calls knowing how. It’s consistent with this argument that everything that is described as knowing how in English is in fact a kind of knowing that. But it is an argument that there is some cognitive skill that plays one of the key roles in regress-stopping that Ryle attributed to knowing how.

Ryle on the Frame Problem

There’s another problem for a traditional theory that identifies knowledge with knowing that, and it is the frame problem. Make the following assumptions about a creature.

It seems to me that such a creature has to work out a way to use its knowledge that of propositions like ‘That q is true is irrelevant to my decision about whether to do ψ’ in making practical deliberations without actually drawing on that knowledge. If it does draw on it, the computational costs will go to infinity, and nothing will get done. In short, it has to be able to ignore propositions like q, and it has to ignore them without thinking about whether to ignore them.

It seems that a skill like this is not something one gets by simply having a lot of knowledge. One can know all you like about how propositions like q should be ignored in practical deliberation. But it won’t help a bit if you have to go through the propositions one by one and conclude that they should be ignored, even if you can do all this subconsciously.

Moreover, it is a sign of intelligence to have such a skill. Someone whose mind drifts onto thoughts about the finer details of French medieval history when trying to decide whether to catch this local train or wait for the express is displaying a kind of unintelligence. As above, Ryle concludes from that that the skill is a distinctively cognitive skill, and worthy of being called a kind of knowledge. Since it isn’t knowledge that – our creature has all the salient knowledge that – it is a kind of knowing how.

Now I assume that the five assumptions I made above are actually true of creatures like us. Perhaps they are not; perhaps we have a way of drawing on knowledge that which doesn’t involve any computational costs. But I rather doubt that’s true. I think that we draw on knowledge by using it computationally, and computational usage is by definition costly. Not nearly as costly as conscious thought, but costly. Many of us are sensitive to our knowledge of unimportance without drawing on it; we make decisions about whether to catch the local or the express without first considering whether French medieval history is relevant, and deciding that it isn’t. But this is because we know how to ignore irrelevant information, not merely that we know that the irrelevant information is irrelevant. Knowing that it is irrelevant is no use if you don’t know how to adjust your decision making process in light of that knowledge.

Posted by Brian Weatherson at 3:13 pm


24 June, 2011

Equal Weight and Asymmetric Uncertainty

Not for the first time, I’m unsure what the Equal Weight View says about a case. Here it is.

Jack and Jill and Microsoft Excel

Jack and Jill both have the same evidence; they know that a particular table has 83 rows and 97 columns. They both try to figure out how many cells are in the table. They know that the number of cells equals the number of rows times the number of columns. Jill concludes that it has 8051 cells. Jack can’t conclude anything about the number of cells. He thinks it might be 8051, but it might be something else. When they are each told about the other’s conclusions (or lack thereof), what should their final conclusions be?

A position half-way between Jack’s view and Jill’s view is presumably something like a view that the table probably has 8051 cells, but that it is a serious possibility that the table has a different number of cells.

But surely Jill shouldn’t move her views in that way, should she? If she should hold firm here, is this just a counterexample to (some versions of) the Equal Weight View?

Posted by Brian Weatherson at 9:00 am


30 May, 2011

Cian Dorr on Imprecise Credences

In the latest Philosophical Perspectives, Cian Dorr has a very interesting paper about a puzzle about what he calls the Eternal Coin. I hope to write more about the particular puzzle in future posts, but I wanted to mention one thing that comes up in passing about imprecise probabilities. In the course of rejecting a solution to one puzzle in terms of imprecise probabilities, he says

My main worries about this response are worries about the unsharp credence framework itself. In my view, there is no adequate account of the way unsharp credences should be manifested in decision-making. As Adam Elga has recently compellingly argued, the only viable strategies which would allow for someone with an unsharp credential state to maintain a reasonable pattern of behavioural dispositions over time involve, in effect, choosing a particular member of the representor as the one that will guide their actions. (The choice might be made at the outset, or might be made by means of a gradual process of narrowing down over time; the upshot is much the same.) And even though crude behaviourism must be rejected, I think that if this is all we have to say about the decision theory, we lack an acceptable account of what it is to be in a given unsharp credential state—we cannot explain what would constitute the difference between someone in a sharp credential state given by a certain conditional probability function, and someone in an unsharp credential state containing that probability function, who had chosen it as the guide to their actions. Unsharp credential states seem to have simply been postulated as states that get us out of tricky epistemological dilemmas, without an adequate theory of their underlying nature. It is rather as if some ethicist were to respond to some tricky ethical dilemma—say, whether you should join the Resistance or take care of your ailing mother—by simply postulating a new kind of action that is stipulated to be a special new kind of combination of joining the Resistance and taking care of your mother which lacks the objectionable features of obvious compromises (like doing both on a part-time basis or letting the outcome be determined by the roll of a dice).
It would be epistemologically very convenient if there was a psychological state we could rationally be in in which we neither regarded P as less likely than HF, regarded HF as less likely than P, nor regarded them as equally likely. But we should be wary of positing psychological states for the sake of epistemological convenience.

I actually don’t think that imprecise (or unsharp) credences are the solution to the particular problem Cian is interested in here; I think the solution is to say the relevant credences are undefined, not imprecise. But I don’t think this is a compelling objection to imprecise credences either.

It is, I think, pretty easy to say what the behavioural difference is between imprecise credences and sharp credences, even if we accept (as I do!) what Adam and Cian have to say about decision making with imprecise credences. The difference comes up in the context of giving advice and evaluating others’ actions. Let’s say that my credence in p is imprecise over a range of about 0.4 to 0.9, and that I make decisions as if my credence is 0.7. Assume also that I have to make a choice between two options, X and Y, where X has a higher expected return iff p is more likely than not. So I choose X. And assume that you have the same evidence as me, and face the same choice.

On the sharp credences framework, I should advise you to do X, and should be critical of you if you don’t do X. On the imprecise credences framework, I should say that you could rationally make either choice (depending on what other choices you had previously made), and shouldn’t criticise you for making either choice (unless it was inconsistent with other choices you’d previously made).
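The contrast in advice can be made concrete with a toy model. This is only a sketch under stated assumptions: the payoffs are hypothetical (X pays 1 if p and 0 otherwise, Y the reverse, so X has the higher expected return iff p is more likely than not), the representor is approximated by credences from 0.4 to 0.9 in steps of 0.1, and an option counts as permissible if some member of the representor recommends it. That last rule is one liberal choice among several, not something the argument above commits to.

```python
from fractions import Fraction


def best_options(credence_in_p):
    """Options that maximise expected return for a single sharp credence,
    with hypothetical payoffs: X pays 1 if p, Y pays 1 if not-p."""
    eu_x = credence_in_p
    eu_y = 1 - credence_in_p
    if eu_x > eu_y:
        return {"X"}
    if eu_y > eu_x:
        return {"Y"}
    return {"X", "Y"}


def permissible_options(representor):
    """Assumed rule: an option is permissible if at least one member of
    the representor recommends it."""
    options = set()
    for credence in representor:
        options |= best_options(credence)
    return options


# Imprecise state: credence in p spread over roughly 0.4 to 0.9.
REPRESENTOR = [Fraction(k, 10) for k in range(4, 10)]
# Sharp state: the single credence actually used to guide action.
SHARP = [Fraction(7, 10)]
```

With these assumptions `permissible_options(SHARP)` contains only X, while `permissible_options(REPRESENTOR)` contains both X and Y. So the two states recommend the same choice to the agent but license different advice to someone else with the same evidence.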

I don’t want to argue here that it makes sense to separate out the role of credences in decision making from the role they play in advice and evaluation. All I do want to argue here is that once we move beyond decision making, and think about advice and evaluation as well, there is a functional difference between sharp and unsharp credences. So the functionalist argument that there is no new state here collapses.

One other note about this argument. I don’t think of sharp and unsharp credences as different kinds of states, or as states that need to be separately postulated and justified. What I think are the fundamental states are comparative credences. The claim that all credences are sharp then becomes the (wildly implausible) claim that all comparative credences satisfy certain structural properties that allow for a sharp representation. The claim that all credences should be sharp becomes the (still implausible, but not crazy) claim that all comparative credences should satisfy those structural properties. Either way, there’s nothing new about unsharp credences that needs to be justified. What needs to be justified is the ruling out of some structural possibilities that look prima facie attractive.

Posted by Brian Weatherson at 5:14 pm


16 May, 2011

What is the Equal Weight View of Disagreement?

I often find it hard to apply the Equal Weight View (EWV) in practice. This makes my task of generating counterexamples a little harder than I feel it should be. I can come up with all sorts of cases where I think EWV gets the wrong result, but then I get worried that EWV doesn’t actually say what I think it says about that case. Here’s one example I was working with.

A and B are peers in the salient sense. They have a long track record of checking each other’s work, and they both get things right a high and equal proportion of the time. There is no external evidence that B is in any way epistemically compromised right now. They both try to work out 14 times 27, and A gets 378, while B gets 368. What should A’s credence be that the right answer is 368?

I think the EWV is committed to the answer being 0.5 or thereabouts. After all, A and B are peers, they are just as likely to get the answer right, and probably one of them did get the answer right. So the EWV-endorsed probability distribution, I would think, is that the answers 378 and 368 both get probability nearly 0.5, and the remainder goes to the possibility that they were both wrong.
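One way to see where numbers like "nearly 0.5" could come from is a toy Bayesian model. This is a sketch, not a statement of what EWV itself says: it assumes each peer is independently right with the same probability `r`, that two wrong answers essentially never coincide, and that A conditions only on the fact of disagreement rather than on the particular answers given. The function name and the value of `r` are illustrative.

```python
from fractions import Fraction


def credence_a_right_given_disagreement(r):
    """Posterior that A's answer is correct, given only that two equally
    reliable, independent peers disagree (toy model, see assumptions)."""
    a_right_b_wrong = r * (1 - r)
    b_right_a_wrong = (1 - r) * r
    both_wrong = (1 - r) ** 2  # assumed: wrong answers always differ
    total = a_right_b_wrong + b_right_a_wrong + both_wrong
    return a_right_b_wrong / total
```

For a hypothetical reliability of `Fraction(19, 20)` this gives 19/39, a little under 0.49, and by symmetry the same for B. The small remainder goes to the possibility that both were wrong.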

This strikes me as implausible, since it is easy for A to see that 368 is the wrong answer by using the rule I’ll call D9.

D9. A number is a multiple of 9 iff the sum of the digits of its base-10 representation is a multiple of 9.
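The check that D9 makes available to A is short enough to spell out. A minimal sketch, with illustrative function names: since 27 is a multiple of 9, any multiple of 27 must pass the digit-sum test, so a candidate answer that fails it cannot be 14 times 27.

```python
def digit_sum(n):
    """Sum of the digits of n's base-10 representation."""
    return sum(int(digit) for digit in str(abs(n)))


def passes_d9(n):
    """D9: n is a multiple of 9 iff its digit sum is a multiple of 9."""
    return digit_sum(n) % 9 == 0


def consistent_with_multiple_of_27(candidate):
    """Necessary (not sufficient) condition: every multiple of 27 is a
    multiple of 9, so it must pass D9."""
    return passes_d9(candidate)
```

Here `digit_sum(378)` is 18 and `digit_sum(368)` is 17, so only A's answer survives the test, and 14 * 27 is indeed 378.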

So I think this is a case where EWV is wrong: A shouldn’t assign equal weight to 378 and 368 being the correct answer. I can imagine some people denying this, and saying that 378 and 368 should be given equal weight. But I can also imagine some people denying that EWV really has that consequence.

If you’re an EWV-theorist, do you think EWV entails in this case that A should give equal credence to 378 and 368 being the correct answer? If the case is too vaguely described to answer that, consider some of the following variations.

Variation 1. A doesn’t commit to an answer before checking that it is consistent with D9. So that the answer 378 is consistent with D9 is part of her reason for believing the answer is 378. That means, I think, that Christensen’s Independence principle would rule out her going on to use D9 to conclude that B must have made a mistake.

Variation 2. B has never heard of D9. Perhaps this means A and B aren’t peers, because D9 is some evidence that A has and B lacks.

Variation 3. B doesn’t believe D9. Perhaps that’s because he thinks A is misremembering the rule (It’s really a rule for multiples of 11, not multiples of 9), or perhaps because he thinks there are restrictions on the rule (e.g., it is only guaranteed to work for numbers with an even number of digits).

Variation 4. B denies that all multiples of 27 are multiples of 9.

Variation 5. B denies that his answer is inconsistent with D9, since 3+6+8 = 18, while 3+7+8 = 19, so D9 actually supports his answer, not A’s.

I can sort of see how an EWV theorist would deny that EWV applies in variations 2 and 4, but in all the other cases, it seems to me that EWV implies, incorrectly, that A should give equal credence to 368 and 378 being the correct answer. But maybe that’s just because I haven’t understood EWV correctly. Anyone want to correct my understanding?

Posted by Brian Weatherson at 2:22 pm

2 Comments »

10 May, 2011

Philosophy Compass, Volume 6, Issue 5

Philosophy Compass



May 2011


Volume 6, Issue 5, Pages 300–373

Continental


Transcendental Arguments About Other Minds and Intersubjectivity (pages 300–311)
Matheson Russell and Jack Reynolds
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00394.x
Abstract, Full Article (HTML), PDF, References.

Epistemology


Bayesianism I: Introduction and Arguments in Favor (pages 312–320)
Kenny Easwaran
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00399.x
Abstract, Full Article (HTML), PDF, References.

Bayesianism II: Applications and Criticisms (pages 321–332)
Kenny Easwaran
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00398.x
Abstract, Full Article (HTML), PDF, References.

History of Philosophy


Reidian Metaethics: Part I (pages 333–340)
Terence Cuneo
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00393.x
Abstract, Full Article (HTML), PDF, References.

Reidian Metaethics: Part II (pages 341–349)
Terence Cuneo
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00392.x
Abstract, Full Article (HTML), PDF, References.

Logic & Language


Technical Modal Logic (pages 350–359)
Marcus Kracht
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00396.x
Abstract, Full Article (HTML), PDF, References

Metaphysics


The Open Future (pages 360–373)
Stephan Torre
Article first published online: 4 MAY 2011 | DOI: 10.1111/j.1747-9991.2011.00395.x
Abstract, Full Article (HTML), PDF, References

Posted by Brian Weatherson at 3:19 pm

No Comments »

Experimental Philosophy Month

If you go to this link and click on the large picture of a brain on the left-hand side, you can participate in a bunch of experiments through the Yale Experimental Philosophy Month program. If you don’t like the experiment that comes up, just go back to the first page and click on the brain again!

Posted by Brian Weatherson at 3:15 pm

No Comments »

25 April, 2011

Accuracy measures on conditional probabilities

I just proved a result about probability aggregation that I found rather perplexing. The proof, and even the result, is a little too complicated to put in HTML, so here it is in PDF.

What started me thinking about this was Sarah Moss’s excellent paper Scoring Rules and Epistemic Compromise, which is about aggregating probability functions. Here’s, roughly, the kind of puzzle that Moss is interested in. We’ve got two probability functions Pr1 and Pr2, and we want to find some probability function Pr that ‘aggregates’ them. Perhaps that’s because Pr1 is my credence function, Pr2 is yours, and we need to find some basis for making choices about collective action. Perhaps it is because the only thing you know about a certain subject matter is that one expert’s credence function is Pr1, another’s function is Pr2, and each expert seems equally likely, and you want to somehow defer equally to the two of them. (Or, perhaps, it is because you want to apply the Equal Weight View of disagreement. But don’t do that; the Equal Weight View is false.)

It seems there is an easy solution to this. For any X, let Pr(X) = (Pr1(X) + Pr2(X))/2. But as Barry Loewer noted many years ago, this solution has some costs. Let’s say we care about two propositions, p and q, and Boolean combinations of them. And say that p and q are probabilistically independent according to both Pr1 and Pr2. Then this linear mixture approach will not in general preserve independence. So there are some costs to it.
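A small numerical illustration of Loewer's point, as a sketch only: the particular probabilities (0.8 and 0.2) and the helper names are assumptions chosen for the example, not anything from Loewer or Moss.

```python
from itertools import product


def independent_joint(pr_p: float, pr_q: float) -> dict:
    """Joint distribution over (p, q) truth values on which p and q are independent."""
    return {
        (p, q): (pr_p if p else 1 - pr_p) * (pr_q if q else 1 - pr_q)
        for p, q in product((True, False), repeat=2)
    }


def mix(pr1: dict, pr2: dict) -> dict:
    """Linear mixture: Pr(X) = (Pr1(X) + Pr2(X)) / 2 for each cell."""
    return {w: (pr1[w] + pr2[w]) / 2 for w in pr1}


def independence_gap(pr: dict) -> float:
    """Pr(p and q) - Pr(p) * Pr(q); zero exactly when p and q are independent."""
    pr_p = pr[(True, True)] + pr[(True, False)]
    pr_q = pr[(True, True)] + pr[(False, True)]
    return pr[(True, True)] - pr_p * pr_q


pr1 = independent_joint(0.8, 0.8)  # hypothetical first credence function
pr2 = independent_joint(0.2, 0.2)  # hypothetical second credence function
pr = mix(pr1, pr2)

print(independence_gap(pr1), independence_gap(pr2), independence_gap(pr))
```

Here the mixture gives Pr(p) = Pr(q) = 0.5 but Pr(p ∧ q) = 0.34 rather than 0.25, so p and q are positively correlated in the mixture even though both inputs treat them as independent.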

One of the things Moss does is come up with an independent argument for using linear mixtures. Her argument turns on various accuracy measures, or what are sometimes called scoring rules, for probability functions. (Note that I’m leaving out a lot of the interesting stuff in Moss’s paper, which goes into a lot of detail about what happens when we get further away from the Brier scores that are the focus here. Anyone who is at all interested in these aggregation issues, which are pretty central to current epistemological debates, should read her paper.)

Thanks to Jim Joyce’s work there has been an upsurge in interest in philosophy in accuracy measures of probability functions. Here’s how the most commonly used scoring rule, called the Brier score, works. We start with a partition of possibility space, the partition that we’re currently interested in. In this case it would be {p ∧ q, p ∧ ¬q, ¬p ∧ q, ¬p ∧ ¬q}. For any proposition X, say V(X, w) is 1 if X is true in w, and 0 if X is false in w. Then we ‘score’ a function Pr in world w by summing (Pr(X) – V(X, w))², as X takes each value in the partition. This is a measure of how inaccurate Pr is in w: the higher this number is, the more inaccurate Pr is. Conversely, the lower it is, the more accurate it is. And accuracy is a good thing obviously, so this gives us a kind of goodness measure on probability functions.
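To make the scoring rule concrete, here is a minimal sketch of the Brier inaccuracy over the four-cell partition. The cell labels and the example credences are made up for illustration.

```python
# The four cells of the partition, written as plain strings.
CELLS = ["p&q", "p&~q", "~p&q", "~p&~q"]


def brier_inaccuracy(pr: dict, actual_cell: str) -> float:
    """Sum over cells X of (Pr(X) - V(X, w))**2, where w makes actual_cell true."""
    return sum(
        (pr[cell] - (1.0 if cell == actual_cell else 0.0)) ** 2
        for cell in CELLS
    )


# Hypothetical credence function over the partition.
pr = {"p&q": 0.4, "p&~q": 0.3, "~p&q": 0.2, "~p&~q": 0.1}

for cell in CELLS:
    print(cell, round(brier_inaccuracy(pr, cell), 4))
```

With these credences the score is lowest (0.5) in the world where p ∧ q is true, the cell given the most credence, and highest (1.1) where ¬p ∧ ¬q is true.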

Now in the aggregation problem we’re interested in here, we don’t know what world we’re in, so this isn’t directly relevant. But instead of looking at the actual inaccuracy measure of Pr, we can look at its expected inaccuracy measure. ‘Expected’ according to what, you might ask. Well, first we look at the expectation according to Pr1, and then the expectation according to Pr2, then we average them. That gives a fair way of scoring Pr according to each of Pr1 and Pr2.

One of the things Moss shows is that this average of expected inaccuracy is minimised when Pr is the linear average of Pr1 and Pr2. And she offers good reasons to think this isn’t a quirk of the scoring rule we’re using. It doesn’t matter, that is, that we’re using squares of distance between Pr(X) and V(X); any ‘credence-eliciting’ scoring rule will plausibly have the same result.
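The minimisation claim for the Brier score can be spot-checked numerically. The following is only a sketch under assumptions: the two input distributions are invented, worlds are indexed 0 to 3, and a random search is a sanity check rather than a proof.

```python
import random

WORLDS = range(4)  # indices for the four cells of the partition


def brier(pr, w):
    """Brier inaccuracy of pr in the world where cell w is true."""
    return sum((pr[x] - (1.0 if x == w else 0.0)) ** 2 for x in WORLDS)


def expected_inaccuracy(pr, by):
    """Expected Brier inaccuracy of pr, with the expectation taken according to `by`."""
    return sum(by[w] * brier(pr, w) for w in WORLDS)


def average_expected_inaccuracy(pr, pr1, pr2):
    """Average of the expectations according to pr1 and according to pr2."""
    return (expected_inaccuracy(pr, pr1) + expected_inaccuracy(pr, pr2)) / 2


def random_distribution(rng):
    """A random probability distribution over the four worlds."""
    weights = [rng.uniform(0.01, 1.0) for _ in WORLDS]
    total = sum(weights)
    return [x / total for x in weights]


pr1 = [0.64, 0.16, 0.16, 0.04]  # hypothetical first credence function
pr2 = [0.04, 0.16, 0.16, 0.64]  # hypothetical second credence function
mixture = [(a + b) / 2 for a, b in zip(pr1, pr2)]

rng = random.Random(0)
baseline = average_expected_inaccuracy(mixture, pr1, pr2)
# Count random candidates that score strictly better than the mixture.
beaten = sum(
    average_expected_inaccuracy(random_distribution(rng), pr1, pr2) < baseline - 1e-12
    for _ in range(10_000)
)
print(beaten)
```

The reason no candidate should do better is that averaging the two expectations is the same as taking a single expectation according to the mixture, and the Brier score's expected inaccuracy according to a distribution is minimised by that distribution itself.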

But I was worried that this didn’t really address the Loewer concern directly. The point of that concern was that linear mixtures get the conditional probabilities wrong. So we might want instead to measure the accuracy of Pr’s conditional probability assignments. Here’s how I thought we’d go about that.

Consider the four values Pr(p | q), Pr(p | ¬q), Pr(q | p), Pr(q | ¬p). In any world w, two of the four ‘conditions’ in these conditional probabilities will be met. Let’s say they are p and ¬q. Then the conditional inaccuracy of Pr in that world will be (Pr(q | p) – V(q, w))² + (Pr(p | ¬q) – V(p, w))². In other words, we apply the same formula as for the Brier score, but we use conditional rather than unconditional probabilities, and we just look at the conditions that are satisfied.
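Here is a minimal sketch of that conditional inaccuracy measure for a single world, on one reading of the definition just given. The representation of worlds as (p, q) truth-value pairs, the helper names, and the example credences are all assumptions made for illustration.

```python
def cond(pr: dict, target: int, given: int, given_value: bool) -> float:
    """Pr(target is true | given has given_value); index 0 is p, index 1 is q."""
    den = sum(v for w, v in pr.items() if w[given] == given_value)
    num = sum(v for w, v in pr.items() if w[given] == given_value and w[target])
    return num / den


def conditional_inaccuracy(pr: dict, world: tuple) -> float:
    """Squared errors of the two conditional credences whose conditions hold in world."""
    p_val, q_val = world
    # Condition on the actual truth value of p, and score the credence in q.
    error_q = (cond(pr, target=1, given=0, given_value=p_val) - float(q_val)) ** 2
    # Condition on the actual truth value of q, and score the credence in p.
    error_p = (cond(pr, target=0, given=1, given_value=q_val) - float(p_val)) ** 2
    return error_q + error_p


# Hypothetical joint credences, keyed by (p, q) truth values.
pr = {(True, True): 0.4, (True, False): 0.3, (False, True): 0.2, (False, False): 0.1}

# The world from the example in the text: p is true and q is false.
print(conditional_inaccuracy(pr, (True, False)))
```

In the world where p and ¬q hold, the two terms are (Pr(q | p) – 0)² with Pr(q | p) = 4/7, and (Pr(p | ¬q) – 1)² with Pr(p | ¬q) = 0.75.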

From then on, I thought, we could use Moss’s technique. We’ll look for the value of Pr that minimises the expected conditional inaccuracy, and call that the compromise, or aggregated, function. I guessed that this would be the function we got by taking the linear mixtures of the original conditional probabilities. That is, we would want to have Pr(p | q) = (Pr1(p | q) + Pr2(p | q))/2. I thought that, at least roughly, the same reasoning that implied that linear mixtures of unconditional probabilities minimised the average expected unconditional inaccuracy would mean that linear mixtures of conditional probabilities minimised the average expected conditional inaccuracy.

I was wrong.

It turns out that, at least in the case where p and q are probabilistically independent according to both Pr1 and Pr2, the function that does best according to this new rule is the same linear mixture as does best under the measures Moss looks at. This was extremely surprising to me. We start with a whole bunch of conditional probabilities. We need to aggregate them into a joint conditional probability distribution that satisfies various nice constraints. Notably, these are all constraints on the resultant conditional probabilities, and conditional probabilities are, at least relative to unconditional probabilities, fractions. Normally, one does not get nice results for ‘mixing’ fractions by simply averaging numerators and denominators. But that’s exactly what we do get here.
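This can be spot-checked numerically. The sketch below reads the conditional inaccuracy measure as described in this post, which may differ in detail from the setup in the PDF; the input probabilities and helper names are assumptions, and a random search over candidates is a sanity check, not a proof.

```python
import random
from itertools import product

WORLDS = list(product((True, False), repeat=2))  # each world is (p, q)


def independent_joint(pr_p, pr_q):
    """Joint distribution on which p and q are independent."""
    return {(p, q): (pr_p if p else 1 - pr_p) * (pr_q if q else 1 - pr_q)
            for p, q in WORLDS}


def cond(pr, target, given, given_value):
    """Pr(target is true | given has given_value); index 0 is p, index 1 is q."""
    den = sum(v for w, v in pr.items() if w[given] == given_value)
    num = sum(v for w, v in pr.items() if w[given] == given_value and w[target])
    return num / den


def conditional_inaccuracy(pr, world):
    """Squared errors of the two conditional credences whose conditions hold."""
    p_val, q_val = world
    return ((cond(pr, 1, 0, p_val) - float(q_val)) ** 2
            + (cond(pr, 0, 1, q_val) - float(p_val)) ** 2)


def expected(pr, by):
    """Expected conditional inaccuracy of pr according to `by`."""
    return sum(by[w] * conditional_inaccuracy(pr, w) for w in WORLDS)


def average_expected(pr, pr1, pr2):
    return (expected(pr, pr1) + expected(pr, pr2)) / 2


def random_joint(rng):
    """A random joint distribution with every world given positive probability."""
    weights = {w: rng.uniform(0.01, 1.0) for w in WORLDS}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


pr1 = independent_joint(0.8, 0.6)  # hypothetical first credence function
pr2 = independent_joint(0.3, 0.5)  # hypothetical second credence function
mixture = {w: (pr1[w] + pr2[w]) / 2 for w in WORLDS}

rng = random.Random(0)
baseline = average_expected(mixture, pr1, pr2)
# Count random candidates that score strictly better than the linear mixture.
beaten = sum(average_expected(random_joint(rng), pr1, pr2) < baseline - 1e-12
             for _ in range(5_000))
print(beaten)
```

The candidates are restricted to distributions that give every world positive probability, so that all four conditional probabilities are defined.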

I don’t have a very good sense of why this result holds. I sort of do understand why Moss’s results hold, I think, though perhaps not well enough to explain! Just why this result obtains is a bit of a mystery to me, but it does seem to hold. And I think it’s one more reason to think that the obvious answer to our original question is the right one; if you want to aggregate two probability functions, just average them.

Posted by Brian Weatherson at 4:23 pm

14 Comments »
