Updating Vague Probabilities

I’ve been thinking a bit recently about the following position, and I can’t see any obvious reason why it’s _incoherent_. So I’m wondering whether (a) it might be true, or (b) I’m missing something obvious about why it fails. Feedback on it is more than welcome.

Many Bayesians model rational agents using the following two principles

* At any moment, the agent’s credal states are represented by a probability function.
* From moment to moment, the agent’s credal states are updated by conditionalisation on the evidence received.

Of course these are idealisations, and many other people have been interested in relaxing them. One relaxation that has got a lot of attention in recent years is the idea that we should represent agents not by single probability functions, but by _sets_ of probability functions. We then say that the agent regards q as more probable than r iff for all probability functions Pr in the set, Pr(q) > Pr(r). This allows that the agent need not hold that q is more probable than r, or r more probable than q, or that q and r are equally probable, for arbitrary q and r. And that’s good because it isn’t a rationality requirement that agents make pairwise probability judgments about all pairs of propositions.
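Here’s a toy sketch, in Python, of how this set-based picture might be coded up. Everything in it (the four-world space, the particular weights, the helper names) is my illustration, not anything from the literature:

bc.. from itertools import product

# A probability function over a finite set of worlds is a dict from worlds
# to non-negative weights summing to 1. A proposition is a set of worlds.

def prob(pr, a):
    # Pr(A): the total weight of the worlds at which A is true.
    return sum(w for world, w in pr.items() if world in a)

def more_probable(s, a, b):
    # The agent regards a as more probable than b iff every Pr in the
    # credal set s has Pr(a) > Pr(b).
    return all(prob(pr, a) > prob(pr, b) for pr in s)

# Worlds encode the truth values of (q, r).
worlds = list(product([True, False], repeat=2))
q = {w for w in worlds if w[0]}
r = {w for w in worlds if w[1]}

# A two-member credal set whose members disagree about q versus r.
S = [
    {(True, True): 0.4, (True, False): 0.3, (False, True): 0.1, (False, False): 0.2},
    {(True, True): 0.1, (True, False): 0.2, (False, True): 0.4, (False, False): 0.3},
]

print(more_probable(S, q, r))  # False: the first function ranks q above r
print(more_probable(S, r, q))  # False: the second ranks r above q

p. Neither comparison holds, and the functions don’t agree that q and r are equally probable either, which is just the sort of gap in pairwise judgments the set representation is meant to allow.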

Now what effect does this relaxation have on the principle about updating? The standard story (one that I’ve appealed to in the past) is that the ideal agent updates by conditionalising all the functions in the set. So if we write PrE for the function such that PrE(x) = Pr(x | E), and S is the set of probability functions representing the agent’s credal state before the update, then {PrE: Pr is in S} is the set we get after updating.
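In the toy code above, the standard story is short: conditionalise each member of the set, assuming as usual that every member gives the evidence positive probability. Again the helper names are mine:

bc.. def conditionalise(pr, e):
    # Pr_E(x) = Pr(x | E): zero out the ~E worlds and renormalise.
    # Assumes prob(pr, e) > 0.
    pe = prob(pr, e)
    return {world: (w / pe if world in e else 0.0) for world, w in pr.items()}

def update_set(s, e):
    # The standard story: conditionalise every function in the set.
    return [conditionalise(pr, e) for pr in s]

p. On this story nothing is ever culled; the set after updating has exactly as many members as the set before.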

Here’s the option I now think should be taken seriously. Sometimes getting evidence E is a reason for the agent to have more determinate probabilistic opinions than she previously had. (I’m using ‘determinate’ in a sense such that the agent represented by a single probability function has maximally determinate probabilistic opinions, and the agent represented by the set of all probability functions has maximally indeterminate opinions.) In particular, it can be a reason for ‘culling’ the set down a little, as well as conditionalising on what remains. So we imagine that updating on E involves a two-step process.

* Replace S with U(S, E)
* Update U(S, E) to {PrE: Pr is in U(S, E)}

In this story, U is a function that takes two inputs, a set of probability functions and a piece of evidence, and returns a set of probability functions that is a subset of the original set. (We might want to weaken that last constraint for some purposes.) Intuitively, it tells the agent that she needn’t have worried that certain probability functions were the ones she should be using. We can put forward formal proposals for U, such as the following

bq. Pr is in U(S, E) iff Pr is in S and there is no Pr* in S such that Pr*(E) > 2Pr(E)

That’s just an illustration, but it’s one kind of thing I have in mind. (I’m particularly interested in theories where U is only knowable a posteriori, so it isn’t specifiable by an abstract rule like this one, which isn’t particularly responsive to empirical evidence. So don’t take that example too seriously.) The question is, what could we say against the coherence of such an updating policy?
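Still, just to fix ideas, here is the two-step story with that illustrative U plugged into the toy code from above:

bc.. def cull(s, e):
    # The illustrative U(S, E): keep Pr iff no Pr* in S has Pr*(E) > 2 * Pr(E),
    # i.e. iff 2 * Pr(E) is at least the largest probability anyone in S gives E.
    best = max(prob(pr, e) for pr in s)
    return [pr for pr in s if 2 * prob(pr, e) >= best]

def two_step_update(s, e):
    # Step one: cull the set with U. Step two: conditionalise what remains.
    return update_set(cull(s, e), e)

print(len(two_step_update(S, q)))  # 1: the second function gives q probability
                                   # 0.3, and 0.7 > 2 * 0.3, so it gets culled

p. So on this rule, learning q doesn’t just shift each function’s opinions; it also throws out the function that found q too surprising.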

One thing we certainly can’t say is that it is vulnerable to a Dutch Book. As long as U(S, E) is always a subset of S, it is easy to prove that there is no sequence of bets such that the agent regards each bet as strictly positive when it is offered and such that the sequence ends in sure loss. In fact, as long as U(S, E) overlaps S, this is easy to show. Perhaps there is some way in which such an agent turns down a sure gain, though I can’t myself see such an argument.

In any case, the original Dutch Book argument for conditionalisation always seemed fairly weak to me. As Ramsey pointed out, the point of the Dutch Book argument was to dramatise an underlying inconsistency in credal states, and there’s nothing inconsistent about adopting any old updating rule you like. (At the Dutch Book symposium last August this point was well made by Colin Howson.) So the threshold for endorsing a new updating rule might be fairly low.

It might be that the particular version of U proposed above is non-commutative: since which functions get culled depends on the set at the time of updating, updating on E and then F might leave a different set than updating on F and then E. Even if that’s true, I’m not 100% sure it’s a problem, and in any case I’m sure there are other versions of U that are commutative.

In the absence of better arguments, I’m inclined to think that this updating proposal is perfectly defensible. Below the fold I’ll say a little about why this is philosophically interesting because of its connection to externalist epistemologies and to dogmatism.

On the standard Bayesian model, the ideal agent assigns probability x to p after receiving evidence E iff she assigns probability x to p given E a priori. That is, she knows a priori which credences are justified by which evidence. But it isn’t obvious, to say the least, that these relations are always knowable a priori. Arguably, getting evidence E teaches us, among other things, what is justified by evidence E. At least there needs to be an argument that this isn’t the case, and the Bayesian literature isn’t exactly overflowing with such arguments.

The proposal above, while still obviously an idealisation, suggests a model for a theory of learning on which evidential relations are sometimes only learned a posteriori. The idea is that a priori, the set S is the set of all probability functions. After receiving evidence E, we have reason to be less uncertain in Keynes’s sense. So we eliminate many functions, and the resulting credences we assign to propositions are supported by E, as is the knowledge that E supports just those credences.

That’s the first, and I think biggest, reason for taking this model seriously. The other concerns an objection to kinds of dogmatism. Let E be the evidence I actually have, and p something I know on this basis that is not entailed by E. (A wide variety of people, from sceptics to Williamson, believe there is no such p, but this is a very counterintuitive position that I’d like to avoid.) Consider now the material implication _If E then p_, which can be written as _Not E or p_. The dogmatist, as I’m using that term, thinks we can know p on the basis of E even though we didn’t know this conditional a priori.

Now there’s a Bayesian objection to this kind of dogmatism. The objection surfaces in John Hawthorne’s “Deeply Contingent A Priori Knowledge” and is directly connected to dogmatism in Roger White’s recent "Problems for Dogmatism":http://philosophy.fas.nyu.edu/docs/IO/1180/probsfordog.pdf. The point is that for all q and r, and all probability functions Pr such that Pr(q) < 1 and Pr(q and ~r) > 0, the following is true.

bq. Pr(~q or r | q) < Pr(~q or r)
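To see why, it may help to spell out the algebra (the shorthand here is mine): write a = Pr(q and r) and b = Pr(q). Then

bq. Pr(~q or r | q) = Pr(r | q) = a/b, while Pr(~q or r) = Pr(~q) + Pr(q and r) = (1 - b) + a

And a/b < (1 - b) + a just in case a(1 - b) < b(1 - b), which holds exactly when b < 1 and a < b, that is, exactly when the proviso above is met.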

So in particular, given that a priori the agent isn’t certain of E and leaves open that E is true while p is false, we can prove that

bq. Pr(~E or p | E) < Pr(~E or p)

Now with one extra premise, say that evidence that decreases the probability of a proposition can never be the basis for knowing that proposition, we conclude that if we didn’t know the conditional a priori, we don’t know it now on the basis of E. The theory of updating I’m sketching here (though not the particular proposal about U above) provides a way out of this argument. Here’s one way it might go. (I’d prefer an ‘interest-relative’ version of this, but I’ll leave that out because it complicates the argument.)

* It is a necessary condition for an agent to know p on the basis of E that the ideal agent with evidence E assigns a credence greater than some threshold t to p
* An agent represented by a set of probability functions S assigns a credence greater than some threshold t to p iff for all Pr in S, Pr(p) > t

Now it is possible, given the right choice of U, that for some Pr in S, Pr(~E or p) < t, but for all Pr in U(S, E), Pr(~E or p | E) > t. That’s to say that evidence E does suffice for the ideal agent to have a credence in ~E or p greater than the threshold, even though her credence in ~E or p was not greater than the threshold a priori. Of course, it wasn’t lower than the threshold or equal to it before getting the evidence either; it was simply incomparable to the threshold. So as well as having a way to model a posteriori epistemology, we have a way to model dogmatist epistemology. Those seem like philosophical reasons to be interested in this approach to updating.
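To close, here is the toy code from above put to work on this very case, with numbers chosen purely for illustration. Worlds now encode the truth values of (E, p), and the threshold is t = 0.9:

bc.. t = 0.9
worlds2 = list(product([True, False], repeat=2))  # truth values of (E, p)
E = {w for w in worlds2 if w[0]}
not_E_or_p = {w for w in worlds2 if (not w[0]) or w[1]}

S2 = [
    # Confident in E, and in p given E.
    {(True, True): 0.95, (True, False): 0.01, (False, True): 0.02, (False, False): 0.02},
    # Doubtful of E, and of p given E.
    {(True, True): 0.10, (True, False): 0.15, (False, True): 0.25, (False, False): 0.50},
]

# A priori the agent's credence in ~E or p is not above the threshold,
# since the second function puts it at only 0.85...
print([prob(pr, not_E_or_p) for pr in S2])  # [0.99, 0.85]

# ...but that function gives E probability 0.25, less than half of 0.96, so
# it is culled, and every surviving posterior clears the threshold.
print([prob(pr, not_E_or_p) for pr in two_step_update(S2, E)])  # [0.9895...]

p. Note that the surviving function’s credence in ~E or p does drop a little under conditionalisation (from 0.99 to roughly 0.9896), just as the inequality above requires; all the work is done by the culling step, which removes the function on which ~E or p sat below the threshold.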