Distributivity of More Probable Than | Thoughts, Arguments and Rants

I’ve thought for a long time that the relation more probable than was not a linear order. That is, it is possible to have propositions A and B such that none of the following three claims hold.

A is more probable than B.
B is more probable than A.
A and B are equally probable.

This isn’t a particularly original idea of mine; it goes back at least as far as Keynes’s Treatise on Probability (which is where I got the idea from).

I’ve also thought for a long time that there was a nice way to model failures of linearity using sets of probability functions. Say there is some special set S of functions, each of which satisfies the probability calculus, and then define the relations considered above as follows.

A is more probable than B =_df For all Pr in S, Pr(A) > Pr(B).
B is more probable than A =_df For all Pr in S, Pr(B) > Pr(A).
A and B are equally probable =_df For all Pr in S, Pr(A) = Pr(B).
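
Here is a minimal sketch of these definitions, reading "equally probable" as Pr(A) = Pr(B) for all Pr in S. The two-member set `S`, the three worlds, and the names `pr`, `more_probable` and `equally_probable` are all made up for illustration. It shows a pair of propositions for which none of the three relations holds.

```python
from fractions import Fraction as F

# Hypothetical set S: two probability functions over worlds w1, w2, w3.
S = [
    {"w1": F(1, 2), "w2": F(1, 4), "w3": F(1, 4)},
    {"w1": F(1, 4), "w2": F(1, 2), "w3": F(1, 4)},
]

def pr(p, prop):
    """Probability of a proposition (a set of worlds) under one function p."""
    return sum(p[w] for w in prop)

def more_probable(a, b):
    """a is more probable than b iff every Pr in S ranks a strictly above b."""
    return all(pr(p, a) > pr(p, b) for p in S)

def equally_probable(a, b):
    """a and b are equally probable iff every Pr in S gives them the same value."""
    return all(pr(p, a) == pr(p, b) for p in S)

A, B, C = {"w1"}, {"w2"}, {"w1", "w2"}

# The two members of S disagree about A versus B, so no relation holds.
incomparable = not (
    more_probable(A, B) or more_probable(B, A) or equally_probable(A, B)
)
print(incomparable)         # True
print(more_probable(C, A))  # True: both functions rank C above A
```

Comparisons on which every member of S agrees, like C versus A, still come out determinate, so the ordering is partial rather than empty.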

This isn’t particularly new either; the idea goes back at least to the 1960s, perhaps earlier. I did have one idea to contribute to this, namely to suggest that the sets of probability functions approach to understanding comparative probability claims was a good way of modelling Keynes’s somewhat hard to model ideas. But now I’m starting to worry that this was a mistake, or at least undermotivated in a key respect.

Note that on the sets of probability functions approach, we can identify probabilities with functions from S to [0, 1]: the probability of A is the function that takes each Pr in S to Pr(A). Call such functions X, Y, etc. Then there is a natural ordering on the functions, namely X >= Y iff for all Pr in S, X(Pr) >= Y(Pr). This ordering will be reflexive, transitive and antisymmetric, but not total.

What I hadn’t thought about until today was that there is a natural meet and join on probabilities that we can define. So the meet of X and Y will be the function Z such that Z(Pr) is the minimum of X(Pr), Y(Pr), and the join of X and Y will be the function Z such that Z(Pr) is the maximum of X(Pr), Y(Pr). This isn’t too surprising – it might be a little sad if probabilities didn’t form a lattice.
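
A small sketch of the pointwise operations, taking the meet to be the greatest lower bound under the ordering above (the pointwise minimum) and the join to be the least upper bound (the pointwise maximum). It represents a probability as a tuple with one value per member of a hypothetical three-member S; the names `geq`, `meet` and `join` are illustrative only.

```python
# A "probability" is a tuple of values, one per Pr in a hypothetical
# three-member set S (index i stands for the i-th function in S).

def geq(x, y):
    """X >= Y iff X(Pr) >= Y(Pr) for every Pr in S."""
    return all(a >= b for a, b in zip(x, y))

def meet(x, y):
    """Greatest lower bound: the pointwise minimum."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Least upper bound: the pointwise maximum."""
    return tuple(max(a, b) for a, b in zip(x, y))

X = (0.2, 0.7, 0.5)
Y = (0.6, 0.3, 0.5)

comparable = geq(X, Y) or geq(Y, X)
lower = meet(X, Y)
upper = join(X, Y)

print(comparable)  # False: X and Y are incomparable
print(lower)       # (0.2, 0.3, 0.5)
print(upper)       # (0.6, 0.7, 0.5)
```

Even though X and Y are incomparable, their meet sits below both and their join sits above both, which is what makes the structure a lattice.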

What’s surprising is that given this definition, they form a distributive lattice. That is, for any X, Y, Z, if we write XMY for the meet of X and Y, and XJY for the join of X and Y, we have (XMY)JZ = (XJZ)M(YJZ). (Or at least I think we do; I might just be making an error here.) That’s surprising because there’s no obvious reason, once you’ve given up the idea that probabilities form a linear order, to believe in distributivity.
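
One way to see why the identity should hold: at each Pr the values are real numbers, which are linearly ordered, and for reals max(min(x, y), z) = min(max(x, z), max(y, z)). Since meet and join are computed coordinate by coordinate, the identity carries over. The sketch below is a brute-force sanity check rather than a proof, assuming a hypothetical two-member S and a coarse grid of values; `GRID`, `ELEMENTS` and `distributes` are illustrative names.

```python
from itertools import product

def meet(x, y):
    """Pointwise minimum."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Pointwise maximum."""
    return tuple(max(a, b) for a, b in zip(x, y))

# Hypothetical setup: S has two members, and each X(Pr) is drawn from a
# five-point grid, giving 25 candidate "probabilities".
GRID = (0.0, 0.25, 0.5, 0.75, 1.0)
ELEMENTS = list(product(GRID, repeat=2))

def distributes(x, y, z):
    """Check (X M Y) J Z == (X J Z) M (Y J Z) for one triple."""
    return join(meet(x, y), z) == meet(join(x, z), join(y, z))

# Exhaustively check all 25**3 triples.
all_distribute = all(
    distributes(x, y, z) for x, y, z in product(ELEMENTS, repeat=3)
)
print(all_distribute)  # True
```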

Open question: What other interesting lattice properties does _more probable than_ have?

I know that it isn’t a Boolean lattice. There’s no way to define a negation relation N on probabilities such that (a) X > Y iff NY > NX and (b) XMNX is always the minimal element. I think that’s because the only way to define N to satisfy condition (a) is if NX(Pr) = 1 – X(Pr) for all Pr in S, and that relation doesn’t guarantee that XMNX is minimal. But as for other properties, I’m not really sure.
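
A concrete counterexample, sketched under the same assumptions as before: a hypothetical two-member S, probabilities as tuples, meet as pointwise minimum. The names `neg` and `BOTTOM` are illustrative. With NX(Pr) = 1 – X(Pr), the meet of X and NX can sit well above the minimal element.

```python
def geq(x, y):
    """X >= Y iff X(Pr) >= Y(Pr) for every Pr in S."""
    return all(a >= b for a, b in zip(x, y))

def meet(x, y):
    """Pointwise minimum."""
    return tuple(min(a, b) for a, b in zip(x, y))

def neg(x):
    """Candidate negation: NX(Pr) = 1 - X(Pr) for each Pr in S."""
    return tuple(1 - a for a in x)

BOTTOM = (0.0, 0.0)  # minimal element for a hypothetical two-member S

X = (0.5, 0.25)
Y = (0.25, 0.125)

# N reverses the ordering, as condition (a) requires.
order_reversed = geq(X, Y) and geq(neg(Y), neg(X))

# But the meet of X and NX is not the minimal element, so (b) fails.
x_meet_nx = meet(X, neg(X))

print(order_reversed)       # True
print(x_meet_nx)            # (0.5, 0.25)
print(x_meet_nx == BOTTOM)  # False
```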

When I was working on _truer_, I spent a lot of time worrying about whether it generated a distributive lattice. Eventually I came up with an argument that it did, but it was very speculative. (Not that it was a particularly original argument; everything about lattice theory in “that paper”:http://brian.weatherson.org/ttt.pdf I borrowed from Greg Restall’s “Introduction to Substructural Logics”:http://www.amazon.com/An-Introduction-to-Substructural-Logics/dp/B000Q66SDG/ref=sr_1_3?ie=UTF8&s=books&qid=1238098686&sr=8-3, which seems to be now out in Kindle version.) It feels bad to simply assume that _more probable than_ generates a distributive lattice simply because the easiest way to model it implies distributivity.