Mixtures of Conditional Probability Functions

It’s well known that it’s easy to ‘mix’ two unconditional probability functions and produce a third unconditional probability function. So if x ∈ [0, 1], and f1 and f2 are both unconditional probability functions, and for any proposition p in the domain of both f1 and f2, f3(p) = xf1(p) + (1-x)f2(p), then f3 will also be an unconditional probability function. (This is really immediate from the axioms for unconditional probability.) I thought the same kind of thing would work for conditional probability, but I can’t figure out how to do it.
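As a quick sanity check of the unconditional case, here is a minimal Python sketch. The three-world space and the particular weights are made up for illustration and are not part of the original problem; the helper names (`make_prob`, `mix`, `satisfies_axioms`) are likewise hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

WORLDS = ("w1", "w2", "w3")  # hypothetical finite space, for illustration only


def make_prob(weights):
    """Turn world weights into a function from events (sets of worlds) to probabilities."""
    return lambda event: sum(weights[w] for w in event)


def mix(f1, f2, x):
    """Pointwise mixture: f3(p) = x*f1(p) + (1-x)*f2(p)."""
    return lambda event: x * f1(event) + (1 - x) * f2(event)


def all_events(worlds):
    """Every subset of the set of worlds."""
    return [frozenset(c) for r in range(len(worlds) + 1)
            for c in combinations(worlds, r)]


def satisfies_axioms(f, worlds):
    """Check non-negativity, normalisation, and finite additivity by brute force."""
    evs = all_events(worlds)
    nonneg = all(f(e) >= 0 for e in evs)
    normalised = f(frozenset(worlds)) == 1
    additive = all(f(a | b) == f(a) + f(b)
                   for a in evs for b in evs if not (a & b))
    return nonneg and normalised and additive


f1 = make_prob({"w1": F(1, 2), "w2": F(1, 4), "w3": F(1, 4)})
f2 = make_prob({"w1": F(1, 10), "w2": F(3, 10), "w3": F(3, 5)})
f3 = mix(f1, f2, F(1, 2))

print(satisfies_axioms(f3, WORLDS))
print(f3(frozenset({"w1"})))
```

Exact fractions are used so that the additivity check is not disturbed by floating-point rounding.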

It’s certainly not true that if f1 and f2 are both conditional probability functions, then the function f3 defined by f3(p|q) = xf1(p|q) + (1-x)f2(p|q) will be a conditional probability function. Here’s a counterexample.

  • f1(A | BC) = 0.3
  • f1(B | C) = 0.4
  • f1(AB | C) = 0.12 (a consequence of previous two posits)
  • f2(A | BC) = 0.5
  • f2(B | C) = 0.6
  • f2(AB | C) = 0.3 (again a consequence)
  • x = 0.5

If we just apply the above formula, we get this:

  • f3(A | BC) = 0.4
  • f3(B | C) = 0.5
  • f3(AB | C) = 0.21 (inconsistent with previous two lines, if f3 is a probability function, since f3(A | BC) × f3(B | C) = 0.4 × 0.5 = 0.2)
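The arithmetic in the counterexample can be checked with a short Python sketch. The dictionary keys are just labels for the three conditional probabilities in the lists above, and the helper name `mix` is hypothetical; nothing here goes beyond the numbers already given.

```python
from fractions import Fraction as F


def mix(a, b, x):
    """Pointwise mixture of two values with weight x on the first."""
    return x * a + (1 - x) * b


# Values posited for f1 and f2; the AB|C entries follow from the product rule.
f1 = {"A|BC": F(3, 10), "B|C": F(2, 5)}
f1["AB|C"] = f1["A|BC"] * f1["B|C"]  # 0.12

f2 = {"A|BC": F(1, 2), "B|C": F(3, 5)}
f2["AB|C"] = f2["A|BC"] * f2["B|C"]  # 0.3

x = F(1, 2)

# Mix each conditional probability separately, as the naive formula says.
f3 = {key: mix(f1[key], f2[key], x) for key in f1}

# What the product rule would require of f3(AB | C).
required = f3["A|BC"] * f3["B|C"]

print(f3["AB|C"], required, f3["AB|C"] - required)
```

The mixed value of f3(AB | C) exceeds the value the product rule requires by 0.01, which is the inconsistency noted above.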

One natural move is to say that when f1(q) = f2(q) = 1, then f3(p|q) = xf1(p|q) + (1-x)f2(p|q). That will deliver something that is a conditional probability function as far as it goes, but it won’t tell us what f3(p|q) is when f1(q) = f2(q) = 0. And I can’t figure out a sensible way to handle that case that doesn’t run into a version of the inconsistency I just mentioned.

It feels like this is a simple problem that should have a simple solution, but I’m not sure just what it is. There’s a lot of information about mixing probability functions in this paper by David Jehle and Branden Fitelson, but it doesn’t, as far as I can see, touch on just this issue. Any suggestions would be appreciated!

2 Replies to “Mixtures of Conditional Probability Functions”

  1. A quick thought: the definition of f3 gives a convex mixture of f1 and f2, but, owing (perhaps) to the nonmonotone character of the conditional probability function in its second coordinate, nothing forces the values that f1 and f2 yield via the chain rule to lie in the convex hull that f3 defines.

    Notice that f3 will work just like the marginal case when the conditioning event, q, is fixed; the problem is that in one pair of cases we are conditioning on C, and in another pair on the joint BC.

    Not sure of what you want to do with the construction, but making sure convexity is satisfied might do the trick. This may mean that admissible values will be intervals, however. Another idea, again depending on what you’re after, is to restrict the conditional probability functions to the class of monotone functions (positive or negative), so that you can get a handle at least on which direction you’ll always go in when going from C to the joint BC.

  2. It depends on what conditions you impose on conditional probability functions, and how you expect the mixture to behave. If you are talking about lexicographic functions (so that each one is given by a sequence of unconditional probability functions, and F(p|q) is defined as f(p and q)/f(q) for the first f in the sequence where f(q) is non-zero) then one natural way to do it would be to just mix the corresponding functions in the sequence. But this will have some strange effects on probabilities conditional on q, if q has a non-zero value somewhere earlier in one sequence than in the other. (I think the conditional probability in the mixture will just be equal to the conditional probability in the one that gives q its first non-zero value, and it will be a mixture of the conditional probabilities whenever the first non-zero value occurs at corresponding points in the sequence.)

    Joe Halpern shows how to translate lexicographic probabilities and Popper functions into one another, though I think the translation might be non-unique in one direction, depending on what structural assumptions you make.
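The lexicographic suggestion in the second reply can be sketched in Python as follows. This is only an illustration under assumptions: the three-world space, the two-level sequences, and the helper names `cond` and `mix_sequences` are all made up here, and both sequences are assumed to have the same length.

```python
from fractions import Fraction as F


def cond(seq, p, q):
    """F(p|q) = f(p and q)/f(q) for the first f in seq with f(q) non-zero.

    Returns None if no level gives q non-zero probability.
    """
    for weights in seq:
        fq = sum(weights[w] for w in q)
        if fq != 0:
            return sum(weights[w] for w in p & q) / fq
    return None


def mix_sequences(seq1, seq2, x):
    """Mix the corresponding unconditional functions, level by level."""
    return [{w: x * a[w] + (1 - x) * b[w] for w in a}
            for a, b in zip(seq1, seq2)]


# Hypothetical sequences: seq1 gives q probability 0 at its first level,
# while seq2 gives q positive probability at its first level.
seq1 = [{"w1": F(1), "w2": F(0), "w3": F(0)},
        {"w1": F(0), "w2": F(1, 2), "w3": F(1, 2)}]
seq2 = [{"w1": F(1, 2), "w2": F(3, 8), "w3": F(1, 8)},
        {"w1": F(0), "w2": F(0), "w3": F(1)}]

q = {"w2", "w3"}
p = {"w2"}

seq3 = mix_sequences(seq1, seq2, F(1, 2))

print(cond(seq1, p, q), cond(seq2, p, q), cond(seq3, p, q))
```

In this example the mixture's probability of p conditional on q coincides with that of the second sequence, the one that gives q its first non-zero value, rather than lying between the two inputs. That matches the "strange effect" the reply describes.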
