The Pro Rata Rule and Mellor's Justification of Bayesian Conditionalization

In this post, I summarize one of D. H. Mellor's arguments for the rationality of Bayesian conditionalization [1]. Suppose we assign equal probabilities to the outcomes of a die roll (where 'Pr(1)' is read as 'the probability of rolling a 1'):

(1) Pr(1) = 1/6; Pr(2) = 1/6; Pr(3) = 1/6; Pr(4) = 1/6; Pr(5) = 1/6; Pr(6) = 1/6

The values sum to 1, respecting normalizability constraints. Now, suppose we want to know the probability of the following claim:

(A) the result of a throw is an odd number. 

Applying the special disjunction rule, we have:

(2) Pr(A) = Pr(1) + Pr(3) + Pr(5) = 1/6 + 1/6 + 1/6 = .5
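If it helps, here's a minimal Python sketch of the calculation so far (the variable names are mine, not Mellor's; Fraction keeps the arithmetic exact):

from fractions import Fraction

# Uniform prior over the six outcomes of a die roll
prior = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

# Special disjunction rule: the probability of a disjunction of
# mutually exclusive outcomes is the sum of their probabilities.
odd = {1, 3, 5}
pr_A = sum(prior[outcome] for outcome in odd)
print(pr_A)  # 1/2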

It's just as likely that we'll roll an odd number as an even one. The six possible outcomes constitute a sample space, {1, 2, 3, 4, 5, 6} -- call it 'Ω*'. We can also think of {1, 3, 5} as constituting a different sample space, Ω. The probability of rolling an odd number, given our knowledge of the original sample space, is .5. Call this the prior probability of A.

We might wonder, however, what the probabilities of each sample point in Ω are. That is, how likely are we to roll a 1, given that the outcome is an odd number? Let Prₐ(1), Prₐ(3), and Prₐ(5) represent the probabilities of each sample point in Ω. Whatever these values are, they cannot simply keep their original values from Ω*, for then they would sum to .5 rather than 1, violating normalizability. Intuitively, however, the new probabilities of 1, 3, and 5 should be related to their old probabilities somehow; we can't just assign new, arbitrary values. What we need is a way of updating that preserves the relative values of 1, 3, and 5.

An intuitive way to proceed is to apply what Mellor calls the pro rata rule. The idea is that the new values of 1, 3, and 5 should be proportional to their original values in Ω*. In other words, we divide the probability of each member of the new sample space by the probability of the space as a whole (here, the probability of rolling an odd number, .5). This is just how the values in {1, 2, 3, 4, 5, 6} were fixed: each outcome gets its share of a total probability of 1. This gives us the following:

(3) Prₐ(1) = Pr(1)/.5 = (1/6)/.5
    Prₐ(3) = Pr(3)/.5 = (1/6)/.5
    Prₐ(5) = Pr(5)/.5 = (1/6)/.5

Updating in this way yields the following results:

(4) Prₐ(1) = 1/3; Prₐ(3) = 1/3; Prₐ(5) = 1/3
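Continuing the sketch above, the pro rata rule amounts to a one-line rescaling:

# Pro rata rule: divide each member of the new sample space
# by the probability of that space as a whole.
posterior = {outcome: prior[outcome] / pr_A for outcome in odd}
print(posterior)                # {1: Fraction(1, 3), 3: Fraction(1, 3), 5: Fraction(1, 3)}
print(sum(posterior.values()))  # 1 -- normalizability is restored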

This seems like the right way to update probabilities, right? The relative values are preserved, and they sum to 1. Now get this: if you think this is the correct way of updating probabilities, then you ought to think Bayesian conditionalization prescribes the right way of updating, since it involves exactly the same procedure! Neat! Consider Bayes's theorem (the first line is just the ratio definition of conditional probability):

Pr(B|A) = Pr(A&B)/Pr(A)

= Pr(B)Pr(A|B)/Pr(A)

Let 'B' = '1 is rolled', and let 'A' = 'either 1, 3, or 5 is rolled'. Given that we have no reason for thinking that 1 is more likely to be rolled than any of the other five possibilities, we should believe that Pr(B) = 1/6, Pr(A) = .5, and Pr(A|B) = 1 (if a 1 is rolled, the outcome is certainly odd). Hence, according to Bayes's theorem:

Pr(B|A) = (1/6 × 1)/.5

= (1/6)/.5

= 1/3

This involves the same procedure and delivers the same result as the pro rata rule. The same holds for Prₐ(3) and Prₐ(5). Neat! If this increases your confidence in Bayesian conditionalization as a rule for updating one's beliefs/credences...well, neat!
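To check the agreement across all three sample points at once, here's a short sketch that updates via Bayes's theorem and reproduces the pro rata values (the conditionalize helper is my own illustration, not Mellor's notation):

def conditionalize(prior, evidence):
    """Update a distribution on evidence via Bayes's theorem.

    For each outcome b in the evidence set A, Pr(A|B) = 1 (and it is 0
    otherwise), so Pr(B|A) = Pr(B) * 1 / Pr(A) for outcomes in A.
    """
    pr_evidence = sum(prior[b] for b in evidence)
    return {b: prior[b] / pr_evidence for b in evidence}

print(conditionalize(prior, {1, 3, 5}))
# {1: Fraction(1, 3), 3: Fraction(1, 3), 5: Fraction(1, 3)} -- same as the pro rata rule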

The word of the day is "neat."
_____________________________________________________________________
Footnotes:

[1] See Mellor's helpful introduction to probability theory: D. H. Mellor, Probability: A Philosophical Introduction (London: Routledge, 2005).
