LOYER’S PARADOX
Consider trying to determine the probability of getting a hit when randomly selecting from two possible batters. Each has 100 at-bats; A has 26 hits and B has 36 hits.
In probability notation,
P(A gets a hit) = 26/100 = 0.260
P(B gets a hit) = 36/100 = 0.360
and the probability that a random selection between the batters produces a hit is
P(hit) = (1/2)(0.260) + (1/2)(0.360) = 0.310.
But suppose each batter’s performance against left-handed and right-handed pitchers is as follows.
          against           against
batter    left-handers      right-handers     overall
A         24/80 = 0.300     2/20 = 0.100      26/100 = 0.260
B         12/50 = 0.240     24/50 = 0.480     36/100 = 0.360
Now the probabilities that a random selection between the batters, conditioned on the throwing hand of the pitcher, produces a hit are
P(hit|left-hander) = (1/2)(0.300) + (1/2)(0.240) = 0.270
P(hit|right-hander) = (1/2)(0.100) + (1/2)(0.480) = 0.290
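The arithmetic above can be checked with a minimal Python sketch. The hit and at-bat counts come from the table; the names `records`, `average`, and `random_batter` are illustrative choices, not part of the original presentation.

```python
from fractions import Fraction

# (hits, at-bats) from the table, keyed by batter and pitcher's throwing hand.
records = {
    "A": {"left": (24, 80), "right": (2, 20)},
    "B": {"left": (12, 50), "right": (24, 50)},
}

def average(batter, hand=None):
    """Batting average of one batter, overall or against one type of pitcher."""
    if hand is None:
        hits = sum(h for h, _ in records[batter].values())
        at_bats = sum(n for _, n in records[batter].values())
    else:
        hits, at_bats = records[batter][hand]
    return Fraction(hits, at_bats)

def random_batter(hand=None):
    """Average the two batters with weight 1/2 each, as in the text."""
    return Fraction(1, 2) * average("A", hand) + Fraction(1, 2) * average("B", hand)

p_hit = random_batter()
p_hit_left = random_batter("left")
p_hit_right = random_batter("right")

print(float(p_hit), float(p_hit_left), float(p_hit_right))
```

Exact fractions are used so that the three values come out as 31/100, 27/100, and 29/100 without floating-point rounding.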
How can the unconditional probability of getting a hit be larger than both of the conditional probabilities? That is, how can the probability of getting a hit be 0.310 when the throwing hand of the opposing pitcher is not known but drop to 0.270 if the pitcher is known to be left-handed and drop to 0.290 if the pitcher is known to be right-handed?
RESOLUTION OF THE PARADOX
(1) If the throwing arm of the opposing pitcher is not known, or is not considered relevant, then P(hit) is properly 0.310.
(2) If the throwing arm of the opposing pitcher is known, and considered relevant, then P(hit) is properly either 0.270 (left-hander) or 0.290 (right-hander).
The probabilities in (1) and (2) are obtained under two different assumptions and consequently cannot be directly compared.
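One way to see that the two sets of numbers rest on different assumptions is sketched below. This is an illustration added here, not a calculation taken from the CHANCE article. It assumes a single sampling model in which a batter is chosen with probability 1/2 and then one of his 100 at-bats is chosen uniformly. Under that one model, the chance of facing a left-hander depends on which batter was chosen, so the conditional probabilities differ from 0.270 and 0.290, which weight the two batters equally regardless of the pitcher. The names `records`, `p_hand`, and `p_hit_and_hand` are hypothetical.

```python
from fractions import Fraction

# (hits, at-bats) from the table, keyed by batter and pitcher's throwing hand.
records = {
    "A": {"left": (24, 80), "right": (2, 20)},
    "B": {"left": (12, 50), "right": (24, 50)},
}
half = Fraction(1, 2)

def total_at_bats(batter):
    return sum(n for _, n in records[batter].values())

def p_hand(hand):
    """P(pitcher's hand) when a batter, then one of his at-bats, is picked at random."""
    return sum(half * Fraction(records[b][hand][1], total_at_bats(b)) for b in records)

def p_hit_and_hand(hand):
    """P(hit and pitcher's hand) under the same single model."""
    return sum(half * Fraction(records[b][hand][0], total_at_bats(b)) for b in records)

# Conditional probabilities within the single model.
single_left = p_hit_and_hand("left") / p_hand("left")     # 18/65, about 0.277
single_right = p_hit_and_hand("right") / p_hand("right")  # 13/35, about 0.371

# Law of total probability: recombining the two cases recovers P(hit) = 0.310.
recombined = p_hand("left") * single_left + p_hand("right") * single_right

print(float(single_left), float(single_right), float(recombined))
```

Within this single model the unconditional 0.310 lies between the two conditional values, as the law of total probability requires. The values 0.270 and 0.290 answer a different question, in which each batter is equally likely whatever the pitcher's throwing hand.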
This is an example of how pooling results from two fundamentally different data sets (performance against left-handed and against right-handed pitchers) to create a single data set (overall performance) can be artificial and give inappropriate results. It is not correct to say, for example, that A is a 0.260 hitter: he is either a 0.300 hitter or a 0.100 hitter, depending on whether he is facing a left-hander or a right-hander. His “overall” performance can be made to be anything between 0.100 and 0.300 by changing how many of his at-bats come against each type of pitcher.
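The dependence of A's pooled average on his mix of at-bats can be sketched as a weighted average. The function name `overall_average` and the alternative at-bat splits are hypothetical; only the 0.300 and 0.100 averages and the 80/20 split come from the table.

```python
from fractions import Fraction

AVG_VS_LEFT = Fraction(3, 10)   # A's average against left-handers (0.300)
AVG_VS_RIGHT = Fraction(1, 10)  # A's average against right-handers (0.100)

def overall_average(at_bats_vs_left, at_bats_vs_right):
    """Pooled average, assuming the two per-hand averages stay fixed."""
    total = at_bats_vs_left + at_bats_vs_right
    expected_hits = at_bats_vs_left * AVG_VS_LEFT + at_bats_vs_right * AVG_VS_RIGHT
    return expected_hits / total

# The actual split (80 vs. left, 20 vs. right) and some hypothetical splits.
for left, right in [(100, 0), (80, 20), (50, 50), (20, 80), (0, 100)]:
    print(left, right, float(overall_average(left, right)))
```

The actual 80/20 split reproduces 0.260, while the hypothetical splits move the pooled figure across the whole range from 0.100 to 0.300 without any change in how well A hits against either type of pitcher.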
Another example of this principle is the relationship between human weight and time to run a 100-yard dash. For 100 college students selected at random there will likely be a significant negative correlation between weight and time (i.e., the heavier students will have the shorter times). But when the males and females are considered separately, there will likely be a significant positive correlation between weight and time for each gender (i.e., the heavier students within each gender will have the longer times). Because males tend to be heavier than females and because males tend to be faster than females, pooling the two genders into a single data set gives artificial and inappropriate results.
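The sign reversal described above can be reproduced with a toy data set. The weights (in pounds) and times (in seconds) below are made up purely for illustration and are not real measurements. With only five points per group the sketch shows the signs of the correlations, not their statistical significance. The helper `pearson` is written out by hand so the example needs only the standard library.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data (NOT real measurements).
female_weight = [110, 120, 130, 140, 150]
female_time = [15.0, 15.3, 15.2, 15.6, 15.9]
male_weight = [150, 160, 170, 180, 190]
male_time = [12.0, 12.3, 12.2, 12.6, 12.9]

r_female = pearson(female_weight, female_time)  # positive within the group
r_male = pearson(male_weight, male_time)        # positive within the group
r_pooled = pearson(female_weight + male_weight,
                   female_time + male_time)     # negative once pooled

print(r_female, r_male, r_pooled)
```

Within each group the heavier runners are slower, yet the pooled correlation is negative because the heavier group is also the faster group.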
Loyer’s Paradox and its resolution are discussed at length in the article “Can the Probability of an Event Be Larger or Smaller Than Each of Its Component Conditional Probabilities?” by Loyer and Sprechini in CHANCE Vol. 24, No. 1 (Winter 2011), pages 44-53. CHANCE is a magazine of the American Statistical Association.
Note: Simpson’s Paradox also involves the pooling of data sets to produce a seeming inconsistency. The CHANCE article demonstrates that Loyer’s Paradox is conceptually different from Simpson’s Paradox, and that no set of data can exhibit both paradoxes simultaneously.