How big is the biggest apple you could buy from your favorite supermarket? Surprisingly enough, you can give a reasonable answer just by buying a bag of apples.
In the example below, I weighed 8 apples and created a histogram with 10 bins in the range 100 – 150 g using CERN ROOT.
TH1D *h = new TH1D("h", "h", 10, 100, 150);
h->Fill(108);
h->Fill(120);
h->Fill(124, 3);
h->Fill(126);
h->Fill(127);
h->Fill(129);
h->Fill(130);
h->Fill(131);
h->Fill(147);
h->Draw();
Then, I fitted the distribution with a Gaussian function:
 EXT PARAMETER                                   STEP         FIRST
 NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
  1  Constant     2.69517e+00   1.33544e+00   5.74579e-04  -1.73450e-04
  2  Mean         1.26307e+02   6.92056e+00   3.74200e-03   5.77260e-05
  3  Sigma        1.40348e+01   6.63670e+00   8.65874e-05   2.66500e-04
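As a quick sanity check on the fitted Mean and Sigma, the same numbers can be estimated directly from the raw weights, without any binning. The sketch below uses plain C++ and no ROOT. The names Entry, appleEntries, sampleMean and sampleStdDev are made up for this illustration, and it assumes that Fill(124, 3) stands for three apples of 124 g each.

```cpp
#include <cmath>
#include <vector>

// One entry per Fill call in the ROOT snippet: weight in grams and how many
// times it was filled. Assumption: Fill(124, 3) means three apples of 124 g.
struct Entry {
    double grams;
    double count;
};

std::vector<Entry> appleEntries() {
    return {{108.0, 1.0}, {120.0, 1.0}, {124.0, 3.0}, {126.0, 1.0}, {127.0, 1.0},
            {129.0, 1.0}, {130.0, 1.0}, {131.0, 1.0}, {147.0, 1.0}};
}

// Count-weighted arithmetic mean of the weights.
double sampleMean(const std::vector<Entry>& data) {
    double n = 0.0;
    double sum = 0.0;
    for (const Entry& e : data) {
        n += e.count;
        sum += e.count * e.grams;
    }
    return sum / n;
}

// Sample standard deviation (divides by n - 1), using the counts as frequencies.
double sampleStdDev(const std::vector<Entry>& data) {
    const double mean = sampleMean(data);
    double n = 0.0;
    double squares = 0.0;
    for (const Entry& e : data) {
        n += e.count;
        squares += e.count * (e.grams - mean) * (e.grams - mean);
    }
    return std::sqrt(squares / (n - 1.0));
}
```

Under these assumptions, sampleMean(appleEntries()) gives about 126.4 g, very close to the fitted Mean, while sampleStdDev(appleEntries()) gives about 9.3 g. That is smaller than the fitted Sigma of 14.0 g, but the difference is within the error of 6.6 g that the fit reports for Sigma, which is not surprising for a binned fit with so few entries.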
With these parameters, we can answer the question. Let’s say that it’s quite uncommon to find an apple that is more than 3 standard deviations above the average, which for a Gaussian distribution corresponds to roughly 1 case in 740. How much does such an apple weigh? We can use the z-score to calculate this value:
z = (x - avg) / stdev = (x-126.3)/14.0 > 3
x > 3 * 14.0 + 126.3 = 168.3 g
Finding such an apple in this kind of bag seems quite unlikely, but how about the most utterly uncommon one that we may expect to find from this producer? Usually, the threshold is set to 5 standard deviations:
x = 5.0 * 14.0 + 126.3 = 196.3 g
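The two thresholds above, and how rare they are, can be reproduced with a few lines of standard C++. This is only a sketch: the function names thresholdWeight and upperTailProbability are invented for this example, and the tail probability assumes that the weights really follow a Gaussian distribution, which 8 apples can hardly prove.

```cpp
#include <cmath>

// Weight that lies z standard deviations above the mean: x = z * sigma + mean.
double thresholdWeight(double mean, double sigma, double z) {
    return z * sigma + mean;
}

// One-sided probability P(Z > z) for a standard normal variable,
// written in terms of the complementary error function.
double upperTailProbability(double z) {
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}
```

With the fitted values, thresholdWeight(126.3, 14.0, 3.0) and thresholdWeight(126.3, 14.0, 5.0) give the 168.3 g and 196.3 g quoted above. upperTailProbability(3.0) is about 0.00135, or 1 apple in roughly 740, while upperTailProbability(5.0) is about 2.9e-7, or 1 apple in roughly 3.5 million.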
Now, think about it. Roughly two thirds of the apples you can buy from this supermarket have a weight within just 14 grams of the average. Do they have trees that make apples so precisely, or is there a selection bias? The most unlikely apple you can find is only about 1.5 times as heavy as the common ones (196.3 g versus 126.3 g), not 10 times or more!