P(cancer) = 0.2% (20 out of 10,000 have cancer)    (1)
P(~cancer) = 99.8% (9,980 out of 10,000 have no cancer)    (2)
P(positive x-ray | cancer) = 85% (85% of people with lung cancer have a positive x-ray)    (3)
P(positive x-ray | ~cancer) = 6% (6% of people without lung cancer have a positive x-ray)    (4)
Plugging the above data into the above expression, we get:
P(cancer | positive x-ray) = 85% * 0.2% / (85% * 0.2% + 6% * 99.8%)
≈ 0.0017 / (0.0017 + 0.06)
= 0.0017 / 0.0617
≈ 0.028
This is exactly the same answer we got in the previous section.
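The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are illustrative and do not come from the text.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(cancer | positive x-ray) using Bayes' theorem.

    prior               -- P(cancer), the base rate in the group
    true_positive_rate  -- P(positive x-ray | cancer)
    false_positive_rate -- P(positive x-ray | ~cancer)
    """
    hit = true_positive_rate * prior                   # positive x-ray and cancer
    false_alarm = false_positive_rate * (1.0 - prior)  # positive x-ray, no cancer
    return hit / (hit + false_alarm)


# Values (1)-(4) from the text: 0.2% base rate, 85% hits, 6% false alarms.
p = posterior(prior=0.002, true_positive_rate=0.85, false_positive_rate=0.06)
print(f"P(cancer | positive x-ray) = {p:.3f}")
```

Without the intermediate rounding of 6% * 99.8% to 0.06, the result still rounds to 0.028.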
Bayes' reasoning needs three pieces of information (all appear on the right side of the equation
at the beginning of step 5): the percentage of people with lung cancer, the percentage of people
without lung cancer who have false alarms, and the percentage of people with lung cancer
who test positive. The first piece of information, which is part of the priors, is the
baseline knowledge. The second and third pieces, which also belong to the priors, measure
the quality of the evidence. Bayes' reasoning uses the evidence to change the
belief/knowledge, shifting the baseline upwards with positive evidence or downwards with
negative evidence. We will use more examples to show how this change of belief (the machine
reasoning) happens. The left-side probability is the posterior probability. It is the revised
view of the world in the light of the evidence, which is on the right side of the equation.
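The shift of the baseline in both directions can be illustrated with a short Python sketch. The negative-test likelihoods are not given in the text; here they are assumed to be the complements of the positive-test figures (15% and 94%), which holds only if every x-ray is read as either positive or negative. The function and variable names are illustrative.

```python
def update(prior, p_evidence_given_cancer, p_evidence_given_no_cancer):
    """Return P(cancer | evidence) from the baseline and the two likelihoods."""
    numerator = p_evidence_given_cancer * prior
    return numerator / (numerator + p_evidence_given_no_cancer * (1.0 - prior))


prior = 0.002            # baseline: P(cancer)
sensitivity = 0.85       # P(positive x-ray | cancer)
false_alarm_rate = 0.06  # P(positive x-ray | ~cancer)

# Positive evidence moves the belief above the baseline.
after_positive = update(prior, sensitivity, false_alarm_rate)

# Negative evidence (assumed complements: 15% and 94%) moves it below.
after_negative = update(prior, 1.0 - sensitivity, 1.0 - false_alarm_rate)

print(f"baseline:             {prior:.5f}")
print(f"after positive x-ray: {after_positive:.5f}")
print(f"after negative x-ray: {after_negative:.5f}")
```

Under these assumptions the posterior after a positive x-ray lies above the 0.2% baseline, and the posterior after a negative x-ray lies below it.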
To see how the first piece of information affects the Bayes result, let's assume that the batch
of people doing the annual check are high-risk smokers. According to Williams (Williams,
2003, p. 464), a smoker's chance of getting lung cancer is 13 times higher than a non-smoker's.
Now, let's ask the same question: what is the probability that a person has lung cancer
given a positive x-ray test, when the cancer rate in this group is 2.6% (2.6% is
obtained from 0.2% * 13)? Sure enough, the final answer is different: the new
answer is 27.4%. The following analysis and steps show how we get this
answer:
1. We use Bayes' theorem:
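P(cancer | positive x-ray) = P(positive x-ray | cancer) * P(cancer) / (P(positive x-ray | cancer) * P(cancer) + P(positive x-ray | ~cancer) * P(~cancer))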
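As a quick numerical check of the 27.4% figure quoted for the smoker group, the formula in step 1 can be evaluated with both base rates. This is a sketch only: the function name `p_cancer_given_positive` is illustrative, and it assumes the x-ray's 85% and 6% rates are the same for smokers as for the general group.

```python
def p_cancer_given_positive(p_cancer,
                            p_pos_given_cancer=0.85,
                            p_pos_given_no_cancer=0.06):
    """Bayes' theorem for P(cancer | positive x-ray)."""
    numerator = p_pos_given_cancer * p_cancer
    denominator = numerator + p_pos_given_no_cancer * (1.0 - p_cancer)
    return numerator / denominator


general = p_cancer_given_positive(0.002)       # 0.2% base rate
smokers = p_cancer_given_positive(0.002 * 13)  # 2.6% base rate (13 times 0.2%)

print(f"general group: {general:.1%}")
print(f"smoker group:  {smokers:.1%}")
```

Only the first piece of information (the base rate) differs between the two calls; the quality of the evidence is held fixed.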