2. Plug in the following data:

P(cancer) = 2.6% (260 out of 10,000 have cancer)    (5)

P(~cancer) = 97.4% (9,740 out of 10,000 have no cancer)    (6)

P(positive x-ray | cancer) = 85% (85% of people with lung cancer have a positive x-ray)    (7)

P(positive x-ray | ~cancer) = 6% (6% of people without lung cancer have a positive x-ray)    (8)

3. Plugging the above data into Bayes' theorem, we get:

P(cancer | positive x-ray) = (85% × 2.6%) / (85% × 2.6% + 6% × 97.4%)

= 0.0221 / (0.0221 + 0.0584)

= 0.0221 / 0.0805

= 0.274
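The same result can be cross-checked by counting people instead of multiplying percentages. The short Python sketch below assumes the cohort of 10,000 people quoted in the data above; the variable names are illustrative and are not part of the original text.

```python
# Natural-frequency check of the high-risk calculation.
# Assumption: a cohort of 10,000 people, matching the counts quoted in the text.
cohort = 10_000
with_cancer = 260                       # 2.6% of the cohort
without_cancer = cohort - with_cancer   # 9,740 people

# Expected number of positive x-rays in each group.
true_positives = 0.85 * with_cancer       # about 221 people
false_positives = 0.06 * without_cancer   # about 584 people

# Among everyone with a positive x-ray, the fraction who actually have cancer.
posterior = true_positives / (true_positives + false_positives)
print(f"P(cancer | positive x-ray) = {posterior:.3f}")  # prints 0.274
```

Roughly 221 of about 805 positive x-rays belong to people with cancer, which is the same 0.274 obtained from the formula.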

As you can see, compared with the general population (where the probability of having cancer given a positive x-ray is 0.028), the probability of 0.274 for a person in the high-risk group is much higher. This makes sense, since the prior probability of having lung cancer is higher in the high-risk group. In this new example, the quality of the x-ray equipment does not change. The only thing that changes is the prior cancer rate, from 0.2% to 2.6%. At first glance, most people will give the same wrong answer of 85% for the new problem. Bayes' reasoning gives us a more objective and correct answer. Here is an example where computer reasoning can be better than a human's!
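To make the effect of the prior concrete, the following Python sketch applies Bayes' theorem with the same x-ray characteristics (85% detection rate, 6% false-positive rate) to both prior rates. The function name `posterior_given_positive` and its parameter names are hypothetical, chosen only for this illustration.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(cancer | positive test) using Bayes' theorem.

    prior               -- P(cancer)
    sensitivity         -- P(positive | cancer)
    false_positive_rate -- P(positive | ~cancer)
    """
    numerator = sensitivity * prior
    evidence = numerator + false_positive_rate * (1.0 - prior)
    return numerator / evidence


# Same x-ray equipment in both cases; only the prior cancer rate differs.
general = posterior_given_positive(0.002, 0.85, 0.06)    # prior of 0.2%
high_risk = posterior_given_positive(0.026, 0.85, 0.06)  # prior of 2.6%

print(f"general population: {general:.3f}")    # prints 0.028
print(f"high-risk group:    {high_risk:.3f}")  # prints 0.274
```

Neither result is close to the intuitive answer of 85%, because that figure is P(positive x-ray | cancer), not P(cancer | positive x-ray).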

Bayes' reasoning can also be used in situations that involve multiple pieces of evidence. Let's use Example 2, which is an extension of Example 1, to illustrate how this is done.

Example 2:
“Lung cancer is the leading cause of cancer death in the United States.” (Williams, 2003, p. 463) Suppose that about 0.2% of the population living in the US above the age of 20 has lung cancer. At an annual check-up, assume that 85% of the people with lung cancer will test positive on the chest x-ray. On the other hand, the chest x-ray produces false alarms: 6% of the people without lung cancer will also test positive on the chest x-ray. Suppose that a hospital performs two lung cancer screening tests on each annual check-up patient (assume the two tests are independent). The second test, a CT scan, is done to improve the accuracy of the diagnosis. Suppose that the CT scan has the following characteristics: it returns positive for 85% of the people with lung cancer; it has a lower false-positive rate than the x-ray test and returns a false positive for one out of one thousand people without lung cancer. If a person went through the annual check-up and tested positive on both the chest x-ray and the CT scan, what is the probability that he/she has lung cancer?

Answer:
We can solve this problem by applying Bayes' theorem twice. We already know that the probability that a person has cancer given a positive x-ray is 2.8%, and the probability that a person has no cancer given a positive x-ray is 97.2%. We can use this result and continue to solve the problem as follows: