(see sidebar), when you examine the last step of the stepwise regression process,
you are guaranteed that all variables in the equation are variables that you want in
the equation (i.e., are significant), and that there are no other variables out there that you
are missing out on, that is, no other variables that you would want in the equation
(that would be significant if they entered). You can't do better than this!!!
SIDEBAR: THE BEAUTY (AND CONTROVERSY) OF STEPWISE
The Beauty
There are a few added features that are critical to the stepwise regression technique being the great
technique we believe it is. They are built into the process. We list them below:
1. At each step where the stepwise process is deciding which variable is the best one to add into
the equation, a t-test is used to check whether that variable would be significant if
it entered the equation. If the best variable is not significant (via the t-test), the variable
does not enter the equation and the stepwise process ends. Thus, only significant variables are
allowed to enter the equation. (Of course, if the best variable to enter the equation is not significant,
then none of the remaining variables would be significant if they entered the equation either.)
2. As we just discussed, only significant variables are allowed to enter the equation. However, a
variable (say X2) can be significant as it enters the equation, and later, as other variables enter
the equation, X2 can lose its significance. This can happen because each new
variable that enters the equation adds unique information (or it would not be allowed to enter
the equation), but can also, at the same time, duplicate information provided by X2, thus taking
away some of X2's uniqueness and significance. If a variable loses too much of its uniqueness,
it may no longer be significant, and the stepwise process boots out the variable! Thus,
the stepwise process has a built-in mechanism that deletes any variable that does not retain its
significance.
So, the bottom line beauty of the stepwise regression process is that when you examine the
outcome of the stepwise regression process (i.e., the results at the last step):
a. All variables in the equation will be significant.
b. It is guaranteed that there are no other variables that would be significant if they entered the
equation.
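The enter-then-recheck loop described in this sidebar can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the procedure any particular statistics package uses: the function names (invert, t_stats, stepwise), the cutoffs t_enter and t_remove (an absolute t value of about 2 standing in for a formal t-test at roughly the 5% level), and the made-up data are all hypothetical.

```python
import math
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def t_stats(columns, y):
    """Regress y on an intercept plus the given columns; return each slope's t statistic."""
    n, k = len(y), len(columns) + 1
    X = [[1.0] + [c[i] for c in columns] for i in range(n)]
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    return [beta[a] / math.sqrt(s2 * inv[a][a]) for a in range(1, k)]

def stepwise(xs, y, t_enter=2.0, t_remove=1.8):
    """xs maps variable name -> list of values. Cutoffs are illustrative assumptions."""
    chosen = []
    for _ in range(4 * len(xs)):  # safety cap on the number of steps
        # Entry step: find the not-yet-chosen variable with the largest |t| if added.
        best_name, best_t = None, 0.0
        for name in xs:
            if name not in chosen:
                t = abs(t_stats([xs[c] for c in chosen] + [xs[name]], y)[-1])
                if t > best_t:
                    best_name, best_t = name, t
        if best_name is None or best_t < t_enter:
            break  # the best remaining variable is not significant, so stop
        chosen.append(best_name)
        # Removal step: boot out any variable that has lost its significance.
        while len(chosen) > 1:
            ts = t_stats([xs[c] for c in chosen], y)
            weakest = min(range(len(chosen)), key=lambda j: abs(ts[j]))
            if abs(ts[weakest]) >= t_remove:
                break
            chosen.pop(weakest)
    return chosen

# Made-up data: Y depends on x1 and x2; x3 is unrelated noise.
random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [3 * a + 2 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]
selected = stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print("Variables in the final equation:", selected)
```

The removal cutoff is set slightly looser than the entry cutoff so that a variable cannot bounce in and out of the equation indefinitely; that choice, like the cutoffs themselves, is an assumption of this sketch.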
The Controversy
There are folks who are not big fans of stepwise regression. One key reason is that the r² value may
be inflated due to chance. For example, if you have 10 X's, and none of them correlates at all with the
Y, then theoretically, the overall r² should be zero. However, there are 10 “opportunities” for “false
r²” to enter the analysis. When the stepwise algorithm picks the best variable first (the one
with the highest r²), the value of that r² will be the highest of 10 different “false r² values,” and thus,
even though that first equation has only one X variable, the r² will surely be inflated beyond what one
would expect if there were only one X variable to begin with (i.e., a simple regression).
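A small simulation can make this inflation visible. The sketch below is illustrative only: the sample size of 30, the 2,000 repetitions, and the use of normally distributed random numbers are all assumptions chosen for the example. It generates a Y and 10 X's that are unrelated by construction, then compares the average r² of a single X with the average r² of the best of the 10.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

random.seed(0)
n_obs, n_vars, n_trials = 30, 10, 2000  # assumed sizes for illustration
single_total = 0.0
best_total = 0.0
for _ in range(n_trials):
    y = [random.gauss(0, 1) for _ in range(n_obs)]
    # Ten X's with no true relationship to Y.
    r2s = [r_squared([random.gauss(0, 1) for _ in range(n_obs)], y)
           for _ in range(n_vars)]
    single_total += r2s[0]   # what one X alone gives (a simple regression)
    best_total += max(r2s)   # what "pick the best of 10" gives

mean_single = single_total / n_trials
mean_best = best_total / n_trials
print(f"average r-squared, one X alone:     {mean_single:.3f}")
print(f"average r-squared, best of 10 X's:  {mean_best:.3f}")
```

Both averages are "false r²" in the sense used above, since the true value is zero; the point of the comparison is only that the best-of-10 figure runs noticeably higher than the single-X figure.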
An analogy may help explain this phenomenon. If you flipped one coin 10 times, you expect on
average to get 5 heads, but could easily get 6 or 7. The chance of getting more than 7 (i.e., 8 or 9 or 10)
heads out of 10 is only about 5.5%. However, suppose that we flipped a dozen coins 10 times
each, and picked, as our result, the number of heads that was maximum (i.e., the highest of the 12
results). Given that there are now 12 “opportunities” to get 8 or 9 or 10 heads, the chance is about
50% (actually, 49.1%) that at least one of the coins would give a result of 8 or 9 or 10 heads. So, for
one coin, the chance of getting 8 or more heads is 5.5%, but for 12 coins, the chance of such a result
occurring at least once is 49.1%, a big difference. The 49.1% value can be said to be inflated, since
it does not represent the chance of 8 or 9 or 10 heads out of 10 flips of one coin, even though you
can pick up that one coin and say, truthfully, that you flipped that particular coin 10 times and got a
result of at least 8 heads.
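The two percentages in the coin analogy can be checked directly. The short sketch below assumes fair coins and independent flips; the variable names are chosen only for this example. It computes the chance of 8 or more heads in 10 flips of one coin, and then the chance that at least one of 12 such coins reaches 8 or more.

```python
from math import comb

flips, coins = 10, 12

# One fair coin: P(8, 9, or 10 heads in 10 flips) = (45 + 10 + 1) / 1024.
p_single = sum(comb(flips, k) for k in range(8, flips + 1)) / 2 ** flips

# Twelve independent coins: P(at least one coin shows 8 or more heads)
# is one minus the chance that all twelve fall short.
p_any = 1 - (1 - p_single) ** coins

print(f"one coin:  {p_single:.1%}")
print(f"12 coins:  {p_any:.1%}")
```

These work out to about 5.5% and 49.1%, matching the figures quoted above.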
 