Confounding and Lurking Variables

Here are two explanations of lurking and confounding variables by David Bock:

An extraneous variable is one that is not one of the explanatory variables in the study, but is thought to affect the response variable. (For some folks extraneous variables are those that aren't explanatory or response variables -- in that case, some extraneous variables are thought to affect the response variable.)

A confounding variable is one whose effects on the response variable cannot be distinguished from those of one or more of the explanatory variables in the study.

A _lurking_ variable is one whose effects on the response variable cannot be distinguished from those of one or more of the explanatory variables in the study, AND IS NOT INCORPORATED INTO THE DESIGN OF THE STUDY.

The difference between lurking and confounding variables lies in their inclusion in the study. If a variable was measured and included, its associations with the explanatory and response variables can be determined and (if random assignment was performed) neutralized with methods beyond the AP Syllabus. It is then a confounding variable (or not). The associations between an unmeasured variable and the explanatory and response variables cannot be determined -- whatever they are, they remain a mystery, and the variable "lurks" beyond the purview of the investigator.

********************************************************************** 

Confounding refers to a problem that can arise in an experiment when there is another variable that may affect the response and is in some way tied together with the factor under investigation, leaving us unable to tell which of the two variables (or perhaps some interaction) caused the observed response. For example, we plant tomatoes in a garden that's half-shaded. We test a fertilizer by putting it on the plants in the sun and applying none to the shaded plants. Months later the fertilized plants bear more and better tomatoes. Why? Well, maybe it's the fertilizer, maybe it's the sun, maybe we need both. We're unable to conclude that the fertilizer works because any effect of fertilizer is confounded with any effect of the extra sunshine.
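The tomato example can be sketched as a small simulation. Everything in it is an assumption made for illustration: the baseline yield of 10, the effect sizes, the noise level, and the function name observed_difference are all invented, not taken from any real garden. The point is only that two very different "worlds" (fertilizer works and sun does nothing, or the reverse) give the same observed difference when fertilizer and sun always go together.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def observed_difference(sun_effect, fert_effect, n_per_group=200, seed=0):
    """Simulate the confounded garden (hypothetical numbers) and return
    mean yield of fertilized plants minus mean yield of unfertilized plants."""
    rng = random.Random(seed)
    fertilized, unfertilized = [], []
    for _ in range(n_per_group):
        # Sunny half: every plant is fertilized, so both effects are present.
        fertilized.append(10 + sun_effect + fert_effect + rng.gauss(0, 1))
        # Shaded half: no plant is fertilized, so neither effect is present.
        unfertilized.append(10 + rng.gauss(0, 1))
    return mean(fertilized) - mean(unfertilized)

# World A: only the fertilizer matters. World B: only the sun matters.
# The same seed gives both worlds the same random noise.
diff_fertilizer_only = observed_difference(sun_effect=0.0, fert_effect=3.0)
diff_sun_only = observed_difference(sun_effect=3.0, fert_effect=0.0)

print("fertilizer-only world:", round(diff_fertilizer_only, 2))
print("sun-only world:       ", round(diff_sun_only, 2))
```

Because the design only ever lets us see the sum of the two effects, the data cannot tell these worlds apart.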

Note that this is why we need to blind some experiments: we don't want knowledge of the treatments, on the part of those taking part in the experiment or those evaluating the response, to be confounded with any actual effect of the treatments under investigation.

Lurking variables are sometimes referred to as "common response". That's where some other variable drives each of the two variables under investigation, making it appear that there's some association between those two variables. A common example is the strong association between the number of firefighters who respond to a fire and the amount of damage done. One shouldn't conclude that the firefighters are responsible for the damage; the lurking variable is the size of the fire.
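A rough simulation of the firefighter example shows how common response can manufacture an association. The numbers are invented for illustration (fire sizes from 1 to 10, about 2 firefighters and 5 units of damage per unit of fire size), and in this made-up model firefighters have no effect on damage at all. The pearson helper is a hand-written correlation coefficient, included so the sketch needs only the standard library.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
fires = []
for _ in range(2000):
    size = rng.uniform(1, 10)                # the lurking variable
    firefighters = 2 * size + rng.gauss(0, 1)  # driven by size only
    damage = 5 * size + rng.gauss(0, 2)        # driven by size only
    fires.append((size, firefighters, damage))

# Across all fires, firefighters and damage look strongly associated.
r_all = pearson([f[1] for f in fires], [f[2] for f in fires])

# Among fires of nearly the same size, the association mostly disappears.
similar = [f for f in fires if 5.0 <= f[0] <= 5.5]
r_similar = pearson([f[1] for f in similar], [f[2] for f in similar])

print("all fires:         r =", round(r_all, 2))
print("similar-size fires: r =", round(r_similar, 2))
```

Comparing fires of similar size is only possible here because the simulation records fire size; a truly lurking variable is one the investigator never measured.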

Lurking variables are the risk we face in sampling and observational studies. In an experiment, though, the factor under consideration isn't being driven by some lurking variable, because we are the ones in control there.
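As a sketch of what being "in control" buys us, the tomato garden can be simulated again with fertilizer assigned by a coin flip instead of by location. The effect sizes (3 for sun, 2 for fertilizer), the baseline of 10, and the noise level are assumptions made up for this illustration. Because the coin flip ignores sun and shade, both groups get roughly the same mix of sunny and shaded plants, and the difference in mean yield lands near the fertilizer effect alone.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

SUN_EFFECT = 3.0   # hypothetical boost from full sun
FERT_EFFECT = 2.0  # hypothetical boost from fertilizer

rng = random.Random(2)
fertilized, unfertilized = [], []
for plant in range(2000):
    sun = plant % 2               # half the garden is sunny, half shaded
    fert = rng.random() < 0.5     # random assignment: ignores sun and shade
    yield_ = 10 + SUN_EFFECT * sun + FERT_EFFECT * fert + rng.gauss(0, 1)
    (fertilized if fert else unfertilized).append(yield_)

diff_randomized = mean(fertilized) - mean(unfertilized)
print("difference under random assignment:", round(diff_randomized, 2))
```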