Confounding Variables

Confounding variables are hidden EVERYWHERE. It’s likely that you’ve come across them in the wild, possibly every day, and perhaps without even realizing it. I’ll explain why confounding variables are problematic and how to keep an eye out for them, but before we go any further, let’s review the definition of confounding variables.

What are confounding variables?


Confounding variables, or confounders, are factors that affect both the thing we think is the cause and the outcome we’re interested in, but that aren’t actually part of the study or even being actively observed. In fact, they might even be things we’re totally unaware of. Confounding variables can make it look like one thing causes another when it doesn’t. In other words: though it may appear that two factors are correlated (e.g. when one value increases, the other consistently rises or falls), there could actually be other factors, or confounders, responsible for the changes in both variables individually.

Let’s take a closer look at an example.

Confounding variables example

What if I told you that increased ice cream sales lead to more shark attacks? No, really!

When more ice cream is sold, more shark attacks happen. Look at the graph. The effect is right there in front of you. You can see it clearly with your own eyes.

So does this mean that we must urgently outlaw the sale of ice cream to save lives? Not so fast.

First, let’s think of what other factors might be at play. This is a technique we’ll explore further below, but for now start by asking yourself: “what else could be going on here?”

For starters, people tend to eat more ice cream during warm, summer months. In the warm, summer months, there are also more people at the beach and going into the ocean. The ocean also happens to be the home of the shark.

So, what do these things have in common? They both happen more during the warm, summer months. Though perhaps it could be the case that sharks just *love* ice cream, it’s more likely that it is not ice cream sales that lead to shark attacks (nor shark attacks that lead to ice cream sales for that matter). Instead, hot temperatures cause both more demand for ice cream and more demand for beach time and ocean swimming.

Hot temperatures in summer months are thus said to be a confounding factor for both ice cream sales and shark attacks. In other words, hot temperatures increase both ice cream sales and shark attacks. If we do not properly consider hot temperatures as a factor in our analysis, we might think we find a causal relationship between ice cream sales and shark attacks when there in fact isn’t one.
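
To see how a confounder can manufacture a correlation, here is a minimal simulation sketch. All the numbers in it (the temperature range, the slopes, the noise levels) are made-up assumptions for illustration, not real sales or shark data. Ice cream sales and shark attacks each depend only on temperature, never on each other, yet they still come out strongly correlated until we remove the straight-line effect of temperature from both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing its straight-line dependence on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical daily data: temperature drives BOTH outcomes.
temperature = [rng.uniform(10, 35) for _ in range(2000)]
ice_cream_sales = [20 * t + rng.gauss(0, 50) for t in temperature]
shark_attacks = [0.3 * t + rng.gauss(0, 2) for t in temperature]

# Ignoring temperature: the two outcomes look strongly related.
raw_corr = pearson(ice_cream_sales, shark_attacks)

# Accounting for temperature: the relationship all but disappears.
adjusted_corr = pearson(residuals(ice_cream_sales, temperature),
                        residuals(shark_attacks, temperature))

print(f"correlation ignoring temperature:   {raw_corr:.2f}")
print(f"correlation adjusting for temperature: {adjusted_corr:.2f}")
```

The adjustment here is a simple linear one, which works because the simulated relationships are linear by construction. Real data is rarely that tidy.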

Why do confounding variables matter?

Confounding variables can have a major impact on your understanding of what’s really going on. They can hide the true link between two variables. Alternatively, they can create a link where none exists.

You may have heard the saying “correlation does not equal causation”. That is very much true, which is why researchers go to such great lengths to control for confounding variables. They can impact study results and analysis, so it’s important to know what they are and how to spot them so you can watch out for them. Studies that control for confounding variables are able to help us better establish and understand the underlying cause behind the subject in question.

Confounding variables can lead us to:

Overestimate the size of an effect.
Underestimate the size of an effect.
Hide an effect that exists.
Create an effect that doesn’t exist.
Change or reverse the true direction of an effect (from positive to negative, or vice versa).

With confounding variables, we need to stay on our toes to make sure we’re not falsely attributing an effect to a certain variable or process where none exists. This can be frustrating, as I show below.

I know it can be hard to let go of what you think is a real and strong effect, whether in your own research or in a news headline you’ve stumbled upon. It can be hard to come across strong causality or even correlation in the wild, so we want it to be true. This love story may feel like it ends in heartbreak, but it doesn’t. Once we accept the existence of confounding factors, we can move past them and get to the bottom of what’s really going on.

Confounding variables can invalidate conclusions, both the ones you come across and the ones you draw yourself. Instead of reflecting reality, the true relationships will be distorted or masked.

How to identify confounding variables

We can identify potential confounding variables in a few ways. For starters, just as we did in the ice cream sales vs shark attacks example above, you can begin with a thought experiment. Simply ask yourself: “what else could be going on here?” From there, you can create a list of other factors that could be behind the effect you’re seeing and in turn test them out.

Want to sharpen your skills at identifying potential confounders? Here’s one of my favorite mental exercises.

Open up the science, health, or nutrition section of any news outlet. Pick an article with a headline that sticks out at you. When you do that, try out the following thought experiment. Ask yourself: “what other factors could be responsible for what’s going on here?” Try to make a list of the potential confounding variables that could really be at play.

For example, here’s one I came across in writing this post:

What else could be going on here? Perhaps it is truly the case that someone’s bedtime is the sole determinant of whether they underperform at work. Or perhaps people who are early risers have other habits aside from just the time at which they wake up that are conducive to stronger workplace performance. Additionally, note that they haven’t defined what they mean by “underperform”. Underperforming by 0.1% is still technically speaking underperforming in a relative sense, but is it significant and meaningful? This is yet another subtle distinction we could miss on a quick first glance but that impacts the meaning entirely.

As shown above, ask yourself questions such as:

“What assumptions am I making here?”
“How is this insight being expressed?”
“Are there any words that are ambiguous and could be influencing my interpretation?”

What’s actually the truth here? Well, I don’t know. We’d have to read the study methodology for starters. But even without reading the study itself or truly getting to the bottom of it, by actively trying to brainstorm potential confounding variables you can help stop yourself from being swept up in something that may not actually be supported.

In addition to thought experiments, you can look at past research and other studies on the topic of interest to see whether they list potentially confounding factors you may not have considered. As you review them, note the possible confounding variables that other subject matter experts have laid out.

Oftentimes you can’t fully get rid of confounding variables. But you can still do a few things to minimize their effects. Let’s discuss some of the ways you can put this to use.

How to control confounding variables

Now that you’ve learned some ways to identify potential confounding variables, let’s look at how to address confounding variables in the data collection and analysis process. Our goal in this is to establish a control group, or a baseline to which we can make comparisons. We do that in primarily two ways: in our experiment design and in our analysis of results. But before we get to that, why is having a control group so important in the first place?

We use these methods to decrease the effects of confounding variables so we can understand what’s really going on. With a control, we’re able to establish a baseline, and with a baseline, we’re able to see how the outcome we’re curious about changes when the factor we’re focusing on changes.

Even if that baseline isn’t an absolute baseline, i.e. there are some factors we’re unable to control for that affect the base value we see, it can still be a baseline in a relative sense. We can still gain insights based on the relative differences between treatment and no treatment, and that’s good enough for us.

Let’s explore them in more detail.

1. Experiment design

Ideally we’d like to take confounding variables into account as early as possible, preferably in the design stage of research, so they don’t make their way into the data in the first place. How do we do that?

The main ways of controlling for confounders in your study design are randomization, restriction, and matching. These methods are all attempts at establishing a control in one way or another. In other words, we use these methods to identify and limit the effects of potential confounders such as individual differences like age, nationality, cultural expectations, socioeconomic background, seasonality etc. Let’s discuss how each of these can help below.

Randomization

What is randomization?

Randomization is the process of assigning your study subjects to treatment groups by random assortment with equally likely odds. This in turn gives a very high chance that the study groups will be similar across factors such as age, gender, cultural background, etc. In theory, this should give any subject an equal chance of being in any given treatment group and in turn make each group more or less equivalent.

How does randomization reduce confounding?

The randomization process gives us groups that are comparable across a range of factors. So even though a confounding variable might still be present in the study, the effect it has on all groups should be similar. This means that although the absolute values in each group might be affected by confounding factors, the relative differences between the two groups and their treatment effects would be valid and valuable.

Randomization is especially useful in that it accounts for both identified and unidentified possible confounding factors. In other words, if the participant groups are roughly similar in their mix of, say, age, geographic location, etc, then it’s likely that any external factors we haven’t thought of are spread similarly across the groups too.

Randomization example

Let’s say you’re running a study to figure out which Meme Lord is the one Meme Lord to rule them all, i.e. which one has the funniest memes.


You manage to enroll 3000 study participants. Some of these participants are Gen Z, some are Millennials, and some, well, some are Boomers (but that’s OK).

Meme humor preference may vary significantly across generations. If one study group has a lot more Boomers than another for example, we might be led to believe that certain memes are funny to everyone when really they are just funny to Boomers. And that’s not OK.

So, what to do? With randomization, we ensure that each participant has equally likely odds of ending up in each group. In turn, we can expect that the mix of Gen Z, Millennials, and Boomers will be roughly the same across the groups. With generation balanced out, any difference in laughs between the groups is more likely down to the memes themselves, and we can solve this debate once and for all: who is the one Meme Lord to rule them all!?
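
Here is a minimal sketch of random assignment for the Meme Lord study. The 3000 participants come from the example above, but the even 1000/1000/1000 split across generations is an assumption made purely for illustration. The sketch shuffles everyone, cuts the shuffled list in half, and then checks how each generation’s share compares across the two groups.

```python
import random
from collections import Counter

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 3000 participants, assumed to be split evenly by generation.
generations = ["Gen Z"] * 1000 + ["Millennial"] * 1000 + ["Boomer"] * 1000
participants = [{"id": i, "generation": g} for i, g in enumerate(generations)]

# Random assignment: shuffle a copy, then cut it in half.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
group_a, group_b = shuffled[:half], shuffled[half:]


def generation_share(group):
    """Fraction of the group that belongs to each generation."""
    counts = Counter(person["generation"] for person in group)
    return {generation: count / len(group) for generation, count in counts.items()}


share_a = generation_share(group_a)
share_b = generation_share(group_b)

for generation in ("Gen Z", "Millennial", "Boomer"):
    print(f"{generation}: Group A {share_a[generation]:.1%}, "
          f"Group B {share_b[generation]:.1%}")
```

Nothing in the assignment step looks at generation at all, which is the point: the same shuffle balances traits we never recorded just as well as the ones we did.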

Restriction

Next up we have restriction.

What is restriction?

Restriction is the process of limiting your study to subjects who share the same value of a potentially confounding variable, removing everyone else from the experiment. If you only include men in your study for example, then you remove sex as a confounder. If your study only includes people 65+, then you reduce age as a confounder, and so on.

You can start by identifying potential confounding variables beforehand, as mentioned above. Once you have a list of potential confounders in hand, you can then use them to inform the restriction phase of your study design.

How does restriction reduce confounding?

Restriction limits what kind of characteristics are chosen for, or excluded from, a study to begin with. By removing variation, you can remove the confounding potential of that variation.

Restriction is common in health trials, where many studies have been run on just men, excluding women.

Note that because your study was not run on the whole population, your sample will not be representative. Results from a non-representative sample can’t be generalized to the entire population, which is a limiting factor here.

Restriction example

What would restriction look like in our Meme Lord showdown example from above? Well, we already know that what people find funny varies across generations. So a Gen Z, a Millennial, and a Boomer will on average laugh a different amount, or maybe not at all, at a given topic.

Instead of taking an average score of what’s funny across all generations and their variations, we can instead decide to focus our study on just one generation, e.g. we can restrict it to just Boomers. In doing so, we remove “generation” as a point of variation within and across groups.

There is one catch though. Because our study was just done on Boomers, it’s not OK to generalize our findings to all generations. In other words, our study findings can tell us who Boomers consider their one Meme Lord to rule them all but not who the universal Meme Lord is because the other generations were not included in the study.
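
In code, restriction is little more than a filter applied before the study starts. The sketch below uses a tiny made-up roster, and the field names and the `restrict` helper are hypothetical, chosen just for this illustration.

```python
# Hypothetical roster; ids, generations, and groups are invented for illustration.
participants = [
    {"id": 1, "generation": "Gen Z", "group": "A"},
    {"id": 2, "generation": "Boomer", "group": "A"},
    {"id": 3, "generation": "Millennial", "group": "B"},
    {"id": 4, "generation": "Boomer", "group": "B"},
    {"id": 5, "generation": "Boomer", "group": "A"},
    {"id": 6, "generation": "Gen Z", "group": "B"},
]


def restrict(people, trait, allowed_values):
    """Keep only the people whose trait takes one of the allowed values."""
    return [person for person in people if person[trait] in allowed_values]


# Restrict the study to Boomers so that generation can no longer vary.
boomers_only = restrict(participants, "generation", {"Boomer"})
generations_left = {person["generation"] for person in boomers_only}

print(f"kept {len(boomers_only)} of {len(participants)} participants")
print(f"generations still in the study: {generations_left}")
```

The price shows up in the first printed line: everyone who was filtered out is also everyone the findings can no longer speak for.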

Matching

The next technique to reduce confounders in study design is matching.

What is matching?

Matching pairs up people with similar traits such as age, sex, socioeconomic factors, etc, and is another approach to reducing the effects of confounders. This is often referred to as a matched-pairs design (its close cousin, the matched case-control study, applies the same idea to observational data). You then give one member of each matched pair the treatment and the other nothing. If, for example, formal education level is a trait in your study, you’d pair two people with PhDs together and place Person 1 in the control group and Person 2 in the treatment group.

How does matching reduce confounding?

Matching reduces confounding by providing a basis for control on a more narrow range of characteristics. When confounders are removed, we can conduct a more apples-to-apples comparison.

Matching example

If we zoom in on each generation from our Meme Lord Showdown, we’ll uncover even more differences amongst them. Even within a specific generation, people live in several cities, have different careers, and have a range of household sizes too (some live alone, some live with families, some live with friends, etc). These differences are another source of variation across our study groups.

Let’s assume all groups have the same amount of Gen Z representation. Sounds like a great start for removing bias, no? But what if in one group 10% of them live at home with their parents and in another group 50% of them live at home with their parents? This could affect our results in a systematic way if living at home is a confounding variable for meme preference. The memes Gen Z find funny could depend on something that has to do with them living at home. Perhaps their sense of humor has degraded due to extra years of exposure to dad jokes at home. There are many factors that could be at play.

So, how can we further control for variation? Through matching. If we have two groups and two Gen Z participants who both live at home, live in the same city, and are both students, we consider them a matched pair and put one into Group A and one into Group B. Then we let the Meme Lord Showdown begin.
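
A minimal sketch of that pairing step is below. The roster, the city names, and the `matched_pairs` helper are all hypothetical. People with identical values on the chosen traits are paired off, one member of each pair goes to Group A and the other to Group B at random, and anyone without a partner is set aside.

```python
import random
from collections import Counter, defaultdict

TRAITS = ("generation", "lives_at_home", "city", "occupation")

# Hypothetical roster, invented for illustration.
participants = [
    {"id": 1, "generation": "Gen Z", "lives_at_home": True, "city": "Springfield", "occupation": "student"},
    {"id": 2, "generation": "Gen Z", "lives_at_home": True, "city": "Springfield", "occupation": "student"},
    {"id": 3, "generation": "Gen Z", "lives_at_home": True, "city": "Springfield", "occupation": "student"},
    {"id": 4, "generation": "Gen Z", "lives_at_home": True, "city": "Springfield", "occupation": "student"},
    {"id": 5, "generation": "Gen Z", "lives_at_home": False, "city": "Springfield", "occupation": "student"},
    {"id": 6, "generation": "Gen Z", "lives_at_home": False, "city": "Springfield", "occupation": "student"},
    {"id": 7, "generation": "Gen Z", "lives_at_home": True, "city": "Shelbyville", "occupation": "barista"},
]


def profile(person, traits=TRAITS):
    """The combination of trait values we match on."""
    return tuple(person[trait] for trait in traits)


def matched_pairs(people, rng):
    """Pair people with identical profiles and split each pair across two groups."""
    by_profile = defaultdict(list)
    for person in people:
        by_profile[profile(person)].append(person)

    group_a, group_b, unmatched = [], [], []
    for members in by_profile.values():
        rng.shuffle(members)  # randomize who in each pair lands in which group
        while len(members) >= 2:
            group_a.append(members.pop())
            group_b.append(members.pop())
        unmatched.extend(members)  # nobody left to pair with
    return group_a, group_b, unmatched


group_a, group_b, unmatched = matched_pairs(participants, random.Random(0))

print("Group A ids:", sorted(p["id"] for p in group_a))
print("Group B ids:", sorted(p["id"] for p in group_b))
print("Unmatched ids:", [p["id"] for p in unmatched])
```

This sketch only pairs exact matches. With more traits, or with continuous ones like age, exact partners get scarce quickly, so real studies often match within ranges instead.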

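Note: there are downsides to matching on too many traits. The more traits you match on, the harder it becomes to find a partner for each participant, and participants without a match have to be left out.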

These are things to consider during the experiment design phase. But what if you need to make sense of data from an experiment that has already been run?

2. Adjustment in analysis

Is there anything we can do to make sure we don’t fall victim to confounding variables even after a study has been conducted? Thankfully there is: in the analysis stage.

Except in specific cases, the majority of us aren’t or won’t be involved in study design. But we can still learn how to sift through the data to protect ourselves from being misled by confounders. Simply being aware of potential confounding factors, and being able to identify them, is a crucial first step.

When interpreting the results of a study that has already been carried out, or of an observational study, you can check for confounding in your analysis as follows:

Stratification

What is stratification?

Stratification is a way of sorting people or subjects into groups. These groups are created such that potential confounders do not vary within them. For example, you could create groups by household income, putting everyone in a certain household income range together in one group, or by age, grouping Gen Z study participants together with other Gen Z study participants, Boomers with other Boomers, and so on.

How does stratification reduce confounding?

If we create groups, or strata, such that a potential confounding variable takes the same value for everyone within a stratum, then that variable can’t explain differences between the control and the exposure groups inside that stratum. Although confounding variables may still affect these groups in an absolute sense, we can still compare them to each other in a relative sense and trust our conclusions more. Note that, unlike randomization, stratification only handles the confounders you’ve identified and recorded.

Stratification example

Our Meme Lord Showdown has now been completed, and we have the data. Unfortunately, randomization, restriction, and matching weren’t incorporated into the study design in the end. Are our hopes of figuring out who the one true, unbiased Meme Lord to rule them all ruined? Or can we still find our answer in the data?

Thanks to stratification, there is still hope that we can work through and around potential confounding variables. Let’s say we want to narrow down the data and figure out which Meme Lord is funniest to each generation rather than to the group as a whole. How do we do that?

Within each group, we have a mix of Gen Z, Millennials, and Boomers. Each participant has a Total LOL Score. Instead of evaluating each Meme Lord by group, combining all generations, we break each group into subgroups, or strata, by generation. For Group A, we now have three subgroups: Group A – Gen Z, Group A – Millennials, and Group A – Boomers. These are our strata. Now we can take the Total LOL Score by generation and come up with a combined LOL Score for each subgroup. We repeat this same process for Group B as well.

What do we find? It turns out that the Meme Lord from Group A won over the people as a whole with a higher overall LOL Score and was crowned Supreme Meme Lord Champion, but Boomers as a group actually found the other Meme Lord from Group B funnier. That is to say, Group B’s Boomers had a higher combined LOL Score than Group A’s Boomers.
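
Here is a minimal sketch of that stratified comparison. The scores are invented to reproduce the pattern described above and are not real study results. It also uses the mean score per stratum as the “combined” score, which is an assumption: a plain total would be unfair if the strata had different sizes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (group, generation, LOL score). All values are made up.
records = [
    ("A", "Gen Z", 9), ("A", "Gen Z", 8),
    ("A", "Millennials", 8), ("A", "Millennials", 7),
    ("A", "Boomers", 3), ("A", "Boomers", 4),
    ("B", "Gen Z", 5), ("B", "Gen Z", 6),
    ("B", "Millennials", 6), ("B", "Millennials", 5),
    ("B", "Boomers", 7), ("B", "Boomers", 6),
]


def mean_score_by(rows, key):
    """Average LOL score for each bucket produced by key(group, generation)."""
    buckets = defaultdict(list)
    for group, generation, score in rows:
        buckets[key(group, generation)].append(score)
    return {bucket: mean(scores) for bucket, scores in buckets.items()}


# Unstratified view: one number per group, all generations lumped together.
overall = mean_score_by(records, lambda group, generation: group)

# Stratified view: one number per (group, generation) stratum.
by_stratum = mean_score_by(records, lambda group, generation: (group, generation))

print("Overall:", {group: round(score, 2) for group, score in overall.items()})
for generation in ("Gen Z", "Millennials", "Boomers"):
    a = by_stratum[("A", generation)]
    b = by_stratum[("B", generation)]
    print(f"{generation}: Group A {a:.1f} vs Group B {b:.1f}")
```

Only the `key` function changes between the two views, which makes it easy to stratify on a different variable, or on several at once, as long as each stratum still has enough participants in it.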

What Next?

I hope that you now have a clearer understanding of what confounding variables are, why they matter, and how you can think about them in a more structured way. I also hope you have a few actionable techniques you can use to help you keep your eyes peeled for confounding variables.
