Sometimes the term “bad science” gets thrown around as soon as new results make the news and a group of people happens to disagree with the findings. Sometimes they are wholly justified in making this claim, but not always. As someone who aims to disseminate research, I think it’s important to make a distinction between what is “bad” research and what is something altogether different. I will be the first to admit that I have likely blurred this distinction myself, and I hope to be better at it going forward because I truly believe it matters.
What Makes Something “Bad Science”?
Before I delve into the explanation, let me say that I base my criteria on my work running quality analyses on studies, both as a graduate student and during my time at the Canadian Council on Learning, where my primary job (along with some statistical work) was to assess the quality of research. The criteria are fairly standard, so this isn’t anything new, though there are considerations that don’t always make it into reviews that I believe should, based on my training in psychometrics and statistics.
- Did the researchers assess what they wanted to assess or claimed to assess in terms of variables?
- Did the researchers use appropriate statistics to analyze the data?
- Did the researchers use the right sample for the question of interest?
If the researchers included appropriate variables in their analyses, used the right statistics, and used the right sample, the work should not be filed under “bad science”. If any of these three criteria is not met, we run into problems with how to interpret the data and thus the conclusions. The further the research falls short, the worse it becomes.
Taking recent research on breastfeeding as an example, many studies fail on criteria 1 and/or 3 – the variables and the sample. If researchers only have access to “breastfed” versus “not breastfed”, the only question they can answer is whether even a small amount of breast milk makes a difference in outcomes, yet this is often not what should be analyzed. If we want to know whether not breastfeeding results in changes to human development, we need to consider how breastfeeding is defined. The relevant comparison is biological breastfeeding: exclusive breastfeeding for a period, followed by complementary breastfeeding for two years or beyond. If we don’t have that group, or at least details on breastfeeding exclusivity and duration, then the answers we get aren’t really addressing the questions of interest.
Another example is sleep training. When a study hit the media claiming no long-term effects of extinction sleep training, those of us who understood statistics immediately saw huge problems with the research as it was conducted. In this case, the problems were with the first and third criteria. First, the outcome variables were far from ideal: they were both parent-report and not in areas where one might expect negative consequences (like sleep behaviours and stress responsivity). Furthermore, temperament, which is known to interact with parenting methods, was not assessed at all. Second, the analyses used an intent-to-treat model, which meant that the groups were ill-formed. Half of the intervention group (the supposed sleep-training group) refused to engage in controlled crying, while nothing was measured for the control group even though other research suggests approximately half of families attempt controlled crying on their own. The only reliable difference between groups, then, was the information given to the intervention group on “normal” sleep (which was also not quite accurate). These problems are what led many, including myself, to call this “bad research”.
Even more comprehensive research techniques, like meta-analyses, are not immune to this problem. When Dr. Carpenter and colleagues conducted a meta-analysis on the risks of bedsharing and concluded that it was risky even in the absence of smoking, many researchers in the field of infant sleep, and specifically bedsharing, were rightfully upset and confused. Why? Because of the decisions about which data sets to include in the meta-analysis. The ones chosen had some rather severe flaws, as they were collected years ago, before we understood more about the factors that create an unsafe sleeping environment. Some didn’t include alcohol consumption, and those that did had very poorly defined variables (e.g., “did you have anything to drink the day before?”); most didn’t include bedding type either, another factor that has been found to be as important as smoking to the risk of suffocation or SIDS. Even the analyses were flawed, as some of the methods used – like imputation to make up for missing data – were applied when their underlying assumptions weren’t met.
Now, it’s never quite so simple as to say research is all “good” or all “bad”; there are varying degrees of each, and most research falls somewhere in between. Take the current focus on the role of SES in breastfeeding research. Early research predated this knowledge and so can’t be faulted for not including SES as a confounding variable; it wouldn’t be considered “bad”, but its findings must be taken with a grain of salt. However, new research that fails to include SES ought to fall under “bad”, because we now know it is an important construct. Yet even research that includes it won’t be perfect, because statistical controls can only do so much (though they do more than many give them credit for). Is the research “bad” because it can’t create randomly assigned groups? No. Sometimes there are simply limitations in research, and we have to acknowledge them until we find better ways to run our studies.
Bad Outcomes versus Bad Conclusions
Another problem arises when the researchers themselves draw conclusions that are not warranted from otherwise good (or good enough) research. This often happens when researchers attempt to fit their data to some political end or hot topic in the media when their findings don’t actually support what they are saying. I saw this regularly in educational research, and sadly it also occurs in research on more prominent parenting topics.
Most recently, research finding certain chemicals in breast milk led one of the researchers to make outlandish claims about the dangers of breast milk and to suggest babies should be weaned after only 3–4 months. Nothing in the research suggested anything of the sort, as it never actually addressed long-term health outcomes at these particular levels. But was the research “bad”? For the question of interest in the paper itself, no, it was actually a decent study. Not perfect, as the sample had some biases (such as high consumption of whale meat and the timing of the data collection), but for the question of bioabsorption of certain chemicals as measured through breast milk, the research did a decent job of answering it (though questions and gaps remained).
Another example is when sleep training was promoted in the media based on research by Dr. Weinraub and colleagues. Here, again, certain researchers decided to speak about sleep training when the research didn’t look at sleep training at all. It looked at normal night-waking patterns in a group of children and simply provided normative data, from which we could see that night waking is a biologically normal act even in children as old as 3 years of age. The research as it stood – looking at normal sleep patterns – was very good, but the use of it to promote particular parenting techniques was not.
This type of research then becomes difficult to promote because people can see it cited for one reason, read it, and take away a totally different message. Unfortunately, the conclusion in the abstract is often one that isn’t supported by the actual research. I wish peer review would eliminate this, but one doesn’t always get a reviewer who is well aware of the nuances of data collection, and a reviewer who shares the authors’ bias may overlook the fact that the conclusions don’t match the data. Of course, part of the problem is that all researchers want their work noticed, and nuanced data doesn’t get headlines – only strong claims of a political nature do.
The Good, the Bad, and the Preliminary