Scientific Paper Format: Some Stuff to Know

One problem with science for many people is that it’s boring. I find it fascinating, but many people, if they don’t just snooze or walk off, think you’re being a pedant if you want to talk about how, exactly, the scientific sausage is made and what, exactly, can be reasonably concluded from it. Too bad – you want to talk science, you’ve got to talk details and methods, rules and honesty. You have to know how to throw to play baseball, how to knit to knit sweaters, how to speak Swahili to talk to somebody in Swahili. If you want to talk science, but can’t or won’t do the required work, tough. Have the integrity to bow out.

I bow out when the math gets too heavy or the methodology too technical for me to follow. That actually doesn’t happen too often in real life, as the issues are generally earlier in the process. It’s also true that, in popular culture, it is rarely the case that disputes about science are about the validity of advanced probabilistic analysis or the inner workings of a linear accelerator. For example, I haven’t looked at the stochastic aspects of Ferguson’s COVID model, and probably would find them difficult to understand, and so have nothing to say about them in and of themselves. The problem isn’t the fancy math, it’s the use of models as if they provide definitive evidence, and the misuse of the actual evidence by that model.

Sorry about that, just needed to get that off my chest. Onward: Like almost everything I know, I developed an understanding of the structure of scientific papers by – sitting down? – reading scientific papers. Crazy, huh? Goes a little something like this, with plenty of variation:

  • Abstract: quick summary of the question you’re addressing, the methods and data sources you used, and your conclusions.
  • Methods: How you did it. The what, when, how of your approach. More detail on your method, including info on your data sources, and controls used to eliminate or reduce problems.
  • Conclusions: What you learned.
  • Discussion: What does it all mean? More important, this is where you should talk about issues with the study: limitations, possible criticism, places where important things have not been figured out.
  • References: This is where you acknowledge the work of others you used to get your study done. Important from a social perspective; all but meaningless from a scientific perspective. I.e., the evidence does the talking with no regard to the prestige of other studies. But it’s bad form to fail to acknowledge the giants upon whose shoulders you’re standing.

Something like that. Most of the time, my reading stops with the Abstract: from such a summary, you can generally easily tell if the study says what the press are saying it says, and, often, tell whether the researcher has a clue about what he is doing. It is from reading abstracts that I propose my general rule: if you heard about it on the news, it’s wrong, by which I mean that either the reporter misrepresented what the study was claiming, the research itself was hopelessly flawed, or both. The first, that the reporter didn’t understand it, is almost always true.

Here’s a description of a proper scientific study format I just found with 30 seconds of searching:

Title–subject and what aspect of the subject was studied.

Abstract–summary of paper: The main reason for the study, the primary results, the main conclusions

Introduction–why the study was undertaken

Methods and Materials–how the study was undertaken

Results–what was found

Discussion–why these results could be significant (what the reasons might be for the patterns found or not found)

from Colorado State University

So, more or less, what I just described. Problem: an indispensable part is not expressed clearly: “what the reasons might be for the patterns found or not found.” Weak. What you want, instead, is – back to the well – what Feynman describes:

…if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated.

Details that could throw doubt on your interpretation must be given, if you know them.  You must do the best you can—if you know anything at all wrong, or possibly wrong—to explain it. 

From that 1974 Caltech commencement address again.

With that in mind, let’s look at the Discussion section of the paper referenced yesterday, Excess Deaths Associated with COVID-19, by Age and Race and Ethnicity — United States, January 26–October 3, 2020, where, ideally, you’d want to ‘discuss’ the issues Feynman brings out. Here is the relevant section:

The findings in this report are subject to at least five limitations. First, the weighting of provisional NVSS mortality data might not fully account for reporting lags, particularly in recent weeks. Estimated numbers of deaths in the most recent weeks are likely underestimated and will increase as more data become available. Second, there is uncertainty associated with the models used to generate the expected numbers of deaths in a given week. A range of values for excess death estimates is provided elsewhere (7), but these ranges might not reflect all of the sources of uncertainty, such as the completeness of provisional data. Third, different methods or models for estimating the expected numbers of deaths might lead to different results. Estimates of the number or percentage of deaths above average levels by race/ethnicity and age reported here might not sum to the total numbers of excess deaths reported elsewhere, which might have been estimated using different methodologies. Fourth, using the average numbers of deaths from past years might underestimate the total expected numbers because of population growth or aging, or because of increasing trends in certain causes such as drug overdose mortality. Finally, estimates of excess deaths attributed to COVID-19 might underestimate the actual number directly attributable to COVID-19, because deaths from other causes might represent misclassified COVID-19–related deaths or deaths indirectly caused by the pandemic. Specifically, deaths from circulatory diseases, Alzheimer disease and dementia, and respiratory diseases have increased in 2020 relative to past years (7), and it is unclear to what extent these represent misclassified COVID-19 deaths or deaths indirectly related to the pandemic (e.g., because of disruptions in health care access or utilization).

Despite these limitations, however, this report demonstrates important trends and demographic patterns in excess deaths that occurred during the COVID-19 pandemic. These results provide more information about deaths during the COVID-19 pandemic and inform public health messaging and mitigation efforts focused on the prevention of infection and mortality directly or indirectly associated with the COVID-19 pandemic and the elimination of health inequities. CDC continues to recommend the use of masks, frequent handwashing, and maintenance of social distancing to prevent COVID-19.

(It’s also customary to plead for more money, er, suggest that further research is needed, at this point in the paper.)

Let’s give them credit: the authors did mention (but did not elaborate on the degree to which this is important) that the method used to estimate expected deaths will affect the results (point 3) and that (point 4) their method – averaging weekly deaths over the last 5 years – “might (might?) underestimate the total expected numbers” due to factors that don’t include ignoring the obvious trend over that same 5-year period. But hey, they did in fact mention possible issues in the process of dismissing them without argument.
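To make the point-4 complaint concrete, here is a minimal sketch of why a flat 5-year average lowballs expected deaths when the counts have been climbing. Every number below is invented for illustration and is not CDC data, and the straight-line trend is just my stand-in for "account for the trend somehow," not anything the paper's authors did or should necessarily have done.

```python
# Hypothetical death counts for the same calendar week in five prior years.
# Invented for illustration only; these are NOT CDC figures.
past_deaths = [50_000, 51_000, 52_000, 53_000, 54_000]

# Hypothetical count for the same week in the current year.
observed = 56_000


def average_baseline(history):
    """Expected deaths as the plain mean of the prior years."""
    return sum(history) / len(history)


def trend_baseline(history):
    """Expected deaths from a least-squares line extended one year ahead."""
    n = len(history)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    # Ordinary least-squares slope and intercept for y = intercept + slope * x.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    # The current year sits at x = n, one step past the last prior year.
    return intercept + slope * n


excess_vs_average = observed - average_baseline(past_deaths)
excess_vs_trend = observed - trend_baseline(past_deaths)

print(f"Excess vs. 5-year average: {excess_vs_average:.0f}")
print(f"Excess vs. trend line:     {excess_vs_trend:.0f}")
```

With these made-up inputs, the flat average expects 52,000 deaths and the trend line expects 55,000, so the same observed count yields 4,000 "excess" deaths under one baseline and 1,000 under the other. The real size of the gap depends entirely on the real data, which is exactly what you would want the Discussion section to quantify.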

Notice anything missing here? Hint: the authors lead and conclude with “limitations” that might cause their numbers to be understated, but stop there. Any discussion of how the numbers might be *overstating* the issue, and to what degree? As if no one anywhere has pointed out any possible lacks or flaws in the methods or data sources that might lean against the conclusions? Did they ask anybody? Like, other people in their own organization? These people, for example? Here, from the CDC’s website, is a study titled: Mental Health, Substance Use, and Suicidal Ideation During the COVID-19 Pandemic — United States, June 24–30, 2020. Just the first thing that came up when I searched “CDC stress study.” Here’s the Summary:

What is already known about this topic?

Communities have faced mental health challenges related to COVID-19–associated morbidity, mortality, and mitigation activities.

What is added by this report?

During June 24–30, 2020, U.S. adults reported considerably elevated adverse mental health conditions associated with COVID-19. Younger adults, racial/ethnic minorities, essential workers, and unpaid adult caregivers reported having experienced disproportionately worse mental health outcomes, increased substance use, and elevated suicidal ideation.

What are the implications for public health practice?

The public health response to the COVID-19 pandemic should increase intervention and prevention efforts to address associated mental health conditions. Community-level efforts, including health communication strategies, should prioritize young adults, racial/ethnic minorities, essential workers, and unpaid adult caregivers.

Ignoring the egregious problems with this second study, except to say it’s a classic “I’ve got a hammer – massive intervention in people’s lives in the name of health – so every problem is a nail requiring massive intervention in people’s lives,” we simply note: the CDC has long been aware that stress is a health problem that sometimes kills people.

Yet the excess deaths study doesn’t see fit to mention the possibility that, in addition to COVID, stress is also killing people, nor to take the next obvious step and ask: what sort of people is stress likely to kill? And then map that to the data. Nope.

Conclusion: Lysenkoism. The authors can’t even acknowledge the possibility that lockdowns (and masks!) might just cause some deaths someplace? That massive increases in stress levels brought about by the mitigation efforts themselves and incessant fear mongering might have some effect? The desired result was known, the study was written to conform to that result, and factors that might steer away from the desired result are simply ignored.

Discussion: I could be wrong. Tell me how.

Author: Joseph Moore

Enough with the smarty-pants Dante quote. Just some opinionated blogger dude.
