Written by: Stephen Hsu
Primary Source: Information Processing, 11/06/2018.
In the survey reported below, roughly 1 in 4 consulting biostatisticians reported being asked to commit scientific fraud. I don't know whether this bad behavior is more prevalent in industry or in academia, but I am not surprised by the results.
I do not accept the claim that researchers in data-driven areas can afford to be ignorant of statistics. It is common practice to outsource statistical analysis to people like the "consulting biostatisticians" surveyed below, but scientists who do not understand statistics will be ineffective at planning future research and at grasping the implications of results in their own field. Witness the candidate-gene and missing-heritability nonsense to which genetics has been subject for the last decade.
I cannot count the number of times, in talking to a scientist with a weak quantitative background, that I have performed, to their amazement, a quick back-of-the-envelope analysis of a statistical design or of new results. This kind of quick estimate is essential to judge whether the results in question should be trusted, or whether a prospective experiment is worth doing. The fact that they cannot follow my simple calculation means that they literally do not understand how inference in their own field should be performed.
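As one illustration of the kind of back-of-the-envelope check meant here (the specific formula and function are my own example, not one from the surveyed study): the standard normal-approximation formula for the per-group sample size of a two-sample comparison of means tells you immediately whether a proposed study can plausibly detect the effect it is chasing.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sample comparison of
    means, with standardized effect size d, two-sided significance level
    alpha, and the desired power (normal approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to the power
    return 2 * (z_alpha + z_beta) ** 2 / d ** 2

# A "small" effect (d = 0.2) needs roughly 400 subjects per arm,
# so a study with 40 per arm is hopelessly underpowered for it.
print(ceil(n_per_group(0.2)))   # 393
print(ceil(n_per_group(0.8)))   # 25
```

A two-minute calculation like this is often enough to see that a published positive result from a tiny sample should not be trusted.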
(Annals of Internal Medicine, 554–558. Published 16 October 2018. DOI: 10.7326/M18-1230)
Of 522 consulting biostatisticians contacted, 390 provided sufficient responses: a completion rate of 74.7%. The 4 most frequently reported inappropriate requests rated as “most severe” by at least 20% of the respondents were, in order of frequency, removing or altering some data records to better support the research hypothesis; interpreting the statistical findings on the basis of expectation, not actual results; not reporting the presence of key missing data that might bias the results; and ignoring violations of assumptions that would change results from positive to negative. These requests were reported most often by younger biostatisticians.
This kind of behavior is consistent with the generally low rate of replication for results in biomedical science, even those published in top journals:
What is medicine’s 5 sigma? (Editorial in the Lancet)… much of the [BIOMEDICAL] scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, [BIOMEDICAL] science has taken a turn towards darkness. As one participant put it, “poor methods get results”. The Academy of Medical Sciences, Medical Research Council, and Biotechnology and Biological Sciences Research Council have now put their reputational weight behind an investigation into these questionable research practices. The apparent endemicity of bad research behaviour is alarming. In their quest for telling a compelling story, scientists too often sculpt data to fit their preferred theory of the world. …