A Saturday morning conversation about publishing inconclusions

Written by: C. Titus Brown

Primary Source: Living in an Ivory Basement

Here’s an excerpt from an e-mail to a student whose committee I’m on; they were asking me about a comment their advisor had made that they shouldn’t put a result in a paper because “It’ll confuse the reviewer.”

One thing to keep in mind is that communicating the results _is_ important. “It’ll confuse the reviewer” is not always shorthand for “well, it’s true but we can’t say it because it’s too confusing”, which is how I think you’re interpreting your advisor’s comment; it can also be shorthand for “well, we don’t really have the depth of data/information to understand why the result is this way, so there’s something more complicated going on, but we can’t talk about it because we don’t understand it ourselves.” Results like these can lead to the most productive bits of a project if followed up (but it takes a lot of time to do so…)

Their response:

I’m curious, however, if we don’t understand what’s going on ourselves, shouldn’t that be all the more reason to publish something? Because then other people with more knowledge can read it and they may know what’s going on? Or at least help us?

And my response:

Well, that’s definitely not how it works. Let me see if I can think of some reasons why…

First, it’s much easier to get confused than it is to fight your way out of confusion, and fighting your way out is where you usually learn something. So there are a lot more people who are confused than people who have learned something, and the latter is considered more interesting and worthy of communication than the former.

Second, writing papers is a lot of work and takes a lot of time. If you’re confused (and hence probably wrong about any conclusions) it’s probably not worth the effort…

Third, reading papers is also a lot of work! Most papers are never read seriously by anyone other than the reviewers. So doubling or tripling the number of papers to include confusing or inconclusive data would probably not be helpful.

Fourth, most conclusions require multiple lines of evidence, most of which are not developed until you have at least some solid hypotheses about why you’re seeing what you’re seeing from one line of evidence. So a lot of those pubs would be wrong.

Fifth, failure is frequently underdetermined; that is, we may have incorrect results and not know it, or not know why. It’s rarely worth chasing down exactly why something didn’t work unless it fails reproducibly and it’s critical to your results.

That having been said, there’s a movement to make publishing data easier and bring data citations into standard practice, which I think is partly targeted at the same problem you’re asking about.

What am I missing?

thanks, –titus

C. Titus Brown
C. Titus Brown is an assistant professor in the Department of Computer Science and Engineering and the Department of Microbiology and Molecular Genetics. He earned his PhD ('06) in developmental molecular biology from the California Institute of Technology. Brown is director of the laboratory for Genomics, Evolution, and Development (GED) at Michigan State University. He is a member of the Python Software Foundation and an active contributor to the open source software community. His research interests include computational biology, bioinformatics, open source software development, and software engineering.