Imagine…

Written by: C. Titus Brown

Primary Source: Living in an Ivory Basement

Links, software, thoughts — all solicited! Send ’em to me, t@idyll.org.

Imagine… a rolling 48-hour hackathon, internationally teleconferenced, on reproducing analyses in preprints and papers. Each room of contributors could hack on things collaboratively while awake, then pass the work on to others in overlapping timezones and go to sleep. The next day, they could wake up to a set of pull requests, documentation, tests, and learned information, together with a friendly set of about-to-sleep people who can tell them what’s going on. Work could be done on cloud machines in a standardized environment, with literate computing and version control. (Hat tip to Greg Wilson, who introduced me to this concept via Random Hacks of Kindness.)

Imagine… open, remixable education materials on building reproducible computational analyses, with lesson plans, homework, and workshop exercises under CC0. Video lectures would be posted, together with the “lecture script” used to produce them. (Hat tip to Jake Vanderplas.)

Imagine… a monthly journal club on reproducibility and reproducibility engineering, run the way the IPython lab meetings are: via Google Hangout, broadcast to YouTube. We could pick apart papers that have been implemented reproducibly, run them, understand the design tradeoffs, and invite the authors to discuss their choices and talk about what they would do now. (Hat tip to Randy LeVeque, Bill Howe, and Steven Roberts, with whom I conversed about this in October; and Olga Botvinnik and Gabe Pratt, who developed the idea further at PyCon.)

Imagine… a declarative metadata standard that you can use to tell a Linux VM how to download your data, execute your computational analysis, and spin up an interface to a literate computing environment like IPython Notebook with the analysis preloaded. Then we can provide buttons on scientific papers that say “run this analysis on Rackspace! or Amazon! Estimated cost: $25.” (Hat tip to Fernando Perez and Jake Vanderplas, who helped me think through this.)
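To make the idea a bit more concrete, here is a minimal sketch of what such metadata might look like and how a VM could turn it into an ordered list of commands. No such standard exists in this post: the field names (`data`, `steps`, `notebook`), the example URL, and the `build_plan` helper are all hypothetical, and the notebook launch command is only a placeholder for whatever literate computing environment is used.

```python
import shlex

# Hypothetical metadata. Every field name and value here is an illustrative
# assumption, not part of any existing standard.
ANALYSIS = {
    "data": [
        {"url": "https://example.org/reads.fq.gz", "dest": "data/reads.fq.gz"},
    ],
    "steps": ["make -C analysis all"],
    "notebook": "analysis/figures.ipynb",
}


def build_plan(meta):
    """Turn declarative metadata into an ordered list of shell commands."""
    plan = []
    # 1. Download each declared data file.
    for item in meta.get("data", []):
        plan.append(
            "curl -L -o {} {}".format(
                shlex.quote(item["dest"]), shlex.quote(item["url"])
            )
        )
    # 2. Run the analysis steps in the order they are declared.
    plan.extend(meta.get("steps", []))
    # 3. Open the notebook with the analysis preloaded (placeholder command).
    if "notebook" in meta:
        plan.append("jupyter notebook " + shlex.quote(meta["notebook"]))
    return plan


if __name__ == "__main__":
    for command in build_plan(ANALYSIS):
        print(command)
```

Because the description is plain data rather than a script, the same file could in principle be read by a cloud launcher, a cost estimator, or a journal's checking service.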

Imagine… automated integration tests for papers, where you provide the metadata to run your analysis (the declarative metadata described above) while you’re working on your paper, and a service automatically pulls down your analysis source and data, runs it, and generates your figures for you to check. Then, when the paper is ready to submit, the journal takes your metadata and verifies the analysis itself, and passes it on to reviewers with a little “reproducible!” tick mark. (Hat tip to Fernando Perez and Jake Vanderplas, who helped me think through this.)
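As a rough illustration of the checking step, the sketch below runs a list of analysis commands in a working directory and reports any expected figure that is missing or empty. The `check_analysis` function and its arguments are hypothetical, and a real service would also need to fetch the source and data, sandbox the run, and compare figure contents rather than just their presence.

```python
import os
import subprocess


def check_analysis(steps, expected_outputs, workdir):
    """Run each step in workdir, then list expected outputs that are missing or empty."""
    for command in steps:
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(command, shell=True, cwd=workdir, check=True)
    problems = []
    for relative_path in expected_outputs:
        path = os.path.join(workdir, relative_path)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(relative_path)
    return problems
```

An empty list back would correspond to the “reproducible!” tick mark; anything else is a list of figures for the author to go and look at.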

Also see a previous list of ideas: http://ivory.idyll.org/blog/w4s-tech-wanted.html

–titus

C. Titus Brown
C. Titus Brown is an assistant professor in the Department of Computer Science and Engineering and the Department of Microbiology and Molecular Genetics. He earned his PhD ('06) in developmental molecular biology from the California Institute of Technology. Brown is director of the laboratory for Genomics, Evolution, and Development (GED) at Michigan State University. He is a member of the Python Software Foundation and an active contributor to the open source software community. His research interests include computational biology, bioinformatics, open source software development, and software engineering.