My review of “Determining the quality and complexity of NGS…”

Written by: C. Titus Brown

Primary Source: Living in an Ivory Basement

I was a reviewer on “Determining the quality and complexity of next-generation sequencing data without a reference genome” by Anvar et al., PDF here. Here is the top bit of my review.

One interesting side note – the authors originally named their tool kMer and I complained about it in my review. And they renamed it to kPal! Which is much less confusing.

The authors show that a specific set of low-k k-mer profile analysis tools can identify biases and errors in sequencing samples, as well as estimate distances between metagenomic samples. All of this is done independently of reference genomes/transcriptomes, which is very important.
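
For readers who have not worked with k-mer profiles, here is a minimal sketch of the general idea: count every k-mer in a set of reads, normalise the counts to frequencies, and compare two samples by a distance between their profiles. This is a generic illustration and not the tool's actual implementation; the function names `kmer_profile` and `profile_distance` are hypothetical, and plain Euclidean distance is an assumption chosen for simplicity rather than the measure used in the paper.

```python
from collections import Counter
from itertools import product


def kmer_profile(reads, k):
    """Count every k-mer made only of A/C/G/T across all reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def profile_distance(p, q, k):
    """Euclidean distance between two frequency-normalised k-mer profiles."""
    total_p = sum(p.values()) or 1  # avoid division by zero on empty profiles
    total_q = sum(q.values()) or 1
    squared = 0.0
    # Walk over all 4**k possible k-mers; a Counter returns 0 for missing keys.
    for kmer in map("".join, product("ACGT", repeat=k)):
        diff = p[kmer] / total_p - q[kmer] / total_q
        squared += diff * diff
    return squared ** 0.5


if __name__ == "__main__":
    sample_a = kmer_profile(["ACGTACGT", "ACGTTTGA"], k=3)
    sample_b = kmer_profile(["GGGGCCCC", "GGCCGGCC"], k=3)
    print(profile_distance(sample_a, sample_b, k=3))
```

Nothing here needs a reference genome: the profile is computed from the reads alone, which is the property that makes this family of methods useful for quality assessment.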

The paper is well written and quite clear. I found it easy to read and easy to understand. The work is also novel, I believe.

Highlights of the paper for me included a solid discussion of k-mer size selection, a thorough exploration of how to compare various k-mer-based statistics, and the excellent quality evaluation bit (Figure 3).

I was a bit surprised by the shift from quality assessment to metagenomic analysis, but there is an underlying continuity in the approach that makes this a reasonable transition. There might be a way to update the text to make this transition easier for the non-bioinformatic reader.

It’s hard to pick out one particularly important result; the two biggest results are (a) k-mer-based, reference-free quality evaluation works quite well, and (b) k-mer analysis does a great job of grouping metagenome samples. The theory work on transitioning between k-mer sizes is potentially of great technical interest as well.
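
To give a feel for what moving between k-mer sizes can involve, here is a small sketch of one simple approach: collapsing a profile at size k to size k-1 by summing the counts of all k-mers that share the same first k-1 bases. Whether this matches the paper's treatment is an assumption on my part, and `shrink_profile` is a hypothetical name used only for illustration.

```python
from collections import Counter


def shrink_profile(profile, k):
    """Collapse a k-mer count profile to (k-1)-mers by summing over the last base."""
    if k < 2:
        raise ValueError("k must be at least 2 to shrink a profile")
    shrunk = Counter()
    for kmer, count in profile.items():
        # Every k-mer contributes its count to its (k-1)-base prefix.
        shrunk[kmer[:k - 1]] += count
    return shrunk


if __name__ == "__main__":
    # 3-mers of the single read "ACGTA": ACG, CGT, GTA
    three_mers = Counter({"ACG": 1, "CGT": 1, "GTA": 1})
    print(shrink_profile(three_mers, k=3))
```

Note the edge effect: the read "ACGTA" also contains the 2-mer TA at its very end, but no 3-mer starts there, so the shrunk profile misses it. With many long reads this discrepancy is small relative to the total count, but it means a shrunk profile is an approximation of one counted directly at the smaller k.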

C. Titus Brown
C. Titus Brown is an assistant professor in the Department of Computer Science and Engineering and the Department of Microbiology and Molecular Genetics. He earned his PhD ('06) in developmental molecular biology from the California Institute of Technology. Brown is director of the laboratory for Genomics, Evolution, and Development (GED) at Michigan State University. He is a member of the Python Software Foundation and an active contributor to the open source software community. His research interests include computational biology, bioinformatics, open source software development, and software engineering.