Posted the Chick Genome Improvement Grant

Written by: C. Titus Brown

Primary Source: Living in an Ivory Basement

I’ve just posted, on the lab’s research page, the narrative for a recently funded USDA grant on improving the quality of the chick genome assembly. The issues are laid out in detail in the grant but, basically, the question is: how can we improve the quality of the assembly? The answer, we think, is to pursue a range of strategies, including additional sequencing to get at the microchromosomes and improved assembly merging and scaffolding tools capable of dealing with a range of sequencing data types.

For this genome in particular, we now have Sanger, 454, Illumina, PacBio, and Moleculo data. How do you cross-evaluate that data, much less combine it all? Interesting questions!

We do know that reasonably sizeable chunks of the chick genome are missing or unscaffolded, in part because they’re hard to sequence and in part because they’re hard to assemble. The PacBio data is already leading to significant improvement over galGal4, and now we’re trying to figure out how to make use of the Moleculo data, too.

One particularly interesting approach I’m working on is to release some or all of the data so that assembler authors can experiment with it. In particular, it should be possible to release a small subset of the data covering whatever is not represented in the current assembly; this certainly includes a bunch of microchromosomes. I’ll keep you posted.
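
To make the "subset of whatever is not represented" idea a bit more concrete, here is a minimal sketch in Python. It is an illustration only, not the pipeline we will actually use: it assumes you already have the names of the reads that aligned to the current assembly (from whatever aligner you like), and the function names `parse_fasta` and `unrepresented_reads` are made up for this example.

```python
def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def unrepresented_reads(fasta_lines, mapped_names):
    """Yield reads whose names are NOT among those that aligned to the assembly.

    `mapped_names` is assumed to come from a separate alignment step.
    """
    mapped = set(mapped_names)
    for name, seq in parse_fasta(fasta_lines):
        if name not in mapped:
            yield name, seq


# Toy data: three reads, two of which aligned to the (hypothetical) assembly.
reads = [">r1 len=8", "ACGTACGT", ">r2", "GGGG", "CCCC", ">r3", "TTTTAAAA"]
subset = list(unrepresented_reads(reads, mapped_names={"r1", "r3"}))
print(subset)
```

In practice the hard part is deciding what counts as "represented" (partial alignments, repeats, and so on); the filtering itself is the easy bit.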


p.s. Remember when I didn’t work on euk genome assembly? Yeah, me too. I can already tell I’m going to long for the days of “simple” metagenome and transcriptome assembly work ;)

C. Titus Brown
C. Titus Brown is an assistant professor in the Department of Computer Science and Engineering and the Department of Microbiology and Molecular Genetics. He earned his PhD ('06) in developmental molecular biology from the California Institute of Technology. Brown is director of the laboratory for Genomics, Evolution, and Development (GED) at Michigan State University. He is a member of the Python Software Foundation and an active contributor to the open source software community. His research interests include computational biology, bioinformatics, open source software development, and software engineering.