Wednesday, December 17, 2014

Is DNA barcoding dead?

On a recent trip to the Natural History Museum, London, the subject of DNA barcoding came up, and I got the clear impression that people at the NHM thought classical DNA barcoding was pretty much irrelevant, given recent developments in sequencing technology. For example, why sequence just COI when you can use shotgun sequencing to get the whole mitogenome? I was a little taken aback, although this is a view that's getting some traction, e.g. [1,2]. There is also the more radical view that focussing on phylogenetics is itself less useful than, say, "evolutionary gene networks" based on massive sequencing of multiple markers [3].

At the risk of seeming old-fashioned in liking DNA barcoding, I think there's a bigger issue at stake (see also [4]). DNA barcoding isn't simply a case of using a single, short marker to identify animal species. It's the fact that it's a globalised, standardised approach that makes it so powerful. In the wonderful book "A Vast Machine" [5], Paul Edwards talks about "global data" and "making data global". The idea is that not only do we want data that is global in coverage ("global data"), but we want data that can be integrated ("making data global"). In other words, not only do we want data from everywhere in the world, say, we also need an agreed coordinate system (e.g., latitude and longitude) in order to put each data item in a global context. DNA barcoding makes data global by standardising what a barcode is (a given fragment of COI), and what metadata needs to be associated with a sequence to be a barcode (e.g., latitude and longitude) (see, e.g. Guest post: response to "Putting GenBank Data on the Map"). By insisting on this standardisation, we potentially sacrifice the kinds of cool things that can be done with metagenomics, but the tradeoff is that we can do things like put a million barcodes on a map:

[Figure: map of barcode records in BOLD]

To regard barcoding as dead or outdated we'd need an equivalent effort to make metagenomic sequences of animals global in the same way that DNA barcoding is. Now, it may well be that the economics of sequencing is such that it is just as cheap to shotgun sequence mitogenomes, say, as to extract single markers such as COI. If that's the case, and we can get a standardised suite of markers across all taxa, and we can do this across museum collections (like Hebert et al.'s [6] DNA barcoding "blitz" of 41,650 specimens in a butterfly collection), then I'm all for it. But it's not clear to me that this is the case.

This also leaves aside the issue of standardising other things, such as the metadata. For instance, Dowton et al. [2] state that "recent developments make a barcoding approach that utilizes a single locus outdated" (see Collins and Cruickshank [4] for a response). Dowton et al. make use of data they published earlier [7,8]. Out of curiosity I looked at some of these sequences in GenBank, such as JN964715. This is a COI sequence, in other words, a classical DNA barcode. Unfortunately, it lacks a latitude and longitude. By leaving off latitude and longitude (despite the authors having this information, as it is in the supplemental material for [7]) the authors have missed an opportunity to make their data global.
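As a rough illustration of the kind of check described above, here is a minimal Python sketch that looks for the INSDC `/lat_lon` qualifier in the text of a GenBank flat file. The function name `extract_lat_lon` and both sample records are hypothetical, made up for this example; they are not the contents of JN964715 or any other real record, and the sketch assumes the qualifier uses the usual "degrees N|S degrees E|W" form.

```python
import re

# The INSDC /lat_lon qualifier looks like: /lat_lon="47.94 N 28.12 W"
LAT_LON = re.compile(
    r'/lat_lon="\s*(\d+(?:\.\d+)?)\s+([NS])\s+(\d+(?:\.\d+)?)\s+([EW])\s*"'
)


def extract_lat_lon(flatfile_text):
    """Return (latitude, longitude) in signed decimal degrees, or None.

    Hypothetical helper: scans flat-file text for the first /lat_lon
    qualifier and converts it to signed values (south and west negative).
    """
    match = LAT_LON.search(flatfile_text)
    if match is None:
        return None
    lat, ns, lon, ew = match.groups()
    latitude = float(lat) * (1 if ns == "N" else -1)
    longitude = float(lon) * (1 if ew == "E" else -1)
    return latitude, longitude


# Hypothetical source features, invented for illustration only.
WITH_COORDS = '''     source          1..658
                     /organism="Example species"
                     /mol_type="genomic DNA"
                     /country="Australia"
                     /lat_lon="33.87 S 151.21 E"
'''

WITHOUT_COORDS = '''     source          1..658
                     /organism="Example species"
                     /mol_type="genomic DNA"
                     /country="Australia"
'''

if __name__ == "__main__":
    for label, record in (("with", WITH_COORDS), ("without", WITHOUT_COORDS)):
        coords = extract_lat_lon(record)
        status = coords if coords is not None else "no /lat_lon: cannot be mapped"
        print(label, "->", status)
```

A record that returns `None` here can still be a perfectly good COI sequence; it just can't be placed on a map without going back to the paper or its supplementary material.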

For me the take-home message here is that whether you think DNA barcoding is outdated depends in part on what your goal is. Clearly barcoding as a sequencing technology has been superseded by more recent developments. But to dismiss it on those grounds is to miss the bigger picture of what is at stake, namely the chance to have comparable data for millions of samples across the globe.

References

  1. Taylor, H. R., & Harris, W. E. (2012, February 22). An emergent science on the brink of irrelevance: a review of the past 8 years of DNA barcoding. Molecular Ecology Resources. Wiley-Blackwell. doi:10.1111/j.1755-0998.2012.03119.x
  2. Dowton, M., Meiklejohn, K., Cameron, S. L., & Wallman, J. (2014, March 28). A Preliminary Framework for DNA Barcoding, Incorporating the Multispecies Coalescent. Systematic Biology. Oxford University Press (OUP). doi:10.1093/sysbio/syu028
  3. Bittner, L., Halary, S., Payri, C., Cruaud, C., de Reviers, B., Lopez, P., & Bapteste, E. (2010). Some considerations for analyzing biodiversity using integrative metagenomics and gene networks. Biology Direct. Springer Science + Business Media. doi:10.1186/1745-6150-5-47
  4. Collins, R. A., & Cruickshank, R. H. (2014, August 12). Known Knowns, Known Unknowns, Unknown Unknowns and Unknown Knowns in DNA Barcoding: A Comment on Dowton et al. Systematic Biology. Oxford University Press (OUP). doi:10.1093/sysbio/syu060
  5. Edwards, P. N. A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming. MIT Press. ISBN: 9780262013925
  6. Hebert, P. D. N., deWaard, J. R., Zakharov, E. V., Prosser, S. W. J., Sones, J. E., McKeown, J. T. A., Mantle, B., et al. (2013, July 10). A DNA “Barcode Blitz”: Rapid Digitization and Sequencing of a Natural History Collection. (S.-O. Kolokotronis, Ed.) PLoS ONE. Public Library of Science (PLoS). doi:10.1371/journal.pone.0068535
  7. Meiklejohn, K. A., Wallman, J. F., Pape, T., Cameron, S. L., & Dowton, M. (2013, October). Utility of COI, CAD and morphological data for resolving relationships within the genus Sarcophaga (sensu lato) (Diptera: Sarcophagidae): A preliminary study. Molecular Phylogenetics and Evolution. Elsevier BV. doi:10.1016/j.ympev.2013.04.034
  8. Meiklejohn, K. A., Wallman, J. F., Cameron, S. L., & Dowton, M. (2012). Comprehensive evaluation of DNA barcoding for the molecular species identification of forensically important Australian Sarcophagidae (Diptera). Invertebrate Systematics. CSIRO Publishing. doi:10.1071/is12008