Mar 3, 2017 7:03 AM

Adam Rutherford: 'What better way to store data than zipped in DNA files'

Using DNA for information storage has massive advantages. It is a future-proof format: DNA is the stuff of life, and the technology for writing and reading DNA is only going to improve


Industry in the 21st century will be defined by our abilities to manipulate, design and invent new tech based on living systems.

Synthetic cells, commoditised genetic circuitry and now DNA itself are being added to the tools drawn from evolution, but remixed and repurposed by design. We celebrated the 60th anniversary in April of Crick and Watson's paper on the iconic structure of that universal molecule of life, but let's not forget that in essence the double helix is a data storage format. Since 1953, we have decoded life's source code, cut and pasted it across species and read entire genomes of dozens of creatures, including ourselves.

We're now eschewing the natural language of DNA altogether and upgrading it into an immense data format. Hard drives require power; magnetic tape degrades after a decade. So archivists are constantly looking for permanent ways to store the world's information, of which there is currently something like three zettabytes. In cells, DNA requires power to be copied and read, but in death it's remarkably stable.

A mere 500 years old, the bones of King Richard III were recently identified using his DNA.

Neanderthals joined the genome club in 2010 when their complete DNA was read from 44,000-year-old bones, and the genome of their frequent prey – the woolly mammoth – was extracted from 20,000-year-old hairs bought on eBay. With this permanence in mind, scientists have been thinking about how to use DNA simply for data storage. Craig Venter did it with typical bravado in 2010 with his synthetic bacterium Mycoplasma mycoides JCVI-syn1.0, aka Synthia.

That bacterium had several Easter eggs built into its machine-made genome, including two quotations, from James Joyce and Robert Oppenheimer, and an accidental misquotation from Richard Feynman.

Between September 2012 and January this year, DNA storage took its first steps into a new age. First, Harvard's George Church encoded an entire 53,000-word book in DNA. And, at the beginning of 2013, a team led by Ewan Birney from the European Bioinformatics Institute encoded all 154 Shakespeare sonnets, an audio clip of Martin Luther King's "I have a dream" speech, Crick and Watson's 1953 paper, and more.

So far these techniques are only useful for archiving, as DNA is slow and expensive to write and read. But, along with its durability, using DNA for information storage has two massive advantages. It is a future-proof format: DNA is the stuff of life, and there will never be a time when we won't study it. And because of that, the technology for writing and reading DNA is only going to improve.

How's this for a postmodern idea: there is one science that splurges colossal volumes of data: genomics. The first-draft human genome in 2001 was culled from a handful of people, and represented the three-billion-letter code of a generic person. But whereas the DNA of all humankind is 99.9 per cent similar, individuals are encoded in the wealth of the remainder. What has been happening in genomics since has been the sequencing of thousands more individuals, to understand our uniqueness and disease. The result has been a torrent of sequence data. What better way to store it than zipped in DNA files?

Adam Rutherford is a geneticist and writer. His book, Creation (Viking), is out now.

This article was originally published by WIRED UK