How New Sequencing Technologies Are Unravelling Rare Genetic Diseases

Cross-posted (in a slimmed-down form) on the Wellcome Trust blog.

Rare diseases matter

[Image: A causal needle in the genetic haystack]

There are thousands of rare genetic diseases, ranging from the widely known (such as Huntington’s disease, an adult-onset brain disorder) to the obscure (such as fibrodysplasia ossificans progressiva, a disease affecting fewer than one in a million people, in which the patient’s muscles are slowly replaced with bone). While individually rare, these diseases collectively create a tremendous burden of suffering: childhood-onset single-gene disorders affect nearly four children in every thousand live births, and are responsible for more than 10% of paediatric hospital admissions.

Previously, finding the mutations that cause these rare diseases was a lengthy process, relying on a technique called linkage analysis. Firstly, DNA samples were collected from large families affected by the disease. Secondly, these samples were examined at thousands of highly variable sites across the genome, to look for markers that were always found in patients but not in their healthy family members. Finally, researchers needed to comb through the dozens or hundreds of genes close to these 'linked' markers to look for mutations that might be disease-causing.

This whole process takes time, money, and more than a little luck. In addition, for certain classes of genetic disease - those that have only been found in small families, or are caused by genetic changes that arise spontaneously in patients rather than being inherited from either parent - linkage analysis is impossible. That means that while this technique has successfully deciphered the genetic basis of thousands of rare genetic diseases, many remain unexplained.

Finding the mutations behind these diseases is of far more than just academic interest. For patients and their families, a full genetic diagnosis can provide a sense of closure after years or decades of enduring medical tests with no clear answers. In some cases identifying the responsible gene can provide important clues about the mechanism of the disease, and perhaps even point to potential therapies. However, we also shouldn't underestimate the raw scientific value of these studies: every gene we link to a rare Mendelian disease increases our understanding of the ways genes work together to build a human being.

Towards a solution

Over the last few years, rapid advances in DNA sequencing technology have begun to provide a cost-effective alternative to linkage analysis. Rather than first looking for the regions of the genome that are linked to the disease, cheap sequencing offers a simple, brute force solution: look at all of the genes in a patient’s genome, see which ones contain a likely damaging mutation, and then investigate those genes to see which is most likely to cause the patient’s disease.

New sequencing technologies have resulted in a dramatic drop in the cost of reading a person’s DNA. However, it’s still expensive to sequence their entire genome - all six billion letters of it will currently set you back somewhere in the order of US$20,000 (£12,200). Fortunately, most rare diseases (around 80%, by some estimates) are caused by mutations found in a relatively small fraction of the genome: the pieces that code for proteins, known collectively as the exome.

These pieces of protein-coding sequence are scattered across the genome, but together make up less than 2% of its total length. Using an approach called sequence capture - in which tiny DNA probes are used to pull out the protein-coding regions in a patient’s DNA, letting the remainder wash away - it is possible to extract and sequence only these regions. The small size of an exome means it can be sequenced from a patient for just a few thousand pounds - in many cases, substantially less than the cost of a series of single-gene tests.
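To give a feel for the scale involved, here is the back-of-the-envelope arithmetic as a short Python snippet. It uses only the two figures quoted above (six billion letters, and less than 2% of them protein-coding); the variable names are mine, and the result is an upper bound rather than a measured exome size.

```python
GENOME_LETTERS = 6_000_000_000   # genome size quoted in the text
EXOME_PERCENT_MAX = 2            # "less than 2%", so treated as an upper bound

# Integer arithmetic keeps the result exact
exome_letters_max = GENOME_LETTERS * EXOME_PERCENT_MAX // 100
print(f"At most about {exome_letters_max:,} letters to sequence")
```

In other words, an exome means reading at most around 120 million letters instead of six billion.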

Over the last two years exome sequencing has been applied to hundreds of patients suffering from undiagnosed genetic diseases. The first public success story, reported in Nature in August 2009, showed that exome sequencing in a four-member family could be used to re-discover a previously known mutation causing a disease called Freeman-Sheldon syndrome. Later that year the same group used the technique to pin down a previously unknown mutation that caused the rare developmental disease Miller syndrome. Since then the technique has been responsible for a string of successful discoveries: diseases such as Kabuki syndrome, Fowler syndrome and Schinzel-Giedion syndrome, for instance, have all been pinned down.

In some cases the information revealed by exome sequencing resulted in crucial changes to a patient’s clinical care: in one example, summarised by Luke Jostins at Genomes Unzipped, exome sequencing of a young boy with severe bowel inflammation revealed a defect in an important immune gene, suggesting that the boy’s condition could be treated with a bone marrow transplant. Within six weeks of the operation the patient was able to eat solid food, and five months afterwards the disease had not recurred.

However, such successes have not come without challenges. Identifying the mutations that cause each disease has been complicated by the fact that all of us carry many apparently 'broken genes' that don’t actually cause disease; filtering these out has often required looking for mutations found in multiple patients and not seen in their healthy family members. In addition, exome sequencing is expected to miss a fraction of disease-causing mutations: for instance, all of those that fall outside protein-coding regions, or that are present in regions that aren’t well-captured by current technologies.
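The filtering idea described above (keep genes that are damaged in multiple patients, discard those also damaged in healthy relatives) can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not the pipeline used by any of the groups mentioned here: the function name, the gene labels and the sample data are all invented for the example, and it assumes a single gene is responsible in every patient.

```python
def candidate_genes(patient_genes, healthy_genes):
    """Return genes damaged in every patient and in no healthy relative.

    patient_genes: list of sets, one per patient, each holding the names of
        genes that carry a likely damaging mutation in that patient.
    healthy_genes: list of sets in the same form, one per healthy relative.
    """
    if not patient_genes:
        return set()
    # Genes carrying a likely damaging mutation in every single patient
    shared = set.intersection(*patient_genes)
    # Genes also damaged in a healthy relative are unlikely to be the cause
    seen_in_healthy = set().union(*healthy_genes)
    return shared - seen_in_healthy


# Hypothetical data: three patients and one healthy relative
patients = [
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A", "GENE_C", "GENE_D"},
    {"GENE_A", "GENE_C", "GENE_E"},
]
healthy_relatives = [{"GENE_C", "GENE_F"}]

print(sorted(candidate_genes(patients, healthy_relatives)))  # ['GENE_A']
```

Here GENE_C is shared by all three patients but also turns up in the healthy relative, so only GENE_A survives as a candidate. Real analyses are messier: a strict "every patient" rule breaks down as soon as more than one gene can cause the same disease.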

What fraction of diseases will exome sequencing solve?

The string of success stories in high-profile journals is promising, but hasn’t enabled us to judge for what fraction of diseases the technique simply doesn’t work (in most cases, failures don’t make it into the academic literature). However, at a recent meeting I attended in Hinxton, UK, Dutch geneticist Han Brunner provided some hard numbers based on his group’s analysis of over 200 patient exomes representing 30 different diseases: of these 30 diseases, 15 resulted in the discovery of a novel disease-causing gene, 5 turned out to be caused by mutations in previously discovered genes, and the remaining 10 are yet to give up their secrets.
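To put Brunner's figures in proportion, here is the arithmetic as a short Python snippet. The counts are simply those quoted above; the variable names are mine.

```python
novel_gene = 15   # diseases that yielded a novel disease-causing gene
known_gene = 5    # diseases traced to previously discovered genes
unsolved = 10     # diseases still unexplained

total = novel_gene + known_gene + unsolved
solved_fraction = (novel_gene + known_gene) / total
novel_fraction = novel_gene / total

print(f"{total} diseases: {solved_fraction:.0%} solved, "
      f"{novel_fraction:.0%} via a novel gene")
```

That works out to two thirds of the completed analyses ending in a genetic explanation, with half of all 30 diseases producing a gene not previously linked to disease.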

There are some caveats to bear in mind here. Firstly, Brunner did note at the meeting that these 30 diseases were the ones where he regarded the analysis as "completed" - and diseases where the exome approach has been more difficult are more likely to still be sitting in the "uncompleted" stack. Secondly, it seems likely that the first wave of exome sequencing targets will represent the lower-hanging fruit: diseases where there is a clear phenotype shared between multiple patients, and with larger families available for analysis. As we venture down into the more complex cases, where there are only a few patients available or where the disease definition is messier (making it more likely that several different genes will contain causal mutations) the success rate will inevitably take a hit.

There are some obvious reasons why exome sequencing can fail. In some cases the disease-causing mutation won't be in the sequenced regions, either because it is not protein-coding, because it's in a region that's badly captured by current technologies, or because it's simply been left off the exome capture chip. Brunner provided a cautionary example of the latter situation: their attempts to find the mutation underlying Kabuki syndrome were unsuccessful because - as it turns out - the underlying gene MLL2 wasn't present on their early capture arrays (it was present on other arrays, resulting in a Nature Genetics paper for another group). These cases will shrink as both capture and sequencing technologies improve, and as capture arrays begin to include more genes as well as functional non-protein-coding elements.

Diseases with more complex genetic causes will also remain problematic. If mutations in multiple genes cause the same disease, either more sophisticated statistical approaches (and more patients) will be required to dissect them, or clinicians will need to come up with ways of teasing apart distinct syndromes that show very similar clinical symptoms. For many diseases it will no doubt prove impossible to arrive at a single "smoking gun" gene - but at the very least, the exome approach should provide a set of candidate genes that can be tackled by clinical and functional studies.

So, even with these caveats, Brunner's numbers suggest that applying exome sequencing to as many rare disease patients as possible will uncover the genetic basis for a substantial fraction (hopefully the majority) of them, yielding a rich harvest of new disease genes in the process. That means that this technique will provide a long-awaited answer to many rare disease patients while also improving our understanding of the function of human genes.

As large-scale exome sequencing projects continue to expand around the world, hundreds of rare diseases will be unravelled within the next one to two years. Never before has a single technique promised to reveal so much about genetic disease in such a short space of time. For geneticists, and for rare disease patients, these are exciting times indeed.