*Kai Wang is a postdoctoral fellow at the Center for Applied Genomics, Children's Hospital of Philadelphia, and an author on numerous genome-wide association studies. He left this lengthy comment as a response to my recent post on this comment by McClellan and King in Cell, and I felt it warranted promotion to a full post (with Kai's permission). For more discussion of the M&K review see also two recent posts by Steve Turner at Getting Genetics Done, and an excellent post from p-ter at Gene Expression.*
A similar version of this comment is also published at Getting Genetics Done. I've done some mild editing here for clarity, added some sub-headings and links, and deleted two statements that could be regarded as ad hominem arguments. None of these changes affect the substance of Kai's argument.
Citation: McClellan, J., & King, M. (2010). Genetic Heterogeneity in Human Disease. Cell, 141 (2), 210-217. DOI: 10.1016/j.cell.2010.03.032
Quite a few people have mentioned the McClellan et al. paper to me, along with the related Internet posts about it (including those at Genetic Future). The paper's discussion of at least three diseases (hearing loss, SCA and autism) cites some of my published papers, so I decided to post my comments on the Internet to set the record straight. Although I whole-heartedly agree that rare variants play a substantial role in human diseases, I also think that the section on GWAS reflects misunderstandings of the concept of GWAS, ignorance of standard practices in GWAS, and misinterpretation of published primary research data, and as a result misinforms the general readership of Cell. These issues need to be rectified for the good of the scientific community, and for the healthy development of the methodology and practice of human genetic research. For impatient readers, these are the major points:
- GWAS interrogate disease loci through linkage disequilibrium, so the lack of known biological function for GWAS SNPs does not justify the attack on GWAS by McClellan et al.;
- Methods for adjusting for population stratification are well established in the GWAS community; it is not a valid argument to explain most GWAS signals (with odds ratios less than 2) by stratification, especially when a family-based study design is used (as in the autism GWAS);
- McClellan et al. used rs4307059 (from the autism GWAS) as a "particularly dramatic" example of stratification because its frequency varies across Europe and it is monoallelic in Africa, which is not scientifically or statistically justified. In fact, it is the nature of SNPs to have differing allele frequencies across populations, and almost half of the SNPs on the Illumina array have higher Fst population divergence values than rs4307059 (that is, almost half the SNPs are more variable than rs4307059 across human populations).
Below I elaborate on these points for interested readers.
1. Lack of known biological function doesn't invalidate GWAS
McClellan et al. use the fact that most SNPs detected in GWAS lie in intergenic regions to question the utility and reliability of GWAS, and raise a serious question: "How did genome-wide association studies come to be populated by risk variants with no known function?"
In fact, GWAS do not attempt to identify functional SNPs, but rather to identify the approximate location of loci that harbor disease variants. This is possible because of the extensive linkage disequilibrium (LD) between segregating sites in a given human population. Most SNPs on SNP arrays have unknown biological function, simply because most SNPs in HapMap are outside coding regions and because manufacturers of SNP arrays usually do not select SNPs by known function. Unfortunately, this fact may not be well known outside the GWAS community, including among most readers of the journal Cell. McClellan and King did mention LD, but they did not recognize that GWAS do not attempt to interrogate causal variants in the first place. More interestingly, they discussed the SCA GWAS and hearing loss GWAS that I published; the hits in both GWAS are actually outside but close to the causal genes (HBB and GJB2), yet they tag exonic variants in the causal genes, making them two particularly vivid and classic examples of how GWAS work through LD. It is unclear how McClellan and King can discuss these two examples extensively while ignoring the basic fact that both non-coding hits faithfully tag the causal variants in the causal genes through the magic of LD. For readers not familiar with GWAS, I also need to emphasize that GWAS variants are typically referred to as "risk variants" only by convention in the published literature, not because they are the actual functional variants that confer risk. Contrary to what some readers may conclude from McClellan and King, the fact that 100% of Africans carry a risk allele does not suggest that all subjects of African descent are predisposed to risk; it merely suggests that LD patterns at the locus differ between European and African populations. One cannot interpret GWAS results without acknowledging these basic facts.
2. Population stratification is not a plausible explanation for most GWAS hits
McClellan and King erroneously attributed many published GWAS hits to population stratification, as if GWAS used the same strategies as candidate gene association studies. Without any scientific support, they even claimed that "an odds ratio of 3.0, or even of 2.0 depending on population allele frequencies" would be robust enough to be interrogated in GWAS. In fact, the beauty of whole-genome SNP data is that inflation of test statistics due to population substructure can be identified and adjusted for. Populations do not differ at one or two SNPs; they differ at many loci, which is why whole-genome data help identify stratification, and several recent studies have already shown how extremely fine-scale sub-populations in Europe can be separated using whole-genome data. The GWAS community has established methods to deal with population stratification, and it is uncontroversial in the field that these methods are fairly effective for common variants. There are certainly some challenges in analyzing rare variants or recently admixed populations, and these are research topics that we are actively studying. McClellan and King failed to inform readers of the standard practices of genomic control, EigenStrat, multi-dimensional scaling and the many dozens of other approaches for addressing stratification that are now commonly used in case/control GWAS. Furthermore, a family-based study design in GWAS has the advantage of protecting against stratification, which should be emphasized to readers. For example, McClellan and King attack our autism paper as a false positive due to population stratification, but our paper is largely driven and replicated by family-based cohorts, not case/control cohorts. Therefore, their general claim lacks scientific support, ignores massive amounts of work by the statistical genetics community in developing stratification adjustment methods, and reflects unrealistic speculation and unfamiliarity with standard GWAS practices.
3. The provided example of a false positive hit is exaggerated
McClellan and King mistakenly treat GWAS hits as "false positives" if their allele frequencies vary across European populations or HapMap populations. Allele frequency variation across populations for ANY (I mean it, ANY!) SNP should not surprise researchers with substantial GWAS knowledge. It is the very nature of ANY SNP to have variable allele frequencies across human populations, so that Asians, Caucasians and Africans differ from each other. It appears that McClellan and King are surprised because they believe that most SNPs should have similar allele frequencies in all populations. Specifically, they described the SNP rs4307059, reported by us to be associated with autism, as a "particularly dramatic example of the perils of cryptic population stratification". Their reasoning on "stratification" is that the frequency of the proposed risk variant varies from 0.21 to 0.77 across European populations and that it is monomorphic in African populations. In reality, the allele frequency of rs4307059 is fairly consistent among large cohorts of European Americans (MAF=39%), WTCCC (MAF=38%), POPRES British (MAF=39%) and POPRES Spanish (MAF=37%). In the HGDP data, I did confirm that the allele frequency differs in Tuscany (MAF=75% in 7 samples, yes you read it right, SEVEN) and in Orcadians (MAF=25% in 15 samples), but readers should be aware that the precision of a frequency estimate depends on the sample size (seriously, mathematically, what would you expect from 7 or 15 samples, and how much do these two populations contribute to genes in European Americans?).
[Update: Kai adds: "I realized that the Toscani population is actually part of HapMap3, so the allele frequency can be inferred from there (n=102, still small but good enough). I assumed that "Toscani in Italia" in HapMap is similar to "Tuscan Italy" in HGDP. The MAF (C allele) is indeed 41% in HapMap sample (202 chromosomes, HapMap 3 release 3) (warning: huge file), which is fairly similar to European Americans and not even remotely close to the 77% number inferred from n=7 by McClellan et al."]
Furthermore, even assuming that the allele frequency measures are accurate, doing science rigorously requires appropriate control experiments, so let us compare this SNP with others in the same genomic region: there is no evidence of increased population differentiation for this particular SNP in the 2Mb genomic region across human populations (chr5:25500000..26499999 in the HGDP browser). Finally, if we examine the SNP in the context of the whole genome, based on the HGDP browser, we can see that 44% of SNPs (-log(0.44)/log(10)=0.35 for rs4307059 in the "Fst" track, raw data) on the Illumina array have more extreme Fst values than this SNP, so almost half of the SNPs show stronger population divergence than this SNP. One cannot just take a random SNP from the MIDDLE of a ranked list and claim it as a "particularly striking" example of population stratification. Any such claim needs to be made in the context of a comparative analysis with other SNPs; otherwise it is not scientifically rigorous practice and serves only to misinform readers outside the field.
[DM: for a graphic illustration of this point, see this post from Steven Turner.]
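The sample-size point in this section can be made concrete with a short sketch. The code below is illustrative only and is not part of Kai's analysis: it assumes that each chromosome is an independent draw (two per diploid sample), takes the frequencies quoted above at face value, and uses the Wilson score interval, which is one common choice for a proportion estimated from few observations. The function name `wilson_interval` and the cohort labels are hypothetical.

```python
import math

def wilson_interval(freq, n_chromosomes, z=1.96):
    """Approximate 95% Wilson score interval for an allele frequency.

    Assumes chromosomes are independent draws; z=1.96 gives ~95% coverage.
    """
    n = n_chromosomes
    denom = 1 + z * z / n
    centre = (freq + z * z / (2 * n)) / denom
    half = z * math.sqrt(freq * (1 - freq) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# (label, frequency quoted in the post, chromosomes = 2 x diploid samples)
cohorts = [
    ("HGDP Tuscan, 7 samples", 0.75, 14),
    ("HGDP Orcadian, 15 samples", 0.25, 30),
    ("HapMap 3 Toscani, 202 chromosomes", 0.41, 202),
]

for label, freq, n_chrom in cohorts:
    low, high = wilson_interval(freq, n_chrom)
    print(f"{label}: {freq:.2f} "
          f"(approx. 95% CI {low:.2f} to {high:.2f}, width {high - low:.2f})")
```

Under these assumptions the interval from 14 chromosomes is roughly three times wider than the one from 202 chromosomes, which is the sense in which a frequency estimated from seven samples says little on its own about the underlying population.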
4. Misinterpretation of the autism GWAS
McClellan and King's interpretation of the autism locus is wrong. They used it as an example of a "false positive" without any valid scientific evidence (differences in allele frequencies between Tuscans and Africans do NOT suggest a false positive in European Americans!). Another study (Weiss et al.) cited by McClellan and King was not able to garner evidence for this SNP, but that study has a very small non-overlapping sample and therefore little power to "replicate" loci with moderate effect sizes. Furthermore, Weiss et al. used a family-based association test (the TDT), so there is no comparison of case/control allele frequencies as mentioned by McClellan and King. Given the power and sample comparability issues, Weiss and Arking (both are nice people whom I know) faithfully described their research results in the paper without comment, yet McClellan and King misinterpret these primary results without scientific support and attach a "false positive" label that completely misleads the scientific community. On the other hand, McClellan and King failed to mention another companion study that identified this same locus purely in family-based cohorts. In addition, a paper in press shows that the SNP also functions as a quantitative trait locus for autistic traits in ~8000 children born in the same year in a single UK city, which pretty much blows away any concern about stratification in case/control studies. For me, this is compelling evidence that population stratification does not explain the signal, though I think that functional studies are certainly necessary to identify the causal variants and to study their roles. In summary, their criticism of the autism locus lacks any rigorous scientific support whatsoever.
5. Misinterpretation of hearing loss and sickle cell anemia GWAS
McClellan and King misinterpreted the hearing loss GWAS and sickle-cell anemia GWAS that we published in PLoS Biology. Interestingly, their interpretation of the primary research data presented in our paper is somewhat the opposite of ours: our original purpose was to demonstrate how rare variants may contribute to human diseases (and may show up in GWAS through LD with common SNPs on Illumina arrays), so our paper should really be read as supporting the arguments for studying rare variants made in their paper. For readers, I need to clarify that sickle-cell anemia is a classic example of heterozygote advantage in any genetics textbook, and our study demonstrates how rare alleles under balancing selection can show up in GWAS. On the other hand, hearing loss is known to be caused by many genes, but the major cause is GJB2 mutation, so the GWAS demonstrates that moderately rare alleles (MAF=1.2%) can be picked up by GWAS without balancing selection. I simply do not understand what they are trying to get at with "had inherited hearing loss been investigated in a region where it is more common (e.g., in the Middle East),", as any GWAS should focus on a specific ethnic group; I cannot just combine Caucasians with Middle Eastern populations, as this would of course dilute the signal in GWAS. Why would I even bother to apply GWAS "in heterogeneous populations of common diseases" at all, as suggested by McClellan and King, when the very power of GWAS comes from the examination of LD? I do not understand how they can take exactly the same results and arrive at a drastically different interpretation of the data.
Conclusions
I will send a shortened version of my comments to Cell. I cannot predict the outcome of this appeal, but I would appreciate comments from readers of this post and I will try to address them.
I wonder what the appropriate balance is between academic freedom and scientific responsibility when researchers comment on subjects outside their expertise in the absence of rigorous scientific support; I also wonder what the appropriate standard of basic fact checking is for journals publishing especially strong claims, even in non-research articles (essays/commentary/reviews), and what the appropriate response from well-respected journals is in recognizing and rectifying these mistakes. Let us wait and see.