Circulation: Cardiovascular Genetics. 2009;2:7-15
Published online before print January 23, 2009, doi: 10.1161/CIRCGENETICS.108.833392

Original Articles

Prediction of Cardiovascular Disease Outcomes and Established Cardiovascular Risk Factors by Genome-Wide Association Markers

John P.A. Ioannidis, MD

From the Department of Hygiene and Epidemiology, University of Ioannina School of Medicine; Biomedical Research Institute, FORTH, Ioannina, Greece; and Department of Medicine, Center for Genetic Epidemiology and Modeling, Tufts Medical Center, Tufts University School of Medicine, Boston, Mass.

Correspondence to John P.A. Ioannidis, MD, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece. E-mail jioannid{at}cc.uoi.gr

Received November 3, 2008; accepted December 4, 2008.


    Abstract
 
Background: Genome-wide association (GWA) platforms have yielded a rapidly increasing number of new genetic markers. The ability of these markers to improve prediction of clinically important outcomes is debated.

Methods and Results: A systematic review was performed of GWA-derived markers associated with cardiovascular outcomes or with other phenotypes that represent common established risk factors for cardiovascular outcomes. Sources of information included the National Human Genome Research Institute catalog of published GWA studies, the eligible GWA articles, meta-analyses on the respective associations, and articles on the incremental predictive performance of common variants in the GWA era. A total of 95 eligible associations were retrieved from the National Human Genome Research Institute catalog of published GWA studies as of September 2008. Of those, 36 have statistical support of P<10⁻⁷. In-depth evaluation of the respective articles shows 28 independent associations with such statistical support, pertaining to coronary artery disease, myocardial infarction, atrial fibrillation/flutter, prolongation of QT interval, as well as type 2 diabetes, body mass index, high-density lipoprotein levels, low-density lipoprotein levels, and nicotine dependence. Between-study heterogeneity is usually not taken into account, but it seems common and would pose a challenge to the generalizability of these markers across different populations. Data are still limited in non-white populations. Effect sizes are small and may be even smaller in subsequent replications and meta-analyses. Population attributable fractions are substantial, given the high frequency of the risk alleles. However, individualized risk measures are typically very small (proportion of variance explained <1% per marker). When used in conjunction with traditional predictors, improvement in overall prediction (eg, area under the curve) or risk reclassification is limited and subject to methodological caveats.

Conclusions: Despite very promising signals in terms of statistical significance, evidence for improvement in cardiovascular prediction by currently available markers derived from GWA studies is sparse. Clinical use of such markers currently would be premature.

Key Words: genome-wide association studies • prediction • cardiovascular • risk factors


    Introduction
 
In the past 2 years, the discovery of genetic associations for common diseases and complex traits has been accelerated by the advent of high-throughput genotyping platforms and by substantial improvements in the available sample size and quality control measures in genetic epidemiology studies.1 Currently, there is a rapidly increasing number of association signals that attain very strong statistical support and get replicated in additional datasets.

A major challenge is whether these new genetic markers can improve outcome prediction. Many companies are already marketing genome testing for this purpose. However, despite the enthusiasm, the rationale and public health implications of such applications remain debatable. I examine the status of the literature on the predictive potential for common genetic variants that have emerged from genome-wide association (GWA) studies as of September 2008. The review focuses on genetic markers for cardiovascular outcomes and also on genetic markers for established risk factors of cardiovascular outcomes.

Review Methods
I consider here all genetic markers shown to be associated with any cardiovascular outcome, or with common established independent risk factors of cardiovascular disease, including type 2 diabetes mellitus, body mass index, smoking, hypercholesterolemia, and hypertension. I excluded risk factors that are contentious on the magnitude and/or independence of their effects (eg, C-reactive protein, triglycerides, uric acid).

The search for studies and eligible markers was based on the online catalog of GWA studies hosted by the National Human Genome Research Institute (last searched September 20, 2008). Details on the National Human Genome Research Institute catalog appear elsewhere.2 The catalog is regularly updated to include all published studies that have performed genome-wide evaluations for human disease phenotypes and traits, and it lists associations with P≤10⁻⁵. Information was used here on the study, chromosomal region, potentially implicated gene(s), single nucleotide polymorphism with the strongest statistical support, the frequency of the risk allele, and the respective effect size and 95% confidence interval.

The current review focuses primarily on those associations that have reached P<10⁻⁷ as analyzed by the primary authors. Genome-wide significance also depends on the studied populations and their LD structure as well as on the available sample size,3 but the 10⁻⁷ threshold is used for convenience. It is a lenient threshold if one also accounts for the multiplicity of analyses involving different phenotypes in such studies.4 Several associations with less strong support may also have been replicated in subsequent, additional GWA and other studies, but I wanted to focus on a sample of associations with the strongest support upfront.

The articles on the associations reaching genome-wide significance were further examined to see whether any of the associations in the National Human Genome Research Institute catalog were duplicates (pertaining to the same locus and outcome) and whether additional eligible associations with P<10⁻⁷ had not been catalogued. For each of the eventually eligible associations with genome-wide significance, I recorded information on the presence and extent of between-study heterogeneity in the genetic effects, whenever this had been assessed. I also searched for meta-analyses and large-scale subsequent collaborative replication studies relevant to these associations. Finally, I searched for articles that have used common genetic variants, including markers derived from GWA platforms, to examine the improvement of predictive ability for the eligible cardiovascular outcomes and related established risk factors. The search was based on screening the citations of the original GWA investigations (Thomson Web of Science).

Statement of Responsibility
The author had full access to and takes full responsibility for the integrity of the data.

Eligible GWA Associations
Of 454 entries in the catalog, 95 pertained to outcomes and traits that would be eligible for this review. Of the 95, 36 (38%) pertained to associations with P<10⁻⁷. Some of the 36 associations had duplicate entries because the same markers or markers with very high LD had been found in 2 or more investigations. Excluding 10 such duplicates, 26 independent associations with genome-wide significance remained, and I identified 2 more with eligible genome-wide significance after perusing the respective full articles.5–18 The 28 independent associations are shown in Table 1. These associations pertained to cardiovascular outcomes, including coronary artery disease or myocardial infarction (n=4), QT interval duration (n=1), and atrial fibrillation/flutter (n=2); and to established risk factors, including type 2 diabetes (n=11), high-density lipoprotein (HDL) (n=2), low-density lipoprotein (LDL) (n=5), body mass index (n=2), and nicotine dependence (n=1).
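The selection flow above can be checked with simple arithmetic. The following is only a bookkeeping sketch; the variable names are illustrative and the numbers are those reported in the text.

```python
# Counts as reported in the text; variable names are illustrative only.
catalog_entries = 454
eligible = 95
genome_wide = 36            # eligible catalog entries with P < 1e-7
duplicates = 10             # same or highly linked markers found in >1 study
added_from_full_text = 2    # identified after reading the full articles

independent = genome_wide - duplicates + added_from_full_text

by_phenotype = {
    "coronary artery disease or myocardial infarction": 4,
    "QT interval duration": 1,
    "atrial fibrillation/flutter": 2,
    "type 2 diabetes": 11,
    "HDL": 2,
    "LDL": 5,
    "body mass index": 2,
    "nicotine dependence": 1,
}

# Percentage of eligible entries reaching the P < 1e-7 threshold
share_genome_wide = round(100 * genome_wide / eligible)

print(independent, sum(by_phenotype.values()), share_genome_wide)
```

The per-phenotype counts add up to the same number of independent associations that the selection steps yield.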


 
Table 1. Eligible Associations Reaching "Genome-Wide Significance" Levels for Cardiovascular Disease and Related Established Risk Factors
 
The other 59 associations that had less strong statistical support included also additional phenotypes, such as blood pressure/hypertension, coronary spasm, electrocardiographic traits, heart rate variability, coronary artery calcification, subclinical atherosclerosis, echocardiographic traits, exercise treadmill testing, endothelial function tests, and heart failure.

In some cases, associations with the same or highly linked polymorphisms had been identified for different correlated phenotypes. For example, FTO was first found to be associated with type 2 diabetes, but the association was inconsistent, whereas the association with body mass index and obesity was consistent across diverse populations. Another polymorphism (rs599839) was identified to be associated with LDL levels and a subsequent analysis of 2 combined GWA studies documented an association with coronary artery disease, which may be in part or in whole a reflection of the LDL effect.

Between-Study Heterogeneity
An important feature for a marker or set of markers that are considered for predictive purposes is to have consistent effects across diverse populations. If the genetic effects are not consistent, then the applicability of the markers may be limited to those populations where the genetic effect is clearly seen. Between-study heterogeneity can be estimated empirically from the available data when the association between a marker and a disease has been measured in several populations. A measure of heterogeneity of the study-specific effects is the I² metric ("inconsistency"); it takes values from 0% to 100% and it shows the extent of heterogeneity beyond chance.19 Traditional tests for assessing whether the null hypothesis of homogeneity can be rejected (Q or Breslow-Day tests) are usually grossly underpowered to detect heterogeneity with limited numbers of independent populations.
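As a minimal sketch of how I² is commonly computed, assuming the usual definition I² = max(0, (Q − df)/Q) with Cochran's Q based on inverse-variance weights: the function name and the per-population odds ratios and standard errors below are hypothetical and are not taken from any of the reviewed studies.

```python
import math

def cochran_q_and_i2(log_effects, std_errors):
    """Return (Q, I2 in percent) for study-specific log effect sizes."""
    weights = [1.0 / se ** 2 for se in std_errors]           # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I2 is the share of total variation beyond what chance alone would produce
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Hypothetical per-population odds ratios with equal standard errors
log_ors = [math.log(x) for x in (1.10, 1.25, 1.05, 1.30)]
ses = [0.05, 0.05, 0.05, 0.05]
q, i2 = cochran_q_and_i2(log_ors, ses)
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

With only 4 populations, Q has 3 degrees of freedom, which illustrates why the point estimate of I² carries large uncertainty when few populations are available.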

The main results for all the associations reported in Table 1 did not take potential heterogeneity into account; ie, when several populations were tested for replication, the results were combined with simple pooling, simple stratification, or an equivalent fixed effects model (eg, Mantel-Haenszel or inverse variance fixed effects). Estimates of I² were provided for only 8 of the 25 associations where several populations were assessed in the replication phase (Table 2). In 5 of those, the estimated I² was >0%, and in 2 the heterogeneity even reached formal statistical significance. Of the 6 associations where the investigators had evaluated between-study heterogeneity and had also examined random effects models in secondary analyses, 4 no longer had P<10⁻⁷. These were the associations of rs7578597, rs4607103, rs7961581, and rs10923931 with type 2 diabetes. The same has been observed previously in a random effects reanalysis of other GWA-derived associations for type 2 diabetes.20 Of note, 13 of the 28 associations have P>10⁻¹⁰ even when between-study heterogeneity is ignored. Moderate between-study heterogeneity would make these associations lose formal "genome-wide significance."


 
Table 2. Investigation of Heterogeneity in the Associations With Genome-Wide Significance (as per Table 1)
 
There is usually large uncertainty in the amount of between-study heterogeneity, ie, the 95% CIs of I² are wide.19 This further highlights the uncertainty about the anticipated predictive performance of a marker in different populations and settings. For sole discovery purposes, ignoring between-study heterogeneity in the calculations minimizes false negatives. However, for a marker to be used for predictive purposes, one has to demonstrate that it performs well across different populations. This can be taken into account only with models that account for between-study heterogeneity. Even better, one has to factor in the uncertainty in between-study heterogeneity, as can be done in a fully Bayesian meta-analysis, to derive a predicted interval for what are likely to be the underlying true effects in diverse populations.21 With these considerations, some 95% predicted intervals of the genetic effects of these associations may fall on both sides of the null, ie, in a non-negligible proportion of populations the genetic effect may be null or in the opposite direction from what has been observed.
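To make the idea of a predicted interval concrete, the sketch below uses a simpler stand-in for the fully Bayesian approach: a DerSimonian-Laird random effects estimate with a normal-approximation prediction interval. This is not the method of the cited work, it ignores the uncertainty in the heterogeneity estimate, and the input odds ratios and standard errors are hypothetical.

```python
import math
from statistics import NormalDist

def random_effects_summary(log_effects, std_errors, level=0.95):
    """DerSimonian-Laird pooling with an approximate prediction interval."""
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                 # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * y for wi, y in zip(w_re, log_effects)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))
    # Normal quantile used as an approximation; a t quantile would be wider
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half = z * math.sqrt(tau2 + se_mu ** 2)
    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(mu),
        "tau2": tau2,
        "prediction_interval_or": (math.exp(mu - half), math.exp(mu + half)),
    }

# Hypothetical per-population odds ratios with equal standard errors
summary = random_effects_summary(
    [math.log(x) for x in (1.10, 1.25, 1.05, 1.30)], [0.05] * 4
)
low, high = summary["prediction_interval_or"]
print(f"random effects OR = {summary['random_or']:.2f}, "
      f"prediction interval = ({low:.2f}, {high:.2f})")
```

In this hypothetical example the pooled odds ratio is clearly above 1, yet the interval for the effect in a new population extends below 1, which is the situation described above.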

Most of the initially reported data on GWA signals are derived from white populations that share largely similar LD structures. Only 1 of the 28 associations was found to be significant in populations of both white and Asian descent in the original GWA publication (Table 2). GWA-derived signals are almost always indirect markers in LD with the real culprits, because it is unlikely that the latter are directly hit. Therefore, genetic effects for the tagging markers may often differ in populations with different LD structure. Additional replication studies have started appearing for several of these variants in populations of different ancestry.22–25 Although most tend to replicate the proposed associations, differences in the effect sizes and frequency of the implicated alleles may not be uncommon. Such replication studies should be conducted as rigorously as the original GWA studies. Given the very strong probability values of the original findings, researchers, editors, and reviewers may be intimidated against reporting contradictory results. In the GWA era, such conformity26 would be disastrous for the credibility of replication efforts.

Effect Sizes
The effect sizes for the identified genome-wide markers are modest or small (Table 1), even smaller than those discovered in the pre-GWA era. Type 2 diabetes is a typical example. Documented genetic effects in the pre-GWA era represented per allele odds ratios (ORs) of 1.2 to 1.4. Genetic effects that emerged from the combination of data from the first 3 GWA studies15 represent ORs of 1.12 to 1.2. Finally, genetic effects that emerged from the replication of suggestive signals in meta-analysis of several GWA investigations7 represent ORs of 1.05 to 1.15. For lipid levels, identified polymorphisms have effects in the range of 0.05 to 0.16 standard deviations; for nicotine dependence the effect corresponds to 1 cigarette per day; and for body weight the effect corresponds to ≈0.5 kg.

GWA-discovered effects are even likely to be inflated because of the winner's curse in underpowered studies.27,28 It is possible that even sizeable studies and consortia are still underpowered for what are apparently subtle effects. The magnitude of the inflation can be appreciated only if we perform many additional replication studies. This has been done for the rs1333049 polymorphism in 9p21.3 that has one of the strongest discovered genetic effects in Table 1 (OR 1.47 in the original GWA publication). A subsequent meta-analysis29 in 12 004 cases and 28 949 controls resulted in an OR of 1.24 with 95% CI of 1.20 to 1.29. Thus, the excess risk (OR−1) is likely to be only about half of what was proposed originally. The only other ORs exceeding 1.33 in Table 1 are those of the 2 polymorphisms associated with the risk of atrial fibrillation/flutter. Only one of them was also found to be associated in populations of Asian descent (with a smaller OR of 1.42, compared with 1.72 in Europeans). Surprisingly, no additional replication studies on these 2 polymorphisms have been published in the first 15 months after the original discovery published in Nature.
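The winner's curse can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any study: the true odds ratio, the standard error of the log odds ratio, and the number of simulated studies are all hypothetical, and each study's estimate is drawn from a normal distribution around the true effect.

```python
import math
import random
from statistics import NormalDist, mean

def simulate_winners_curse(true_or=1.10, se=0.04, alpha=1e-7,
                           n_sims=100_000, seed=0):
    """Average the estimates of simulated studies that pass the threshold."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided threshold
    true_beta = math.log(true_or)
    selected = []
    for _ in range(n_sims):
        beta_hat = rng.gauss(true_beta, se)        # one underpowered study
        if abs(beta_hat / se) > z_crit:            # "discovered" only if P < alpha
            selected.append(beta_hat)
    return {
        "power": len(selected) / n_sims,
        "n_selected": len(selected),
        "mean_selected_or": math.exp(mean(selected)) if selected else None,
    }

result = simulate_winners_curse()
print(result)
```

Because only estimates that happen to exceed the significance threshold are counted as discoveries, their average is necessarily larger than the true effect whenever power is low.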

Small effect sizes have important corollaries. First, one may infer that for diseases with a large heritability, a very large number of such markers may exist, each with small or very small effect size. This inference presumes that heritability estimates are correct and that common variants are indeed implicated in explaining a substantial part of the genetic component of these diseases. Second, one would need to conduct extremely large studies and meta-analyses thereof, beyond the current research capacity of existing epidemiological studies, to unearth subtle effects, let alone reach accurate effect estimates. With between-study heterogeneity, the power is further eroded, and above a given threshold of heterogeneity, it even becomes impossible to reach adequate power.30 Some potentially genuine markers may thus remain irreproducible.

Population Attributable Fractions and Measures of Individualized Prediction
The population attributable fraction (PAF) has been a popular metric in the epidemiological literature for over 50 years. For a nonconfounded, unadjusted single risk factor with relative risk (RR), PAF is given by:

$$\mathrm{PAF} = \frac{PR\,(RR-1)}{1+PR\,(RR-1)}$$

where PR is the prevalence of the risk factor. For a case-control study, the OR approximates the population RR.

There are many misunderstandings and misapplications of PAF.31 PAF represents the proportion of the disease/phenotype of interest that would have been avoided if the risk factor could be eliminated from the population. PAF is thus attractive to report and it is even highlighted in the abstracts of articles that want to claim that they have discovered an important risk factor. The abstract of Helgadottir et al14 in Science thus claims that the discovered variant at chromosome 9 that is associated with coronary artery disease has a PAF of 21% and that it is even 31% for early-onset cases. The PAF is largely dependent on the frequency of the risk variant. Even with small ORs, common variants will have large PAF estimates. PAF is even higher in populations with higher risk allele frequencies. For example, in the Nature article presenting 2 variants associated with atrial fibrillation/flutter,12 their joint PAF is estimated to be 21% in Europeans and 35% in the Chinese. In fact, in the Chinese, only 1 of the 2 variants is replicated, and the other one has a much smaller OR than in European populations, but the risk allele frequency is very high (61%), 3 times higher than in Europeans.
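The dependence of PAF on the frequency of the risk factor can be seen in a short calculation. The sketch assumes the standard single-factor expression PAF = PR(RR−1)/[1+PR(RR−1)]; the function name and the example prevalences are hypothetical.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """PAF for a single, unconfounded risk factor."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Same modest relative risk, increasingly common risk factor (hypothetical values)
for prevalence in (0.1, 0.5, 0.9):
    paf = population_attributable_fraction(prevalence, relative_risk=1.2)
    print(f"prevalence = {prevalence:.0%}: PAF = {paf:.1%}")
```

A relative risk of 1.2 yields a PAF several times larger when the risk factor is carried by most of the population than when it is rare, although the risk conferred to any single carrier is unchanged.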

First, a common error is to simply add the PAF values across many markers. The PAF of 2 or several markers is not the sum of the PAF of each. Second, even if the correctly calculated overall PAF is high, this does not guarantee that we can tell who will get the disease and who will not. If the PAF is 100%, it simply means that eliminating the people who carry any risk marker would eliminate the disease. However, 90% of the population may carry at least 1 risk marker. The disease may occur in only 2% of the population (all of them carriers), whereas the other 88% are categorized as carrying a risk marker but never develop disease.
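A brief numerical sketch shows why PAF values cannot be added. It assumes, as a simplification that is not stated in the text, that the markers are independent and act multiplicatively, in which case the combined PAF is 1 minus the product of (1 − PAF) across markers. The individual PAF values are hypothetical.

```python
import math

def combined_paf(pafs):
    """Combined PAF assuming independent, multiplicatively acting factors."""
    return 1.0 - math.prod(1.0 - p for p in pafs)

individual_pafs = [0.21] * 5            # five hypothetical markers, 21% each
naive_sum = sum(individual_pafs)        # exceeds 100%, which is impossible
combined = combined_paf(individual_pafs)
print(f"naive sum = {naive_sum:.0%}, combined = {combined:.0%}")
```

The naive sum exceeds 100%, whereas the combined value stays below 100% however many markers are included.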

In contrast to PAF, the additional penetrance due to a variant may be a more direct measure of the magnitude of the risk. For a dominant model, it can be shown that this is approximated by the formula f=y(OR−1)/[1+y(OR−1)], where y is the proportion of people who get the disease in the absence of the genetic variant of interest.32 Typically, we can assume that for small effects, y is approximately equal to the prevalence of the disease. Therefore, for a marker with OR=1.15 for a disease with prevalence of 1%, we get f=0.15% only. For a disease with prevalence of 5%, an OR=1.15 corresponds to f=0.75%, which is still quite small. In general, for the GWA signals identified to date, the f estimates are very small.
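The two worked values above can be reproduced directly from the formula f=y(OR−1)/[1+y(OR−1)]. The function name is illustrative, and y is set equal to the disease prevalence as in the approximation described in the text.

```python
def additional_penetrance(baseline_risk, odds_ratio):
    """Approximate additional penetrance f for a dominant model."""
    excess = baseline_risk * (odds_ratio - 1.0)
    return excess / (1.0 + excess)

f_prev_1 = additional_penetrance(0.01, 1.15)   # disease prevalence 1%
f_prev_5 = additional_penetrance(0.05, 1.15)   # disease prevalence 5%
print(f"f = {f_prev_1:.2%} and f = {f_prev_5:.2%}")
```

Both values agree with the figures quoted in the text to within rounding.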

Similar inferences are obtained when one calculates the proportion of variance explained: a very small portion of the between-individual variability of risk is explained by the identified genetic variants. Typically, each one of them explains <1%; for prediction of coronary artery disease, type 2 diabetes, or lipid levels, we can currently explain anywhere between <1% and 5% of the variance.
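For a quantitative trait, a rough sense of these magnitudes can be obtained from the standard expression 2p(1−p)β² for the variance explained by a single variant. This is a sketch that assumes an additive model, Hardy-Weinberg equilibrium, and a per-allele effect β expressed in standard deviation units; the allele frequency used is hypothetical, and the effect sizes are the ends of the range quoted earlier for lipid levels.

```python
def variance_explained(allele_frequency, beta_sd):
    """Fraction of trait variance explained by one additive variant."""
    return 2.0 * allele_frequency * (1.0 - allele_frequency) * beta_sd ** 2

# Hypothetical allele frequency of 0.30; effects of 0.05 and 0.16 SD per allele
small = variance_explained(0.30, 0.05)
large = variance_explained(0.30, 0.16)
print(f"{small:.2%} to {large:.2%} of the variance")
```

Even the largest per-allele effect in that range explains only about 1% of the trait variance at this allele frequency.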

A useful way to show the small contribution of these markers to individualized prediction is through the area under the curve. Candidate predictors that have no predictive ability have area under the curve (AUC)=0.50, whereas perfect prediction has AUC=1.00. In type 2 diabetes, where we have already 18 genetic variants with strong statistical support as of September 2008, cumulatively all these variants can achieve an AUC=0.60, whereas simple knowledge of age, gender, and body mass index can achieve AUC=0.78,33 even without counting family history.
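The AUC can be read as the probability that a randomly chosen case has a higher predicted score than a randomly chosen control, with ties counted as one half. The sketch below computes it that way for a toy example in which the score is simply a count of risk alleles; the data are invented for illustration.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5          # ties contribute one half
    return wins / (len(case_scores) * len(control_scores))

# Invented risk allele counts for 4 cases and 4 controls
toy_auc = auc([2, 1, 1, 0], [1, 1, 0, 0])
print(toy_auc)
```

A score that does not separate cases from controls at all gives 0.5, and a score that ranks every case above every control gives 1.0.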

Another way to look at this is the proportion of the sibling relative risk explained: whereas the sibling relative risk is ≈3.0 for type 2 diabetes, the 18 identified variants account only for a sibling relative risk of ≈1.07. Overall, these estimates suggest that we capture only the tip of the iceberg in explaining genetic risk for multifactorial diseases.

Incremental Predictive Ability
New candidate predictors would be useful only if they can be shown to offer incremental information over other predictors that are already available and easily measured. For cardiovascular disease, the literature on predictive models is very rich. Several widely validated scores are already available, such as the Framingham Risk Score for coronary artery disease.34 Depending on the population tested, these predictive scores usually already achieve an AUC around 0.75. Prediction of other common diseases such as type 2 diabetes based on traditional risk factors also has AUC values in the same range. The question is how much better we can do with genetic markers emerging from GWA investigations.

In the aforementioned evaluation of the 18 type 2 diabetes variants, the AUC improved from 0.78 to 0.80 with the consideration of the 18 markers on top of age, gender, and body mass index.33 The improvement is highly statistically significant (P=3×10⁻¹²), but very modest in absolute terms. The same probably applies to the performance of 3 identified CAD loci: when Samani et al11 added them to the Framingham Risk Score and the PROCAM study score for prediction of myocardial infarction, the model fit improved statistically significantly (P<10⁻¹⁰), but the (unstated) AUC improvement is likely to be very small.

The proportion of the variance explained may be somewhat larger for lipid levels, but it is still limited. In the cardiovascular cohort of the Malmö Diet and Cancer Study, after accounting for age, age², gender, and diabetes status, Kathiresan et al9 found that, in sum, 7 SNPs explained an additional 5.7% of the residual LDL cholesterol variance, 7 SNPs explained an additional 5.2% of the residual HDL cholesterol variance, and 9 SNPs explained an additional 4.5% of the residual triglyceride level variance. This included both polymorphisms known from the candidate gene era and GWA-discovered variants. The latter group accounts for ≈2% of the residual variance for these lipid phenotypes.

In another study,35 Kathiresan et al have shown that a genotype score based on the number of unfavorable alleles for 9 lipid-associated polymorphisms (mostly identified in the pre-GWA era) had an independent predictive effect on the risk of myocardial infarction, ischemic stroke, or cardiovascular death. Kathiresan et al used both SNPs associated with HDL and others associated with LDL, whereas Willer et al10 had also found that LDL-associated SNPs were associated with coronary risk, but this was not true for HDL-associated SNPs (unadjusted analyses). In the Kathiresan et al study, after adjusting for age, gender, family history of myocardial infarction, LDL and HDL cholesterol levels, log triglyceride levels, systolic and diastolic blood pressures, body mass index, diabetes, smoking, log C-reactive protein, lipid-lowering therapy, and antihypertensive therapy, the genotype score conferred a relative risk of 1.15 (95% CI, 1.07 to 1.24; P<0.001) per copy of unfavorable allele.

This interesting finding should be viewed with caution. First, the model used is not a standard risk model, such as the Framingham Risk Score, and it pertains to a population with mixed characteristics, including extensive use of treatment. In treated populations, it is difficult to model confounding by indication, and the coefficients of other correlated variables may also be affected. Second, the performance of the traditional risk factors in this model, as reflected in an overall AUC, does not improve with the addition of the genotype score (0.80 with and without the genotype score). Third, the poor performance of some of the traditional risk factors in this population is noteworthy; eg, in the multivariate model, LDL, diastolic blood pressure, body mass index, and C-reactive protein do not reach nominal statistical significance and their independent effects are smaller than what is usually presumed. Fourth, the model probably suffers from considerable collinearity, given the high correlation of many of these factors; thus, estimates for individual risk factor effects are tenuous.

Perhaps most importantly, genetic markers are inherently expected to be proximal to other risk factors in the causation pathways. Thus, genetic markers of hypercholesterolemia are expected to regulate cholesterol levels and thus possibly also cardiovascular disease, whereas the reverse is impossible (cholesterol levels affecting genetic variation). One would expect that variants that have the sole action to regulate lipid levels would not improve prediction of cardiovascular disease if lipid levels were included in the prediction model. The fact that they do have an independent effect in the multivariate model makes one suspect that the above caveats are at play, or additionally the lipid levels have been measured with considerable error in the study population. In the presence of large nondifferential measurement error, classic risk factors such as LDL may have their predictive ability diluted. Conversely, genotypes are typically measured with high accuracy and the genetic effects do not shrink from measurement error. If so, then genotyping these variants would be an expensive way of correcting problems with measurement of lipids or other commonplace risk factors that are already in everyday use.

Predictive Reclassification
Another important question in predictive modeling is whether a model including the genetic information can classify individuals into more appropriate or less appropriate risk categories, which may influence therapeutic or preventive decision making. This is different from the overall discriminating ability of the prediction as conveyed, for example, by an AUC; rather, this is an issue of reclassification.36 Reclassification analyses examine how many patients shift into a different risk category based on a new predictive model compared with a standard one; how many of these are appropriately reclassified at higher or lower risk than before versus how many are reclassified in the wrong direction; and finally, whether there are changes in recommended decision making based on this new information.

In the study of Kathiresan et al35 discussed earlier, the AUC did not change materially, but the authors noted that among patients who were originally classified in the III intermediate-risk category, 26% were reclassified into a higher or lower risk category in the risk model containing the genotype score, as compared with the model without the genotype score. The net reclassification index that captures the extent of correct movements in predicted risk categories (higher risk for subjects in whom cardiovascular disease subsequently developed and lower risk for subjects free of incident cardiovascular disease) suggested a significant improvement in risk classification (P=0.01). A significant improvement was also seen in the related integrated discrimination index (P=0.02).
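As a sketch of how a categorical net reclassification index is commonly computed: it is the net proportion of subjects with events who move to a higher risk category plus the net proportion of subjects without events who move to a lower one. The function name and the toy data below are invented for illustration and do not reproduce the cited analysis.

```python
def net_reclassification_index(old_categories, new_categories, events):
    """Categorical NRI; categories are ordered integers, events are 0/1."""
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, event in zip(old_categories, new_categories, events):
        if event:
            n_event += 1
            up_event += new > old          # correct move for an event
            down_event += new < old        # incorrect move for an event
        else:
            n_nonevent += 1
            down_nonevent += new < old     # correct move for a non-event
            up_nonevent += new > old       # incorrect move for a non-event
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)

# Invented data: risk categories 0 (low), 1 (intermediate), 2 (high)
old = [1, 1, 1, 2, 1, 1, 0, 1]
new = [2, 2, 0, 2, 0, 0, 0, 2]
had_event = [1, 1, 1, 1, 0, 0, 0, 0]
nri = net_reclassification_index(old, new, had_event)
print(nri)
```

In this toy example, moves in the correct direction outnumber moves in the wrong direction among both events and non-events, so the index is positive.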

These are promising findings, but caveats exist. First, they are derived from the comparison of the performance of models that have all the limitations alluded to in the previous section. Second, only 9% of the study sample was originally classified in the III intermediate risk category. For most patients, reclassification, even if correct, would not have materially affected optimal decision making. More studies are needed to decipher if there is a reproducible window of risk where genotype information from these or also additional markers may improve decision making.

In another study,37 consideration of rs10757274 did not significantly improve the AUC for coronary heart disease versus using conventional risk factors alone (0.64 versus 0.62, P=0.14), but the authors claimed improved reclassification, with 21.9% of the participants reclassified in different risk categories (<5%, 5% to 10%, 10% to 20%, >20%) and with 63% of these reclassifications being correct (P=0.01). The very low observed discriminating performance (AUC=0.62 only) of the selected conventional risk factors in this study is spurious. Moreover, the observed genetic effect for this marker was quite large in that study (OR 1.38 for heterozygotes). Other data suggest that this single marker alone does not improve reclassification38; this is likely to be the typical case for single common variants.39

The final proof would then lie in the demonstration that genotypic information indeed results in better clinical outcomes. Ideally, this would require randomized trials to be performed. Given that the genomic information is still in its infancy, randomized trials should be considered with caution. A "negative" result for clinical outcomes is very likely, but then it is difficult to tell whether this is proof that genotypic information is useless, or simply genotypic information is still in the making. Nevertheless, in the absence of robust randomized evidence, genomic tests for risk prediction should clearly be labeled as exploratory.


    Conclusions
 
Cardiovascular genetics is undergoing a rapid transformation and robust findings have emerged,5–18,40 after 2 decades of mostly irreproducible findings.41–43 However, we have a very long road ahead before this information can be meaningfully applied for predictive purposes at a population level. We are still dealing with markers of risk with subtle effects, and usually we do not know the culprit genetic variation that these markers are mirroring. Between-population heterogeneity is substantial, underappreciated, or even inappropriately ignored in the gold rush for genome-wide statistical significance. Improvements in prediction based on the current markers are small, if present at all. Clinical portent is not yet sufficiently established. Although one can be excited about the new possibilities for more discoveries, incorporation of these markers into everyday routine clinical practice and public health cannot be justified currently.


    Acknowledgments
 
Sources of Funding

No funding was received for this review.

Disclosures

None.


    References
 
1. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008; 9: 356–369.[CrossRef][Medline]

2. Hindorff LA, Junkins HA, Manolio TA. A catalog of published genome-wide association studies. Available at: www.genome.gov/26525384. Accessed September 20, 2008.

3. Hoggart CJ, Clark TG, De Iorio M, Whittaker JC, Balding DJ. Genome-wide significance for dense SNP and resequencing data. Genet Epidemiol. 2008; 32: 179–185.[CrossRef][Medline]

4. Pe'er I, Yelensky R, Altshuler D, Daly MJ. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol. 2008; 32: 381–385.[CrossRef][Medline]

5. Loos RJ, Lindgren CM, Li S, Wheeler E, Zhao JH, Prokopenko I, Inouye M, Freathy RM, Attwood AP, Beckmann JS, Berndt SI; Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, Jacobs KB, Chanock SJ, Hayes RB, Bergmann S, Bennett AJ, Bingham SA, Bochud M, Brown M, Cauchi S, Connell JM, Cooper C, Smith GD, Day I, Dina C, De S, Dermitzakis ET, Doney AS, Elliott KS, Elliott P, Evans DM, Sadaf Farooqi I, Froguel P, Ghori J, Groves CJ, Gwilliam R, Hadley D, Hall AS, Hattersley AT, Hebebrand J, Heid IM; KORA, Lamina C, Gieger C, Illig T, Meitinger T, Wichmann HE, Herrera B, Hinney A, Hunt SE, Jarvelin MR, Johnson T, Jolley JD, Karpe F, Keniry A, Khaw KT, Luben RN, Mangino M, Marchini J, McArdle WL, McGinnis R, Meyre D, Munroe PB, Morris AD, Ness AR, Neville MJ, Nica AC, Ong KK, O'Rahilly S, Owen KR, Palmer CN, Papadakis K, Potter S, Pouta A, Qi L; Nurses' Health Study, Randall JC, Rayner NW, Ring SM, Sandhu MS, Scherag A, Sims MA, Song K, Soranzo N, Speliotes EK; Diabetes Genetics Initiative, Syddall HE, Teichmann SA, Timpson NJ, Tobias JH, Uda M; SardiNIA Study, Vogel CI, Wallace C, Waterworth DM, Weedon MN; Wellcome Trust Case Control Consortium, Willer CJ; FUSION, Wraight, Yuan X, Zeggini E, Hirschhorn JN, Strachan DP, Ouwehand WH, Caulfield MJ, Samani NJ, Frayling TM, Vollenweider P, Waeber G, Mooser V, Deloukas P, McCarthy MI, Wareham NJ, Barroso I, Jacobs KB, Chanock SJ, Hayes RB, Lamina C, Gieger C, Illig T, Meitinger T, Wichmann HE, Kraft P, Hankinson SE, Hunter DJ, Hu FB, Lyon HN, Voight BF, Ridderstrale M, Groop L, Scheet P, Sanna S, Abecasis GR, Albai G, Nagaraja R, Schlessinger D, Jackson AU, Tuomilehto J, Collins FS, Boehnke M, Mohlke KL. Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet. 2008; 40: 768–775.[CrossRef][Medline]

6. Thorgeirsson TE, Geller F, Sulem P, Rafnar T, Wiste A, Magnusson KP, Manolescu A, Thorleifsson G, Stefansson H, Ingason A, Stacey SN, Bergthorsson JT, Thorlacius S, Gudmundsson J, Jonsson T, Jakobsdottir M, Saemundsdottir J, Olafsdottir O, Gudmundsson LJ, Bjornsdottir G, Kristjansson K, Skuladottir H, Isaksson HJ, Gudbjartsson T, Jones GT, Mueller T, Gottsäter A, Flex A, Aben KK, de Vegt F, Mulders PF, Isla D, Vidal MJ, Asin L, Saez B, Murillo L, Blondal T, Kolbeinsson H, Stefansson JG, Hansdottir I, Runarsdottir V, Pola R, Lindblad B, van Rij AM, Dieplinger B, Haltmayer M, Mayordomo JI, Kiemeney LA, Matthiasson SE, Oskarsson H, Tyrfingsson T, Gudbjartsson DF, Gulcher JR, Jonsson S, Thorsteinsdottir U, Kong A, Stefansson K. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008; 452: 638–642.[CrossRef][Medline]

7. Zeggini E, Scott LJ, Saxena R, Voight BF, Marchini JL, Hu T, de Bakker PI, Abecasis GR, Almgren P, Andersen G, Ardlie K, Boström KB, Bergman RN, Bonnycastle LL, Borch-Johnsen K, Burtt NP, Chen H, Chines PS, Daly MJ, Deodhar P, Ding CJ, Doney AS, Duren WL, Elliott KS, Erdos MR, Frayling TM, Freathy RM, Gianniny L, Grallert H, Grarup N, Groves CJ, Guiducci C, Hansen T, Herder C, Hitman GA, Hughes TE, Isomaa B, Jackson AU, Jørgensen T, Kong A, Kubalanza K, Kuruvilla FG, Kuusisto J, Langenberg C, Lango H, Lauritzen T, Li Y, Lindgren CM, Lyssenko V, Marvelle AF, Meisinger C, Midthjell K, Mohlke KL, Morken MA, Morris AD, Narisu N, Nilsson P, Owen KR, Palmer CN, Payne F, Perry JR, Pettersen E, Platou C, Prokopenko I, Qi L, Qin L, Rayner NW, Rees M, Roix JJ, Sandbaek A, Shields B, Sjögren M, Steinthorsdottir V, Stringham HM, Swift AJ, Thorleifsson G, Thorsteinsdottir U, Timpson NJ, Tuomi T, Tuomilehto J, Walker M, Watanabe RM, Weedon MN, Willer CJ; Wellcome Trust Case Control Consortium, Illig T, Hveem K, Hu FB, Laakso M, Stefansson K, Pedersen O, Wareham NJ, Barroso I, Hattersley AT, Collins FS, Groop L, McCarthy MI, Boehnke M, Altshuler D. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat Genet. 2008; 40: 638–645.[CrossRef][Medline]

8. Sandhu MS, Waterworth DM, Debenham SL, Wheeler E, Papadakis K, Zhao JH, Song K, Yuan X, Johnson T, Ashford S, Inouye M, Luben R, Sims M, Hadley D, McArdle W, Barter P, Kesäniemi YA, Mahley RW, McPherson R, Grundy SM; Wellcome Trust Case Control Consortium, Bingham SA, Khaw KT, Loos RJ, Waeber G, Barroso I, Strachan DP, Deloukas P, Vollenweider P, Wareham NJ, Mooser V. LDL-cholesterol concentrations: a genome-wide association study. Lancet. 2008; 371: 483–491.[CrossRef][Medline]

9. Kathiresan S, Melander O, Guiducci C, Surti A, Burtt NP, Rieder MJ, Cooper GM, Roos C, Voight BF, Havulinna AS, Wahlstrand B, Hedner T, Corella D, Tai ES, Ordovas JM, Berglund G, Vartiainen E, Jousilahti P, Hedblad B, Taskinen MR, Newton-Cheh C, Salomaa V, Peltonen L, Groop L, Altshuler DM, Orho-Melander M. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat Genet. 2008; 40: 189–197.[Medline]

10. Willer CJ, Sanna S, Jackson AU, Scuteri A, Bonnycastle LL, Clarke R, Heath SC, Timpson NJ, Najjar SS, Stringham HM, Strait J, Duren WL, Maschio A, Busonero F, Mulas A, Albai G, Swift AJ, Morken MA, Narisu N, Bennett D, Parish S, Shen H, Galan P, Meneton P, Hercberg S, Zelenika D, Chen WM, Li Y, Scott LJ, Scheet PA, Sundvall J, Watanabe RM, Nagaraja R, Ebrahim S, Lawlor DA, Ben-Shlomo Y, Davey-Smith G, Shuldiner AR, Collins R, Bergman RN, Uda M, Tuomilehto J, Cao A, Collins FS, Lakatta E, Lathrop GM, Boehnke M, Schlessinger D, Mohlke KL, Abecasis GR. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat Genet. 2008; 40: 161–169.[Medline]

11. Samani NJ, Erdmann J, Hall AS, Hengstenberg C, Mangino M, Mayer B, Dixon RJ, Meitinger T, Braund P, Wichmann HE, Barrett JH, König IR, Stevens SE, Szymczak S, Tregouet DA, Iles MM, Pahlke F, Pollard H, Lieb W, Cambien F, Fischer M, Ouwehand W, Blankenberg S, Balmforth AJ, Baessler A, Ball SG, Strom TM, Braenne I, Gieger C, Deloukas P, Tobin MD, Ziegler A, Thompson JR, Schunkert H; WTCCC and the Cardiogenics Consortium. Genomewide association analysis of coronary artery disease. N Engl J Med. 2007; 357: 443–453.[Abstract/Free Full Text]

12. Gudbjartsson DF, Arnar DO, Helgadottir A, Gretarsdottir S, Holm H, Sigurdsson A, Jonasdottir A, Baker A, Thorleifsson G, Kristjansson K, Palsson A, Blondal T, Sulem P, Backman VM, Hardarson GA, Palsdottir E, Helgason A, Sigurjonsdottir R, Sverrisson JT, Kostulas K, Ng MC, Baum L, So WY, Wong KS, Chan JC, Furie KL, Greenberg SM, Sale M, Kelly P, MacRae CA, Smith EE, Rosand J, Hillert J, Ma RC, Ellinor PT, Thorgeirsson G, Gulcher JR, Kong A, Thorsteinsdottir U, Stefansson K. Variants conferring risk of atrial fibrillation on chromosome 4q25. Nature. 2007; 448: 353–357.[CrossRef][Medline]

13. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007; 447: 661–678.[CrossRef][Medline]

14. Helgadottir A, Thorleifsson G, Manolescu A, Gretarsdottir S, Blondal T, Jonasdottir A, Jonasdottir A, Sigurdsson A, Baker A, Palsson A, Masson G, Gudbjartsson DF, Magnusson KP, Andersen K, Levey AI, Backman VM, Matthiasdottir S, Jonsdottir T, Palsson S, Einarsdottir H, Gunnarsdottir S, Gylfason A, Vaccarino V, Hooper WC, Reilly MP, Granger CB, Austin H, Rader DJ, Shah SH, Quyyumi AA, Gulcher JR, Thorgeirsson G, Thorsteinsdottir U, Kong A, Stefansson K. A common variant on chromosome 9p21 affects the risk of myocardial infarction. Science. 2007; 316: 1491–1493.[Abstract/Free Full Text]

15. Zeggini E, Weedon MN, Lindgren CM, Frayling TM, Elliott KS, Lango H, Timpson NJ, Perry JR, Rayner NW, Freathy RM, Barrett JC, Shields B, Morris AP, Ellard S, Groves CJ, Harries LW, Marchini JL, Owen KR, Knight B, Cardon LR, Walker M, Hitman GA, Morris AD, Doney AS; Wellcome Trust Case Control Consortium (WTCCC), McCarthy MI, Hattersley AT. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science. 2007; 316: 1336–1341.[Abstract/Free Full Text]

16. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, Lindgren CM, Perry JR, Elliott KS, Lango H, Rayner NW, Shields B, Harries LW, Barrett JC, Ellard S, Groves CJ, Knight B, Patch AM, Ness AR, Ebrahim S, Lawlor DA, Ring SM, Ben-Shlomo Y, Jarvelin MR, Sovio U, Bennett AJ, Melzer D, Ferrucci L, Loos RJ, Barroso I, Wareham NJ, Karpe F, Owen KR, Cardon LR, Walker M, Hitman GA, Palmer CN, Doney AS, Morris AD, Smith GD, Hattersley AT, McCarthy MI. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007; 316: 889–894.[Abstract/Free Full Text]

17. Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, Vincent D, Belisle A, Hadjadj S, Balkau B, Heude B, Charpentier G, Hudson TJ, Montpetit A, Pshezhetsky AV, Prentki M, Posner BI, Balding DJ, Meyre D, Polychronakos C, Froguel P. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007; 445: 881–885.[CrossRef][Medline]

18. Arking DE, Pfeufer A, Post W, Kao WH, Newton-Cheh C, Ikeda M, West K, Kashuk C, Akyol M, Perz S, Jalilzadeh S, Illig T, Gieger C, Guo CY, Larson MG, Wichmann HE, Marbán E, O'Donnell CJ, Hirschhorn JN, Kääb S, Spooner PM, Meitinger T, Chakravarti A. A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization. Nat Genet. 2006; 38: 644–651.[CrossRef][Medline]

19. Ioannidis JP, Patsopoulos NA, Evangelou E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ. 2007; 335: 914–916.[Free Full Text]

20. Ioannidis JP, Patsopoulos NA, Evangelou E. Heterogeneity in meta-analyses of genome-wide association investigations. PLoS ONE. 2007; 2: e841.[CrossRef][Medline]

21. Sutton AJ, Higgins JP. Recent developments in meta-analysis. Stat Med. 2008; 27: 625–650.[CrossRef][Medline]

22. Ng MC, Park KS, Oh B, Tam CH, Cho YM, Shin HD, Lam VK, Ma RC, So WY, Cho YS, Kim HL, Lee HK, Chan JC, Cho NH. Implication of genetic variants near TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/B, IGF2BP2, and FTO in type 2 diabetes and obesity in 6,719 Asians. Diabetes. 2008; 57: 2226–2233.[Abstract/Free Full Text]

23. Hinohara K, Nakajima T, Takahashi M, Hohda S, Sasaoka T, Nakahara K, Chida K, Sawabe M, Arimura T, Sato A, Lee BS, Ban JM, Yasunami M, Park JE, Izumi T, Kimura A. Replication of the association between a chromosome 9p21 polymorphism and coronary artery disease in Japanese and Korean populations. J Hum Genet. 2008; 53: 357–359.[CrossRef][Medline]

24. Hiura Y, Fukushima Y, Yuno M, Sawamura H, Kokubo Y, Okamura T, Tomoike H, Goto Y, Nonogi H, Takahashi R, Iwai N. Validation of the association of genetic variants on chromosome 9p21 and 1q41 with myocardial infarction in a Japanese population. Circ J. 2008; 72: 1213–1217.[CrossRef][Medline]

25. Tan JT, Dorajoo R, Seielstad M, Sim XL, Ong RT, Chia KS, Wong TY, Saw SM, Chew SK, Aung T, Tai ES. FTO variants are associated with obesity in the Chinese and Malay populations in Singapore. Diabetes. 2008; 57: 2851–2857.[Abstract/Free Full Text]

26. Pan Z, Trikalinos TA, Kavvoura FK, Lau J, Ioannidis JP. Local literature bias in genetic epidemiology: an empirical evaluation of the Chinese literature. PLoS Med. 2005; 2: e334.[CrossRef][Medline]

27. Zollner S, Pritchard JK. Overcoming the winner’s curse: estimating penetrance parameters from case-control data. Am J Hum Genet. 2007; 80: 605–615.[CrossRef][Medline]

28. Ioannidis JP. Why most discovered true associations are inflated. Epidemiology. 2008; 19: 640–648.[CrossRef][Medline]

29. Schunkert H, Götz A, Braund P, McGinnis R, Tregouet DA, Mangino M, Linsel-Nitschke P, Cambien F, Hengstenberg C, Stark K, Blankenberg S, Tiret L, Ducimetiere P, Keniry A, Ghori MJ, Schreiber S, El Mokhtari NE, Hall AS, Dixon RJ, Goodall AH, Liptau H, Pollard H, Schwarz DF, Hothorn LA, Wichmann HE, König IR, Fischer M, Meisinger C, Ouwehand W, Deloukas P, Thompson JR, Erdmann J, Ziegler A, Samani NJ; Cardiogenics Consortium. Repeated replication and a prospective meta-analysis of the association between chromosome 9p21.3 and coronary artery disease. Circulation. 2008; 117: 1675–1684.[Abstract/Free Full Text]

30. Moonesinghe R, Khoury MJ, Liu T, Ioannidis JP. Required sample size and nonreplicability thresholds for heterogeneous genetic associations. Proc Natl Acad Sci USA. 2008; 105: 617–622.[Abstract/Free Full Text]

31. Rockhill B, Newman B, Weinberg C. Use and misuse of population attributable fractions. Am J Public Health. 1998; 88: 15–19.[Free Full Text]

32. Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008; 40: 695–701.[CrossRef][Medline]

33. Lango H; The UK Type 2 Diabetes Genetics Consortium, Palmer CN, Morris AD, Zeggini E, Hattersley AT, McCarthy MI, Frayling TM, Weedon MN. Assessing the combined impact of 18 common genetic variants of modest effect sizes on type 2 diabetes risk. Diabetes. 2008; 57: 3129–3135.[Abstract/Free Full Text]

34. Wilson PW, D'Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB. Prediction of coronary heart disease using risk factor categories. Circulation. 1998; 97: 1837–1847.[Abstract/Free Full Text]

35. Kathiresan S, Melander O, Anevski D, Guiducci C, Burtt NP, Roos C, Hirschhorn JN, Berglund G, Hedblad B, Groop L, Altshuler DM, Newton-Cheh C, Orho-Melander M. Polymorphisms associated with cholesterol and risk of cardiovascular events. N Engl J Med. 2008; 358: 1240–1249.[Abstract/Free Full Text]

36. Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27: 157–172;discussion 207–212.[CrossRef][Medline]

37. Talmud PJ, Cooper JA, Palmen J, Lovering R, Drenos F, Hingorani AD, Humphries SE. Chromosome 9p21.3 coronary heart disease locus genotype and prospective risk of CHD in healthy middle-aged men. Clin Chem. 2008; 54: 467–474.[Abstract/Free Full Text]

38. Paynter NP, Chasman DI, Buring JE, Shiffman D, Cook NR, Ridker PM. Cardiovascular disease risk prediction with and without knowledge of genetic variation at chromosome 9p21.3: The Women’s Genome Health Study. Ann Intern Med. In press.

39. Ioannidis JP. Personalized genetic prediction: too limited, too expensive, or too soon? Ann Intern Med. In press.

40. Manolio TA, Brooks LD, Collins FS. A HapMap harvest of insights into the genetics of common disease. J Clin Invest. 2008; 118: 1590–1605.[CrossRef][Medline]

41. Ntzani EE, Rizos EC, Ioannidis JP. Genetic effects versus bias for candidate polymorphisms in myocardial infarction: case study and overview of large-scale evidence. Am J Epidemiol. 2007; 165: 973–984.[Abstract/Free Full Text]

42. Ioannidis JP, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG. Replication validity of genetic association studies. Nat Genet. 2001; 29: 306–309.[CrossRef][Medline]

43. Morgan TM, Krumholz HM, Lifton RP, Spertus JA. Nonvalidation of reported genetic risk factors for acute coronary syndrome in a large-scale replication study. JAMA. 2007; 297: 1551–1561.[Abstract/Free Full Text]


 

CLINICAL PERSPECTIVE

A large number of common genetic variants associated with diverse phenotypes have been identified recently. A systematic review was performed of genome-wide association-derived markers associated with cardiovascular outcomes or other phenotypes that represent common established risk factors for cardiovascular outcomes. The review examined 28 independent associations with the very strongest statistical support, pertaining to coronary artery disease, myocardial infarction, atrial fibrillation/flutter, prolongation of QT interval, as well as type 2 diabetes, body mass index, high-density lipoprotein levels, low-density lipoprotein levels, and nicotine dependence. Between-study heterogeneity in the genetic effects seems common and would pose a challenge to the generalizability of these markers across different populations. Effect sizes are small and may be even smaller in subsequent replications and meta-analyses. Individualized risk measures are typically very small (proportion of variance explained <1% per marker). When these markers are used in conjunction with traditional predictors, improvement in overall prediction (eg, area under the curve) or risk reclassification is limited and subject to methodological caveats. Despite the excitement about these new discoveries, use for predictive purposes in clinical practice would probably be premature.





