Genome-Wide Association Analysis of High-Density Lipoprotein Cholesterol in the Population-Based KORA Study Sheds New Light on Intergenic Regions
Background— High-density lipoprotein cholesterol (HDLC) is a strong risk factor for atherosclerosis and is assumed to be under considerable genetic control. We aimed to identify gene regions that influence HDLC levels by a genome-wide association analysis in the population-based KORA (Cooperative Health Research in the Region of Augsburg) study.
Methods and Results— In KORA S3/F3 (n=1643), we analyzed 377 865 quality-checked single-nucleotide polymorphisms (SNPs; 500K, Affymetrix, Santa Clara, Calif), complemented by the publicly available genome-wide association results from the Diabetes Genetics Initiative (n=2631) and by replication data from KORA S4 (n=4037) and the Copenhagen City Heart Study (n=9205). Among the 13 SNPs selected from the KORA S3/F3 500K probability value list, 3 showed consistent associations in subsequent replications: 1 SNP 10 kb upstream of CETP (pooled probability value=8.5×10⁻²⁷), 1 SNP approximately 40 kb downstream of LIPG (probability value=4.67×10⁻¹⁰), both independent of previously reported SNPs, and 1 from an already reported region of LPL (probability value=2.82×10⁻¹¹). Bioinformatical analyses indicate a potential functional relevance of the respective SNPs.
Conclusions— The present genome-wide association study identified 2 interesting HDLC-relevant regions upstream of CETP and downstream of LIPG. This draws attention to the importance of long-range effects of intergenic regions, which have been underestimated so far, and may impact future candidate-gene–association studies toward extending the region analyzed. Furthermore, the present study reinforced CETP and LPL as HDLC genes and thereby underscores the power of this type of genome-wide association approach to pinpoint associations of common polymorphisms with effects explaining as little as 0.5% of the HDLC variance in the general population.
Received February 29, 2008; accepted June 13, 2008.
High-density lipoprotein (HDL) particles exhibit multiple antiatherogenic effects,1 and HDL cholesterol (HDLC) concentrations show a strong inverse correlation with the risk of coronary artery disease.2 Epidemiological studies have highlighted the antiatherogenic function of HDLC and showed that an increase of 1 mg/dL in HDLC levels is associated with a 2% and 3% decrease in risk of coronary artery disease in men and women, respectively.3,4 The main function of HDL lies in its ability to remove cholesterol from peripheral tissue (eg, macrophages) and transport it to the liver or other tissues in need of large amounts of cholesterol. This complex process is called reverse cholesterol transport and is thought to represent the basis for the antiatherogenic properties of HDL.1,5 Furthermore, HDL has antioxidative properties due to associated antioxidative enzymes and expresses antiinflammatory activity in various pathways.
In addition to the documented association of low HDLC states with elevated risk for coronary artery disease, high and therefore potentially protective HDLC states have also been reported. Although the pharmacological treatment and prevention of atherosclerosis during the past several decades has focused mainly on lowering low-density lipoprotein cholesterol levels, several pharmacological intervention strategies are under development that aim to increase HDLC levels.4 For example, CETP deficiency is the basis for the development of new drugs to increase HDLC levels.6 However, one of those drugs (torcetrapib) recently was shown to have side effects such as hypertension, which resulted in the discontinuation of its clinical trials.7 If increasing HDLC levels or changing the quality of HDL particles indeed works to influence coronary artery disease, and if a main cause of the failure of torcetrapib was the lack of knowledge about cofactors that are required for functional HDL, the search for additional genes as potential new drug targets will be a major goal of future studies.
HDLC is under considerable genetic control, with heritability estimates of up to 80%.8,9 At least 10 candidate genes are reported to be associated with HDL concentrations,6 but even more are involved in the pathways that regulate reverse cholesterol transport.5 Genetic factors that influence HDLC plasma levels that have been confirmed by epidemiological studies include mutations in ABCA1, CETP, LPL, LIPG, LIPC, LCAT, SCARB1, and polymorphisms in the APOA1/C3/A4/A5 gene cluster (for a systematic overview, see Table 1 and the online-only Data Supplement).
There are several advantages to analyzing HDLC as a quantitative phenotype in a representative population-based sample of subjects. The quantitative nature of the phenotype increases the power of the study considerably. The use of a general population sample reduces the number of subjects taking antilipidemic medication compared with patient groups. The KORA (Cooperative Health Research in the Region of Augsburg) study is of such a design, and the 500K Affymetrix single-nucleotide polymorphism (SNP) panel was applied to genotype a representative subset of the third KORA survey (KORA S3/F3 500K Study). The present analysis strategy was further complemented by the publicly available genome-wide association (GWA) study from the Diabetes Genetics Initiative (DGI; http://www.broad.mit.edu/diabetes/scandinavs/), and initial GWA signals were replicated in the fourth KORA survey (KORA S4) and the Copenhagen City Heart Study. Together, this provides a set of strong epidemiological studies with which to investigate genetic SNP associations with HDLC levels on a genome-wide scale by use of a hypothesis-free approach complemented by a priori knowledge of HDLC candidate genes.
Study Steps and SNP Selection Strategies
We conducted a GWA study with subsequent replication to avoid false-positive findings. Figure 1 illustrates the 3 steps (a GWA analysis, then a first and a second replication level). We implemented 2 parallel SNP selection strategies for follow-up in the replication levels. For the first strategy, we analyzed the Affymetrix 500K SNPs in 1643 subjects from the population-representative KORA S3/F3 cohort10 with HDLC values available (measured during the F3 follow-up) and selected the best SNPs top down from the list of probability values <10⁻⁴, as well as SNPs representative of known HDLC candidate genes with P<10⁻³, for replication in 4037 subjects from KORA S4. We later added the DGI, which included 1348 control subjects and 1283 patients with type 2 diabetes mellitus (genotyped by Affymetrix 500K) and which was made publicly available in April 2007. For the second strategy, we used the HDLC GWA results from both KORA S3/F3 and DGI and selected SNPs with probability values <0.01 in both and with estimates pointing in the same direction, for replication in KORA S4. The significant SNPs from gene regions not yet reported to be associated with HDLC that emerged from either selection strategy were planned to be carried forward to the second replication level, the Copenhagen City Heart Study (n=9205). A detailed description of the studies and genotyping methods is provided in the Data Supplement.
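The second selection strategy amounts to a simple filter over the two sets of GWA results. The following is a minimal sketch of that filter, not the software used in the study; the record layout, the function name, and the three example SNPs are hypothetical and carry made-up values.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    snp: str
    beta_kora: float  # per-allele HDLC effect estimate in KORA S3/F3
    p_kora: float     # probability value in KORA S3/F3
    beta_dgi: float   # per-allele HDLC effect estimate in DGI
    p_dgi: float      # probability value in DGI


def select_for_replication(results, p_max=0.01):
    """Keep SNPs with P < p_max in both scans and effects of the same sign."""
    return [
        r.snp
        for r in results
        if r.p_kora < p_max and r.p_dgi < p_max and r.beta_kora * r.beta_dgi > 0
    ]


# Made-up illustrative values, not study results.
demo = [
    SnpResult("snpA", 1.8, 0.002, 1.1, 0.004),   # passes both criteria
    SnpResult("snpB", 1.5, 0.003, -0.9, 0.006),  # effects point in opposite directions
    SnpResult("snpC", 2.0, 0.0005, 0.7, 0.08),   # DGI probability value too large
]
print(select_for_replication(demo))
```

Requiring a consistent effect direction in addition to the two thresholds removes SNPs whose small probability values in both scans arise from opposite effects.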
Statistical and Bioinformatical Analysis
For the KORA S3/F3 500K sample, we used linear regression, applying an additive genetic model adjusted for age and sex to test each SNP for association with the quantitative phenotype HDLC. Details are provided in the Data Supplement. For the DGI sample, we used the publicly available probability values and estimates derived by linear regression adjusted for age, age², sex, body mass index, study center, and diabetes status.
Replication Sample Analysis
For each SNP, we applied linear regression models adjusted for age and sex that incorporated the additive genetic effect model.
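An additive genetic model adjusted for age and sex can be illustrated as an ordinary least-squares fit of HDLC on the allele count plus the two covariates. The sketch below is an illustration under stated assumptions rather than the analysis code of the study: it assumes NumPy is available, codes genotypes as 0, 1, or 2 copies of the minor allele, and runs on simulated data with an arbitrarily chosen true effect of 3.0 mg/dL per allele.

```python
import numpy as np


def additive_snp_effect(hdlc, dosage, age, sex):
    """Ordinary least squares of HDLC on allele dosage (0/1/2), age, and sex.

    Returns the per-allele effect estimate and its standard error.
    """
    X = np.column_stack([np.ones_like(dosage, dtype=float), dosage, age, sex])
    coef, *_ = np.linalg.lstsq(X, hdlc, rcond=None)
    resid = hdlc - X @ coef
    dof = X.shape[0] - X.shape[1]           # residual degrees of freedom
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficients
    return coef[1], np.sqrt(cov[1, 1])


# Simulated data only; none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 2000
dosage = rng.binomial(2, 0.25, size=n).astype(float)  # minor allele frequency 25%
age = rng.uniform(25, 74, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
hdlc = 50 + 3.0 * dosage - 0.05 * age + 8.0 * sex + rng.normal(0, 15, size=n)

beta, se = additive_snp_effect(hdlc, dosage, age, sex)
print(f"per-allele effect = {beta:.2f} mg/dL (SE {se:.2f})")
```

The coefficient on the dosage column is the estimated HDLC difference per copy of the minor allele, which is the quantity tested for each SNP in an additive model.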
Imputed SNP Data
For gene regions of specific interest, we used imputed SNP data with the MaCH approach and accounted for imputation uncertainty in the association analysis (see Data Supplement for details).
To indicate the potentially functional relevance of regions identified by our GWA analysis, we investigated evolutionary conservation between species and searched for possible regulatory elements such as known microRNAs, predicted transcription factor binding sites, known enhancer regions, ENCODE (ENCyclopedia Of DNA Elements) transcripts, or RNA secondary structures. For this purpose, we also used the ESPERR (Evolutionary and Sequence Pattern Extraction through Reduced Representations) regulatory potential value (for details, see Data Supplement).
Candidate-Gene SNP Identification
Before GWA data analysis, we searched the literature for candidate genes reported in epidemiological studies to be associated with HDLC. If at least 1 SNP of a gene was reported at least twice for association in a large (>500 subjects) study, the gene was considered an HDLC candidate gene.
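The inclusion rule can be written as a small counting procedure. The sketch below assumes one reading of the criterion, namely that the same SNP must have been reported in at least 2 studies that each had more than 500 subjects; the gene names, SNP names, and sample sizes are placeholders.

```python
from collections import defaultdict


def candidate_genes(reports, min_subjects=500, min_reports=2):
    """Return genes with a SNP reported in at least min_reports large studies.

    reports: iterable of (gene, snp, n_subjects) association reports.
    """
    counts = defaultdict(int)
    for gene, snp, n_subjects in reports:
        if n_subjects > min_subjects:
            counts[(gene, snp)] += 1
    return sorted({gene for (gene, _), c in counts.items() if c >= min_reports})


# Placeholder reports, not taken from the literature review.
demo_reports = [
    ("GENE1", "snp1", 1200), ("GENE1", "snp1", 800),  # two large studies
    ("GENE2", "snp2", 3000),                          # reported only once
    ("GENE3", "snp3", 300), ("GENE3", "snp3", 900),   # one study too small
]
print(candidate_genes(demo_reports))
```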
The authors had full access to the data and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.
Characteristics of Study Participants
The participants of the 3 involved population-based studies (KORA S3/F3, KORA S4, and the Copenhagen City Heart Study) displayed very similar characteristics (Data Supplement Table I). With a λ-inflation factor of 0.99, there was clearly no indication of population stratification in the KORA S3/F3 500K sample.
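How the inflation factor was computed is not described here. A common definition, assumed in the following sketch, is the genomic-control λ: the median of the observed 1-df chi-square statistics divided by the median expected under the null hypothesis (about 0.455). The evenly spaced probability values used as input are synthetic and stand in for a scan without inflation.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def inflation_factor(p_values):
    """Genomic-control lambda: median observed chi-square / expected median."""
    z = NormalDist()
    # A two-sided probability value p corresponds to chi-square = z(1 - p/2)^2.
    chi2 = [z.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Synthetic, evenly spaced probability values mimic a scan with no inflation.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = inflation_factor(null_p)
print(round(lam, 2))
```

Values of λ close to 1 indicate that the bulk of the test statistics follows the null distribution, which is the basis for concluding that population stratification is absent.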
Literature-Reported HDLC Candidate Genes
Our literature research revealed 10 genes with SNPs previously reported to be associated with HDLC in at least 2 substantially large studies (Table 1). Epidemiological evidence was strong for CETP, LPL, LIPC, and APOA5. It was less strong or sparse for LIPG, LCAT, APOA1, APOC3, SCARB1, and ABCA1 in contrast to clear functional results or observations in patients with gene mutations with pronounced influences on the gene product.
Results of KORA S3/F3 GWA Analysis
Figure 2 summarizes the results of the main KORA S3/F3 500K analysis. In particular, the quantile-quantile plot illustrates observed significant associations beyond those expected by chance. The specific results for all SNPs with P<10⁻⁴ and for additional SNPs from HDLC candidate genes with P<10⁻³ are summarized in Data Supplement Table IIIA. We detected 54 SNPs with P<10⁻⁴ (38 expected by chance), including several SNPs within or around 4 of the 10 HDLC candidate genes, namely, CETP, LPL, LIPG, and SCARB1.
From this probability value–ranked “hit list,” we selected 4 SNPs representative of the 4 candidate genes and 12 SNPs top down from as yet unreported HDLC genes (until P>2.0×10⁻⁵) and successfully genotyped 13 of these 16 SNPs in the replication sample KORA S4. The minor allele frequencies of all SNPs were consistent in KORA S3/F3 and KORA S4. We found 3 (CETP, LPL, and LIPG) of the 4 HDLC candidate-gene SNPs to be highly significantly associated with HDLC, whereas the SCARB1 SNP did not replicate (Table 2). The later-added DGI estimates for these 4 SNPs were consistent with KORA S4 estimates, and pooled probability values reached genome-wide significance for the SNP near CETP (8.5×10⁻²⁷), LPL (2.82×10⁻¹¹), and LIPG (4.67×10⁻¹⁰). All effects were consistent across sex and 10-year age groups (data not shown).
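The number of SNPs expected below the threshold by chance follows from the number of tests: if no SNP were associated, probability values would be uniformly distributed, so the expected count is the number of SNPs multiplied by the threshold. A short check of this arithmetic, using only figures given in the text:

```python
n_snps = 377_865   # quality-checked SNPs analyzed in KORA S3/F3
threshold = 1e-4   # probability value cutoff used for the hit list
observed = 54      # SNPs reported below the cutoff

# Expected count under a uniform null distribution of probability values.
expected = n_snps * threshold
print(f"expected by chance: {expected:.1f}, observed: {observed}")
```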
None of the 9 SNPs in the as yet unreported HDLC genes yielded significant associations at the 0.05/9=0.006 significance level (Data Supplement Table IIIB). Because the UBXD2 SNP rs16831992 showed strong effects in the same direction in both GWA samples, this SNP was selected for additional genotyping in the Copenhagen City Heart Study, but it did not replicate (Data Supplement Table IIIB).
Results From the Combined KORA S3/F3 and DGI GWA Studies
Data Supplement Table IVA lists 16 gene loci that include 19 SNPs with P<0.01 in both the KORA S3/F3 and the DGI GWA study with estimates pointing in the same direction. Ignoring the 5 SNPs in CETP and LPL, we successfully genotyped 13 SNPs in the replication sample KORA S4; none of these SNPs showed significant associations at the 0.05/13=0.004 significance level (Data Supplement Table IVB).
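The significance levels quoted for the two replication sets are Bonferroni-type thresholds, that is, the overall level of 0.05 divided by the number of SNPs tested. The sketch below reproduces the two divisions; the function name is illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall level at alpha."""
    return alpha / n_tests


# 9 SNPs from the first selection strategy, 13 from the second.
for n_tests in (9, 13):
    print(n_tests, f"{bonferroni_threshold(0.05, n_tests):.4f}")
```

Rounded to 3 decimal places, these values correspond to the reported levels of 0.006 and 0.004.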
A Closer Look at the Gene Regions of Interest by Use of the Imputed SNP Panel
To clarify whether the identified SNP hits (or highly correlated SNPs) were reported previously or whether these SNPs pinpoint new independent gene regions, we used the imputed SNP data to investigate linkage disequilibrium (LD) and block structure, as well as the independence of SNP associations from previously reported associations (see Data Supplement, Statistical Methodology). Figures 3A through 3C depict the probability values for HDL association, with color coding of the correlation of SNPs with the top KORA S3/F3 500K hits and marking of the reported and replicated SNPs. These figures also elucidate the location of neighboring genes and block structure.
From Figure 3A, it can be seen that the CETP 500K top hit (rs1800775) was already reported to be associated with HDLC. In contrast, the second-best 500K SNP hit (rs9989419) appeared to be independent from the already known rs1800775, which is underscored by the lack of correlation (r²<0.20) between the 2 markers and the fact that the rs9989419 association remained virtually unchanged after adjustment for rs1800775. In fact, rs9989419 was independent from all other SNPs (r²<0.3; see more detailed LD plot in Data Supplement Figure I and the LD bin in Data Supplement Table V). The CETP gene appeared to be located in 1 LD block that began approximately 10 kb downstream of rs9989419; rs9989419 was located in a recombination hot spot between LD blocks. This points toward a different functional entity of this SNP compared with the functionally relevant known SNPs in the promoter and the translated region of CETP.
For LPL (Figure 3B), most of the highly associated SNPs were correlated with each other and with the 3 reported SNPs (rs328, rs320, and rs13702). For LIPG (Figure 3C), the observed significant associations were for SNPs approximately 40 to 70 kb downstream of LIPG, and thus, we extended the analyzed region to 225 kb downstream to also include the neighboring gene ACAA2. Three SNPs among the genotyped and imputed SNPs were reported in the literature. All 3 were located in the gene, and none were correlated (r²<0.20) to the LIPG 500K top hits (rs7240405, rs2156552, rs1943981, and rs4939883; all correlated with each other). Most importantly, adjustment of the LIPG 500K top hits for the reported SNPs did not change the observed association. This may pinpoint an HDLC-relevant gene region 40 to 70 kb downstream of LIPG outside the previous focus of candidate-gene studies.
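The correlation criterion used to judge independence of 2 markers can be illustrated with genotype data. The sketch below computes the squared Pearson correlation between the allele counts of 2 SNPs, a common genotype-based approximation of the LD measure r²; whether the study used this estimator or a haplotype-based one is not stated here, and the genotype vectors are toy data.

```python
def r_squared(g1, g2):
    """Squared Pearson correlation between two allele-count vectors (0/1/2)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    return cov * cov / (v1 * v2)


# Toy genotypes for 6 subjects, not study data.
snp_a = [0, 0, 1, 1, 2, 2]
snp_b = [0, 1, 0, 1, 0, 1]  # constructed to be uncorrelated with snp_a

print(r_squared(snp_a, snp_a))  # a marker is perfectly correlated with itself
print(r_squared(snp_a, snp_b))  # below a cutoff such as 0.20
```

A low r² alone shows only that one marker does not tag the other; the adjustment analyses described above address whether the association signals themselves are independent.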
HDLC Candidate-Gene Regions Including Imputed SNP Data
To elucidate the coverage of the HDLC candidate genes among the genotyped and imputed SNPs and to highlight the number of “reinforced” HDLC candidate genes, Data Supplement Table VI summarizes the results of our literature review for specific SNPs in previously reported HDLC candidate genes. Among the 55 literature-reported SNPs for the 10 identified HDLC candidate genes, 2 SNPs were directly covered by the 500K array and a further 25 by the imputed genotype data. These 27 SNPs were located in the gene regions of CETP, LPL, LIPG, LIPC, ABCA1, LCAT, and APOA5. Of these, 8 SNPs showed probability values <0.0005 (CETP, LPL), and 4 SNPs had probability values of 0.03 (CETP, LIPC) for association with HDLC. The other 15 SNPs exhibited probability values >0.05. More detailed results are provided in Data Supplement Table V. When extending this analysis not only to the reported SNPs covered by the genotyped and imputed SNPs but to all available imputed SNPs of all other HDLC candidate genes, we found minor associations in the other HDLC candidate genes, which would not yield statistical significance considering the multiple tests performed (minimal probability values for LIPC, APOA1/C3/A4/A5 gene cluster, ABCA1, and LCAT were 0.002, 0.01, 0.007, and 0.02, respectively).
Bioinformatical Analysis on Functional Relevance
For detailed results of our bioinformatical analysis, see the Data Supplement, including Figures III through VII and Tables VII and VIII. No known regulatory regions, promoter-specific sequences, or noncoding RNAs were found by Genomatix software (Ann Arbor, Mich) within the regions that showed the strongest associations in the present GWA analysis; thus, we focused on searching for putative, as yet unknown regulatory sites by analyzing the conservational pattern and ESPERR scores in these regions.
Because CETP activity is not present in most mammals, it was not surprising that conservation analysis was inconclusive. The region around rs9989419 approximately 10 kb upstream of CETP and located in a recombination hot spot between LD blocks was found to have a high regulatory potential based on ESPERR scores.
For LIPG, the region with our strongest associations (approximately 40 to 70 kb downstream of the gene) showed several conserved regions. One additional short, ≈600-bp highly conserved region with a high regulatory potential was seen immediately upstream of this region. Five transcription factor binding sites were predicted for this short region, 2 of which were binding sites for peroxisome proliferator-activated receptor factors, which have repeatedly been reported to be involved in cholesterol metabolism. Furthermore, a binding site for HNF1 was detected even further downstream, within the region with the strongest association in the present study.
Although for LPL, the evidence for intergenic conservation, regulatory potential, and transcription factor binding sites was generally less pronounced than for LIPG, we found that our best hit, rs17482753, was located in a 2-kb conserved region approximately 6 kb downstream of the gene. Furthermore, 2 databases predicted that the minor allele of rs3289 created a microRNA-145 binding site, which was shown to be strongly expressed in adipose tissue by mouse expression profiles.
On the basis of 2 GWA scans for HDLC that included a total of 4274 subjects and subsequent replication of the highest-scoring SNPs in an additional population-based sample of 4037 subjects, we identified an HDLC-relevant gene region 40 to 70 kb downstream of the LIPG gene not yet indicated by any candidate-gene studies. Furthermore, we pinpointed a potential functional relevance of a polymorphism approximately 10 kb upstream of the CETP gene, which was associated with HDLC independently of the numerous already reported CETP SNPs. Although CETP and, to a lesser extent, LIPG are 2 well-described genes that modulate HDLC in the general population, the present results draw attention to the importance of independent intergenic regions, which has been underestimated thus far.
Findings Regarding Known Candidate Genes
The present study can be considered as proof of principle that GWA analyses successfully identify genes for complex quantitative phenotypes such as HDLC levels in a population-based design. A thorough literature search for genetic epidemiological studies before the present GWA analysis identified 10 reported HDLC candidate genes. We were able to “reinforce” 2 of these genes: CETP reached genome-wide significance in our GWA analysis, and several LPL SNPs exhibited strong “experiment-wise” significance when the replication data were added. The other reported HDLC candidate-gene SNPs were not covered by SNPs on the 500K chip (LCAT), or none (LIPC, LIPG, APOA5, APOA1, and APOC3) or just 2 (ABCA1) of the SNPs reported in the literature were covered by the 500K chip, a well-known limitation of this chip (Data Supplement Table VI). When using the HapMap-based imputed SNP panel within and around the gene regions of interest, we found minor associations with all of these genes, although without statistical significance given the multiple tests performed.
A recent meta-analysis that combined 3 genome-wide scans totaling 8816 individuals identified 5 of the 10 candidate genes (CETP, LPL, LIPC, LIPG, and ABCA1) with genome-wide significance.11,12 This is in accordance with the power considerations of the present study, which indicated that the present KORA S3/F3 GWA analysis had 80% power to detect SNPs explaining 2% of the variance in HDLC on a genome-wide significance level (1.7×10⁻⁷), which corresponded to an HDLC difference of 3.9 mg/dL per minor allele with 25% frequency (eg, CETP effect). The GWA analysis power was only 20% without the replication sample but increased to 100% (92%) to detect an effect of 1% (0.5%) or a 2.8 (2.0)-mg/dL difference in HDLC for 25% minor allele frequencies at a 1-sided significance level of 0.002 (assuming 25 independent tests) when the KORA S4 replication sample was added. Because the recent meta-analyses11–13 were based mainly on disease-ascertained subjects, the results of a population-based study in an ethnically homogeneous sample (south German whites) are needed to confirm these findings and extend them by providing an estimated impact on the general population HDL variance of 1.4% to 0.5% per SNP (Table 2).
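The correspondence between explained variance and per-allele HDLC difference can be checked with the standard expression for an additive SNP under Hardy-Weinberg equilibrium, 2p(1−p)β²/Var(HDLC), where p is the minor allele frequency and β the per-allele effect. The sketch below is a consistency check under an assumption: the HDLC standard deviation of 16.9 mg/dL is not given in the text and was back-calculated so that 3.9 mg/dL at 25% frequency corresponds to 2% of the variance.

```python
def variance_explained(beta, maf, trait_sd):
    """Share of trait variance explained by one additive SNP under Hardy-Weinberg."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2 / trait_sd ** 2


TRAIT_SD = 16.9  # assumed HDLC standard deviation in mg/dL (not stated in the text)

# Per-allele differences quoted in the power considerations, at 25% frequency.
for beta in (3.9, 2.8, 2.0):
    print(beta, f"{variance_explained(beta, 0.25, TRAIT_SD):.3%}")
```

With this single assumed standard deviation, the 3 quoted differences of 3.9, 2.8, and 2.0 mg/dL map to roughly 2%, 1%, and 0.5% of the variance, so the quoted pairs are mutually consistent.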
New Intergenic Regions of Interest
The most important observation of the present study was the fact that strong association signals in CETP, LPL, and LIPG were observed for SNPs in regions hitherto regarded as “intergenic,” up to 70 kb downstream of LIPG and LPL and 10 kb upstream of CETP. This is particularly remarkable because the usual current candidate-gene studies focus on SNPs within the gene ±5 kb. Although the downstream LPL SNPs in the present study showed some correlation with previously reported SNPs within the gene, LIPG signals in the present study were completely independent of SNPs within the gene or any SNP reported previously in candidate-gene studies (Figures 3B and 3C).
Although LIPG was an HDLC candidate owing to our literature search criteria, the epidemiological evidence from candidate-gene studies was sparse, and previously reported SNPs within the LIPG gene did not show an association in the present study. This could be due to rare genetic variants that were undetectable by the present GWA approach explaining considerably less than 0.5% of the HDLC variance, but it could also be due to long-range mechanisms of SNPs in intergenic regions that have not been studied to date.
The identified CETP SNP 10 kb upstream was independent of the numerous SNPs reported in candidate-gene studies and was even located in a recombination hot spot, which strongly supports this region as an independent locus (Figure 3A; Data Supplement Figure I). These results are in line with a recent dense genotyping project in the CETP region that showed that SNPs further upstream of the promoter region have an important impact on HDLC levels.14 The present results clearly show that this effect on HDLC is entirely independent of the other SNPs investigated.
A detailed bioinformatical analysis using various tools supported the possible functional relevance of the regions with the strongest GWA association signals. Besides their influence on transcription factor binding sites, we observed effects of some SNPs on putative microRNA binding sites. A tool with a reported accuracy of 94% indicated a potential for yet unknown regulatory elements. Interestingly, a recent study showed strong effects of transcription factor binding sites in intergenic regions on adjacent genes by interfering or competing with their transcription mechanism without being directly involved in regulation.15 An in-depth functional analysis of these remote-controlled regulatory mechanisms is needed to clearly define which SNP may be functional. A similar example has been reported for chromosomal region 9p21, which was shown to be associated with myocardial infarction and type 2 diabetes mellitus (for review, see Kronenberg16). SNPs with the most significant signals were located >100 kb upstream of the cyclin-dependent kinase inhibitors CDKN2A and CDKN2B, which supports either long-range effects on 1 of these genes or the influence of a gene not yet annotated. This supports the present observation of the relevance of intergenic regions and calls for future functional studies to address this issue.
Consequences for Candidate-Gene Association Studies
The relevance of genetic variation well beyond the gene might have far-reaching consequences for candidate-gene association studies in general. It is conceivable that in the past, plausible candidate genes have been dropped prematurely when no association between intragenic variation and the investigated phenotypes was observed. In particular, the present example of LIPG impressively demonstrates that genetic variation far downstream of the gene can show a pronounced association with a phenotype. In addition, bioinformatical analyses support these regions as potential regions of functional relevance. If these regions are not considered, and the variation located there is not in strong LD with intragenic variation, an association would be missed. It must be determined how far such long-range effects can reach, so that a strategy for the investigation of future candidate-gene regions can be developed. A greater weighting for intragenic regions in previous candidate-gene studies would result in a biased search of phenotype-influencing gene regions and should probably be avoided.
In line with these observations, the GWA approach thus evolves as the “better” candidate gene study, because it enables a much more comprehensive analysis of all known candidate genes, with ad libitum extension beyond gene boundaries and comparability across studies. Previously, candidate-gene studies each investigated different SNPs with different HDLC measurements and often different analysis models, and most of the studies had a strong focus on the promoter and intragenic regions of these genes.
Study Strengths and Limitations
We consider it a major strength of the present study that it is the first general-population–based GWA study on HDLC, which means it has less risk of confounding due to population stratification than ethnically more heterogeneous samples and less risk of loss of power due to medicated participants, as in case-control studies. The present study design also enables the calculation of SNP-explained HDLC variance as it can be expected in the general German population. A great advantage of GWA studies is that they can guide costly and laborious functional studies not only toward new potential gene regions, such as the intergenic regions mentioned here, but also toward specific polymorphisms that may appear to be independent of previously known polymorphisms and that may be of potential interest for functional clarification, such as the CETP rs9989419.
It may be considered a limitation that the effect sizes in the present study are modest, together explaining approximately 2.4% to 4.1% of the variance of HDLC in the general population (Table 2); however, the variants reported here were common, with minor allele frequencies of 10% to 40%, and thus affect a substantial proportion of the population. Furthermore, modulation of HDLC concentrations by certain SNP variants was found in young individuals as frequently as in older individuals, which pinpoints the potential for early prevention. Finally, numerous genetic effects are obviously still undetected given the estimated heritability of HDLC of up to 80%.8,9 The present study and others11–13 appear to indicate that there are numerous small effects, and these may be detected through future meta-analyses of several GWA studies for more common polymorphisms and through analysis of rare variants, as recently demonstrated impressively.17 Because of the population-based nature and ethnic homogeneity of the studies, studies such as the KORA S3/F3 may be of great value for future large GWA meta-analyses, which will be mandatory to identify these small effects. Even if small, any genetic effect will tremendously enhance our knowledge of the mechanisms of HDLC-level modulation. This may also impact future drug development, which may heavily rely on our understanding of the entire HDLC metabolic network to avoid the types of side effects observed recently.7
The present GWA study identified 2 HDLC-relevant regions downstream of LIPG and upstream of CETP that might be of functional relevance. This may draw the attention of future genetic studies to the long-range effects of intergenic regions, which have been neglected so far.
We gratefully acknowledge the technical assistance of Anke Gehringer and Markus Haak and members of the genotyping staff of the Helmholtz Center Munich. We thank all members of the field staffs involved in the MONICA/KORA Augsburg Studies and the Copenhagen City Heart Study. Finally, we express our appreciation to all study participants.
Sources of Funding
This research was funded by grants from the Austrian GEN-AU program “GOLD” to Dr Kronenberg, the German National Genome Research Net to the GSF Institute of Epidemiology, and a specific targeted research project grant from the European Union (FP-2005-LIFESCIHEALTH-6), contract No. 037631, to Drs Tybjærg-Hansen and Frikke-Schmidt. The MONICA/KORA Augsburg studies were financed by the GSF National Research Center for Environment and Health, Neuherberg, Germany, and supported by grants from the German Federal Ministry of Education and Research (BMBF) and the Munich Center of Health Sciences (MC Health) as part of LMUinnovativ.
Kontush A, Chapman MJ. Functionally defective high-density lipoprotein: a new therapeutic target at the crossroads of dyslipidemia, inflammation, and atherosclerosis. Pharmacol Rev. 2006; 58: 342–374.
Lewington S, Whitlock G, Clarke R, Sherliker P, Emberson J, Halsey J, Qizilbash N, Peto R, Collins R. Blood cholesterol and vascular mortality by age, sex, and blood pressure: a meta-analysis of individual data from 61 prospective studies with 55,000 vascular deaths. Lancet. 2007; 370: 1829–1839.
Von Eckardstein A, Nofer JR, Assmann G. High-density lipoproteins and arteriosclerosis: role of cholesterol efflux and reverse cholesterol transport. Arterioscler Thromb Vasc Biol. 2001; 21: 13–27.
Tall AR, Yvan-Charvet L, Wang N. The failure of torcetrapib: was it the molecule or the mechanism? Arterioscler Thromb Vasc Biol. 2007; 27: 257–260.
Perusse L, Rice T, Despres JP, Bergeron J, Province MA, Gagnon J, Leon AS, Rao DC, Skinner JS, Wilmore JH, Bouchard C. Familial resemblance of plasma lipids, lipoproteins and postheparin lipoprotein and hepatic lipases in the HERITAGE Family Study. Arterioscler Thromb Vasc Biol. 1997; 17: 3263–3269.
Willer CJ, Sanna S, Jackson AU, Scuteri A, Bonnycastle LL, Clarke R, Heath SC, Timpson NJ, Najjar SS, Stringham HM, Strait J, Duren WL, Maschio A, Busonero F, Mulas A, Albai G, Swift AJ, Morken MA, Narisu N, Bennett D, Parish S, Shen H, Galan P, Meneton P, Hercberg S, Zelenika D, Chen WM, Li Y, Scott LJ, Scheet PA, Sundvall J, Watanabe RM, Nagaraja R, Ebrahim S, Lawlor DA, Ben-Shlomo Y, Davey-Smith G, Shuldiner AR, Collins R, Bergman RN, Uda M, Tuomilehto J, Cao A, Collins FS, Lakatta E, Lathrop GM, Boehnke M, Schlessinger D, Mohlke KL, Abecasis GR. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat Genet. 2008; 40: 161–169.
Kathiresan S, Melander O, Guiducci C, Surti A, Burtt NP, Rieder MJ, Cooper GM, Roos C, Voight BF, Havulinna AS, Wahlstrand B, Hedner T, Corella D, Tai ES, Ordovas JM, Berglund G, Vartiainen E, Jousilahti P, Hedblad B, Taskinen MR, Newton-Cheh C, Salomaa V, Peltonen L, Groop L, Altshuler DM, Orho-Melander M. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat Genet. 2008; 40: 189–197.
Wallace C, Newhouse SJ, Braund P, Zhang F, Tobin M, Falchi M, Ahmadi K, Dobson RJ, Marcano AC, Hajat C, Burton P, Deloukas P, Brown M, Connell JM, Dominiczak A, Lathrop GM, Webster J, Farrall M, Spector T, Samani NJ, Caulfield MJ, Munroe PB. Genome-wide association study identifies genes for biomarkers of cardiovascular disease: serum urate and dyslipidemia. Am J Hum Genet. 2008; 82: 139–149.
Thompson JF, Wood LS, Pickering EH, Dechairo B, Hyde CL. High-density genotyping and functional SNP localization in the CETP gene. J Lipid Res. 2007; 48: 434–443.
De Gobbi M, Viprakasit V, Hughes JR, Fisher C, Buckle VJ, Ayyub H, Gibbons RJ, Vernimmen D, Yoshinaga Y, de Jong P, Cheng JF, Rubin EM, Wood WG, Bowden D, Higgs DR. A regulatory SNP causes a human genetic disease by creating a new transcriptional promoter. Science. 2006; 312: 1215–1217.
Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R, Hobbs HH. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science. 2004; 305: 869–872.
High-density lipoprotein cholesterol (HDLC) is a strong risk factor for atherosclerosis and is assumed to be under considerable genetic control. We applied the hypothesis-free approach of a genome-wide association analysis to identify gene regions that influence HDLC levels. On the basis of 2 genome-wide association scans for HDLC that included a total of 4274 subjects and subsequent replication of the highest-scoring SNPs in a further population-based sample of 4037 subjects, we identified an HDLC-relevant gene region 40 to 70 kb downstream of the LIPG gene not yet indicated by any candidate-gene studies. Furthermore, we pinpointed the potential functional relevance of a polymorphism approximately 10 kb upstream of the CETP gene, which was associated with HDLC concentrations independently of the numerous already reported CETP SNPs. Although CETP and, to a lesser extent, LIPG are 2 well-described genes that modulate HDLC in the general population, our results draw attention to the importance of independent intergenic regions, which has been underestimated to date. The relevance of genetic variation far outside the gene might have far-reaching consequences for candidate-gene association studies in general. It is conceivable that plausible candidate genes have been dropped prematurely in the past when no association between intragenic variation and the investigated phenotypes was observed. Recent studies show that regulatory regions for particular genes can sometimes be megabases away from the studied gene, and our bioinformatical analysis suggested several regions with high regulatory potential in the upstream and downstream regions of the mentioned genes. Therefore, a genome-wide association approach evolves as the “better” candidate-gene study, because it enables a much more comprehensive analysis with an ad libitum extension beyond gene boundaries.
The online-only Data Supplement is available with this article at http://circgenetics.ahajournals.org/cgi/content/full/1/1/10/DC1.
The authors wish to dedicate this work to Professor Gerd Utermann. We consider him one of the pioneers in the investigation of genetic variability in lipoprotein metabolism since he described the apolipoprotein E polymorphism as well as the kringle-IV repeat polymorphism of the apolipoprotein (a) gene.