A paper published in JAMA a few days ago (Paynter et al., Association Between a Literature-Based Genetic Risk Score and Cardiovascular Events in Women) showed that a genetic score based on 101 CVD-related SNPs was not effective as a screening tool. So what happened to the promise of GWAS? One or two SNPs are not enough, but we were told that once we got to panels of 30, 40 or more we would see something real and useful. Apparently we are not seeing it, so is it over? Is GWAS heading for the same fate as the candidate gene studies, where initial positives were generally shown to be chance?
Maybe not. But maybe this study is proving, just as with candidate gene studies, that it is naïve to think that “blind” association studies of genes with complex diseases will provide useful results. These are diseases where the causes are manifold, involving genetics, behaviour and environment, but where in general just the disease end-point and the genetics are actually measured.
The Paynter paper used a collection of SNPs chosen from the NHGRI database that had an association with either:
“…cardiovascular disease (myocardial infarction [MI], stroke, coronary disease, and/or cardiovascular death) or an intermediate phenotype (total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, blood pressure, diabetes, hemoglobin A1c or fasting blood glucose, and high-sensitivity C-reactive protein)…”
Wow, that’s quite a heterogeneous mixture of associations, and it may be one of the problems. They calculated the genetic score and compared it to the endpoint: “incident MI, ischemic stroke, coronary revascularization, and cardiovascular deaths, which were combined to calculate total cardiovascular disease.”
Again, a lot of heterogeneity – are we asking too much? CVD is not just one disease with one cause; it’s a constellation of diseases involving some or all of inflammation, dyslipidaemia, hypertension, etc. If we took it to the next extreme we could say why not add all of the type 2 diabetes-related SNPs and then see if we have a genetic predictor of the disease “cardiovascular diabetes” – and what would that cover, about 60-70% of the population at risk?
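To make the idea of a genetic score concrete, here is a minimal Python sketch. It assumes the simplest possible construction, an unweighted count of risk alleles across the panel; the exact scoring used by Paynter et al. is not described in this post, and the SNP names and genotypes below are made-up placeholders.

```python
from typing import Dict

# Hypothetical genotypes: number of risk alleles carried at each SNP (0, 1 or 2).
# The SNP identifiers are placeholders, not SNPs from the Paynter panel.
genotypes: Dict[str, int] = {"snp_a": 2, "snp_b": 0, "snp_c": 1, "snp_d": 1}


def allele_count_score(risk_allele_counts: Dict[str, int]) -> int:
    """Unweighted genetic risk score: total number of risk alleles carried."""
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: expected 0, 1 or 2 risk alleles, got {count}")
    return sum(risk_allele_counts.values())


score = allele_count_score(genotypes)
print(score)  # 4
```

In a score built this way every SNP counts equally, whether it was originally found for LDL cholesterol, blood pressure or MI, so a heterogeneous panel simply becomes a longer sum.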
So a likely problem lies in the initial GWAS studies, which use imprecise endpoints to discover “blind” (as in hypothesis-free) associations and so yield a very imprecise and apparently useless screening tool. Another “problem” is that most of the SNPs are related to intermediate phenotypes (lipid levels, hypertension, inflammation etc.), so of course the genetic score is not going to measure a risk that is “over and above” the contribution of traditional risk factors and family history. The CFTR mutation in cystic fibrosis will not contribute much “over and above” the sweat test as a predictor of the disease, but this doesn’t mean that genetics has no value. (No help yet in the case of CFTR, of course, but it would obviously be nice to have a predictor that didn’t depend on risk factor levels already being raised.)
But apart from the poorly focussed, heterogeneous endpoints (and therefore associations) from GWAS, one major missing feature of almost all studies is the effect of environment/behaviour on the associations. Generally this means assessing diet and lifestyle, which is very difficult. But is this where some of the “missing” heritability lies? When you are looking at a disease which affects 40% of the population, a lot more than 40% will be “genetically” predisposed, and many of them did not get the disease because of a healthier lifestyle. This makes finding a control group for the initial GWAS rather tricky – are the healthy controls healthy because they don’t have the risk alleles or because they have different lifestyles/environment? Almost certainly a mixture of both, but one is usually ignored, resulting in dampened-down, low ORs and probably some missed associations.
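A toy calculation shows how ignoring lifestyle can dampen an odds ratio. The counts below are invented purely for illustration (they come from no study), and the sketch assumes an extreme case where the risk allele matters only in people with a poor lifestyle.

```python
def odds_ratio(carrier_cases, carrier_controls, noncarrier_cases, noncarrier_controls):
    """Odds ratio for disease in carriers versus non-carriers of a risk allele."""
    return (carrier_cases * noncarrier_controls) / (carrier_controls * noncarrier_cases)


# Hypothetical counts: the allele raises risk only when lifestyle is poor.
poor_lifestyle = dict(carrier_cases=60, carrier_controls=40,
                      noncarrier_cases=30, noncarrier_controls=70)
healthy_lifestyle = dict(carrier_cases=10, carrier_controls=90,
                         noncarrier_cases=10, noncarrier_controls=90)

# What a study sees if lifestyle is never measured: the two groups added together.
pooled = {key: poor_lifestyle[key] + healthy_lifestyle[key] for key in poor_lifestyle}

or_poor = odds_ratio(**poor_lifestyle)
or_healthy = odds_ratio(**healthy_lifestyle)
or_pooled = odds_ratio(**pooled)

print(f"{or_poor:.2f} {or_healthy:.2f} {or_pooled:.2f}")  # 3.50 1.00 2.15
```

With these made-up numbers the allele has an OR of 3.5 in the poor-lifestyle group and 1.0 in the healthy group, but pooling the two gives about 2.15, noticeably weaker than the effect in the people for whom the allele actually matters.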
The other possibility is one which is becoming more fashionable right now – the many rare variants proposal. It’s attractive from a certain point of view (nice to get the multimillion $ grants to buy the latest sequencing technology and play away), but it is still hard to imagine getting the right answer without environment/behaviour assessment. Smoking is always assessed in cancer studies. It is much easier to assess, and no-one would think of looking for cancer genes without the smoking assessment, so why is it OK to skip lifestyle with CVD and diabetes?
Of the much maligned candidate gene studies, those that have held up have been those where the gene-environment interaction was assessed. A good example is the SOD2 rs4880 SNP and breast/prostate cancer. An early publication linked the Ala allele to breast cancer but subsequent publications were contradictory (Table 1). However, ALL studies which looked at gene + disease + diet gave the same result – the Ala allele was linked to increased risk when antioxidant intake was low (Table 2).
Table 1. SOD2 Ala-Val allele association with cancer
* However, a proper evaluation of this polymorphism with cancer link demands experiments involving large sample size, cross-tabulation of gene-gene, gene-environment interactions (author comment)
Cancer | Allele linked to risk | Study |
Ovarian | Ala | Olsen, 2004 |
Lung | Val | Liu, 2004 |
Lung | Val | Liu, 2004 |
Bladder | Neither | Terry, 2005 |
Breast & prostate | Ala | Taufer, 2005 |
Breast | Neither | Cebrian, 2006 |
Breast | Neither | Oestergaard, 2006 |
Lymphoma | Ala | Lightfoot, 2006 |
Breast | Neither | Justenhoven, 2008 |
Brain | Ala | Rajaraman, 2008 |
Breast (meta-analysis) | Neither | Bag, 2008* |
Table 2. SOD2 Ala-Val: Gene x Environment association with cancer
Cancer | Allele linked to risk at low antioxidant intake | Study |
Breast | Ala | Ambrosone, 1999 |
Breast | Ala | Cai, 2004 |
Prostate | Ala | Li, 2007 |
Prostate | Ala | Kang, 2007 |
Prostate | Ala | Mikhak, 2008 |
Prostate | Ala | Cooper, 2008 |
We see the same with a diabetes panel in a recent publication (Qi et al., 2009, Genetic predisposition, Western dietary pattern, and the risk of type 2 diabetes in men). They used a panel of GWAS SNPs and assessed the impact of diet on the risk score: a high risk score together with a poor diet led to much higher risk ratios.
We’re getting quite impatient – at Eurogene we would like to start using effective panels. The commercial offerings are interesting but not really that useful in the clinic, at least for our needs. We are looking forward to GWAS studies with precise goals (narrowly defined end points) which take lifestyle/environment into consideration. These will lead to useful tools because, by establishing the gene-environment interactions, it becomes much easier to understand what the risk-lowering interventions are, which is much more useful than just being told you have a higher risk of heart disease or diabetes. These studies WILL lead to genetic predictive tools that will be useful in clinical decision making, especially for prevention. I’m confident they will be coming along, not least because I know that one of the groups actively researching this area is that of Jose Ordovas and Larry Parnell from Tufts, a group that has discovered (and continues to discover/confirm) many gene-environment associations in candidate genes which have stood the test of time. They are already starting to publish and I’m sure many more will be coming (no pressure then Larry!) – keep up with the news on Twitter and via Larry’s blog (@larry_parnell and “Variable Genome”).
Keith,
This is a brilliant post. I totally agree this is the holy grail of nutrigenomics and genomic medicine. The problem we currently have is that disease is limited to see touch taste feel and mass spec. We still don't have the specifics of the subtype of CAD/HTN/DM2/I could go on and on.....
I was hoping GWAS would sort some of that out, it has in some ways, but if you look for green instead of douglas fir trees you end up finding a whole lot of noise.