Sunday, February 5, 2017

Introductory guide to genetic association analysis now available

P-values. Odds ratios and betas. GWAS. Linkage disequilibrium. What does it all mean?

Human geneticists are, of course, intimately familiar with these concepts. But for people who are not human geneticists, just getting past the terminology can be frustrating. So we’ve written a basic primer and reference guide that can help users of the T2D Knowledge Portal understand the information presented in our interfaces and tools.

Our Introduction to genetic association analysis guide is available from our Resources page. Or download it here (PDF).

This guide provides a basic introduction to the rationale behind applying human genetic association studies to complex diseases like T2D, explains some of the parameters of genetic associations such as p-values and odds ratios, and describes the different types of experiment used to determine genetic associations.

Many thanks to Andrew Morris, University of Oxford, for his thoughtful review and helpful comments on this guide.

We would be happy to hear your suggestions for improvements and additions!

Monday, January 23, 2017

Insulin Sensitivity Index data added to the Portal

The loss of sensitivity to insulin, often termed insulin resistance, is characteristic of type 2 diabetes. Since this sensitivity is difficult to measure directly, researchers have developed an index that reflects it: the modified Stumvoll Insulin Sensitivity Index (ISI). The index is derived by a formula that combines fasting insulin levels with glucose and insulin levels measured two hours after a glucose load.

Now, the results of a study of genetic associations of variants with ISI are available in the T2D Knowledge Portal. These results are from a recent paper in Diabetes by co-first authors Geoffrey Walford, Stefan Gustafsson, Denis Rybin, and fellow members of the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC). (For an overview of the results, see our blog post about the paper.)

In this study, ISI was calculated for 16,753 non-diabetic individuals, and associations of their variants with ISI values were analyzed. The associations were adjusted in one of three ways: for age and sex; for age, sex, and body mass index (BMI); or according to a model that analyzed the combined influence of the genotype effect adjusted for BMI and the interaction effect between the genotype and BMI on ISI. More details about this data set and others from MAGIC may be found on our Data page.

ISI associations are a subset of the MAGIC GWAS data set. They may be viewed in the Portal by selecting one of these phenotypes:
  • ISI adjusted for age-sex
  • ISI adjusted for age-sex-BMI
  • ISI adjusted for genotype-BMI interaction
Associations with these phenotypes can be found in these locations on Portal pages:
  • On Gene Pages (see an example) in the Variants & Associations table
  • On Variant Pages (see an example) in the Associations at a glance section and in the Association statistics across traits table
  • Via the Variant Finder tool, for the phenotypes listed above
  • A "Manhattan plot" of associations across the genome may be seen by selecting one of the phenotypes listed above in the View full genetic association results for a phenotype scroll box on the Portal home page.

Thursday, January 19, 2017

CAMP GWAS data set moves to Early Access Phase 2

Three months ago, we incorporated a data set from the MGH Cardiology and Metabolic Patient Cohort (CAMP) into the T2D Knowledge Portal. These data were contributed by Pfizer, Inc. as part of a public-private partnership to generate genotype data for a cardiometabolic and prediabetic cohort; they add individual-level genetic association data for type 2 diabetes (T2D), fasting glucose levels, and fasting insulin levels from more than 3,500 samples to the Portal knowledgebase. Now, the CAMP GWAS data set has transitioned to Early Access Phase 2 status in the Portal.

The CAMP GWAS data set was the first to be included in the Portal with “Early Access” status, which is assigned to new data. As described on our Policies page, all newly added data sets have Early Access status for the first six months that they are in the Portal. In the first three months, Phase 1 of the Early Access period, the data have undergone quality control checks but they are not considered to be in their final form. The purpose of Phase 1 is to allow Portal users to review and analyze the data in order to identify any potential problems or areas needing further analysis. After this three-month period, data sets move to Phase 2, indicating that the data are in final form and are fully integrated into the Portal.
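The timeline described above can be sketched in a few lines of Python. This is only an illustration, not Portal code: the function names `add_months` and `access_status` are hypothetical, the dates are made up, and the three-month boundaries are treated as exact calendar months, which is an assumption. In practice, phase transitions are announced by the Portal team and need not fall on exact calendar boundaries.

```python
from datetime import date
import calendar


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month
    (e.g. November 30 plus three months gives February 28).
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def access_status(added_on, today):
    """Hypothetical helper: map a data set's age to its access status.

    Assumes exact calendar-month boundaries:
      first three months  -> Early Access Phase 1
      next three months   -> Early Access Phase 2
      after six months    -> Open Access
    """
    if today < added_on:
        raise ValueError("data set has not been added yet")
    if today < add_months(added_on, 3):
        return "Early Access Phase 1"
    if today < add_months(added_on, 6):
        return "Early Access Phase 2"
    return "Open Access"


# Made-up example: a data set added on January 1, 2017.
added = date(2017, 1, 1)
for day in (date(2017, 2, 15), date(2017, 4, 1), date(2017, 7, 1)):
    print(day.isoformat(), access_status(added, day))
```

The status is computed from the date the data set was added rather than stored, so a single date per data set is enough to reproduce the whole schedule.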

Portal users must not submit manuscripts concerning new data until both Phase 1 and Phase 2 of the Early Access period have passed, and any results of analyses or proposed publications are subject to the "Fort Lauderdale Principles" articulated for the sharing of genomic data.

In three months, the CAMP GWAS data set will become Open Access, meaning that it may be freely used for research as long as Portal users comply with our guidelines on user responsibilities and proper citation. It is important to note that in order to protect patient privacy, individual-level data in the Portal are never directly accessible to users. Rather, the Portal makes available summary statistics derived from the data, and also provides tools (such as the Genetic Association Interactive Tool (GAIT) and the Interactive Burden Test) that allow users to perform custom analyses based on individual-level data while protecting the security and privacy of those data.

Find CAMP data at all of these locations in the Portal:

  • On Gene Pages (e.g., HLA-C) in the Variants & Associations table.
  • On Variant Pages (e.g., rs9468919) in the Associations at a glance section and in the Association statistics across traits table.
  • Via the Variant Finder tool, for the phenotypes T2D, fasting glucose, and fasting insulin.
  • Via the Genetic Association Interactive Tool (GAIT), which enables custom association analysis for either single variants (available on Variant Pages) or for the set of variants in and near a gene (Interactive burden test, available on Gene Pages).
  • A "Manhattan plot" of genetic associations across the genome may be accessed by selecting the phenotype T2D, fasting glucose, or fasting insulin in the "View full genetic association results for a phenotype" selection box on our home page, and then choosing the CAMP GWAS data set.

Find many more details about the CAMP GWAS data set on our Data page, or read a summary in this blog post.

Tuesday, January 17, 2017

New Year, New Data: BioMe AMP T2D GWAS

We’re happy to announce the first addition of data to the Type 2 Diabetes Knowledge Portal in 2017: the BioMe AMP T2D GWAS data set. The generation of these data was funded by the Accelerating Medicines Partnership in Type 2 Diabetes (AMP T2D), a collaboration between multiple stakeholders that aims to catalyze the clinical translation of genetic discoveries by producing and aggregating data, developing and implementing novel analytical methods and tools, and building infrastructure for data storage and presentation.

The BioMe AMP T2D GWAS data set is the first set to be entirely produced by the AMP T2D project, which supplied the funding and carried out every step of its production, from data generation to analysis, quality control, and presentation. Its immediate availability in the Portal, prior to publication, fulfills the mission of AMP T2D to speed up access to and utilization of new data.

These data were generated at the Charles Bronfman Institute for Personalized Medicine BioMe BioBank, a biorepository located at the Mount Sinai Medical Center (MSMC) in the upper Manhattan area of New York City. MSMC serves a diverse population of over 800,000 outpatients each year. Importantly, since many BioMe participants are African American or Hispanic/Latino, this data set adds significant ethnic diversity to the Portal’s genetic association data.

The BioMe AMP T2D GWAS data set comprises about 13,000 unique individuals, 41.5% of whom are admixed American, 38% African American, and 20% European. Subjects were genotyped using at least one of three platforms: the Illumina Exome Array, the Illumina GWAS array, or the Affymetrix GWAS array. Their T2D status was assessed by an algorithm, and many additional traits were also measured.

The data were subjected to quality control and association analysis by the Analysis Team at the AMP Data Coordinating Center (DCC) at the Broad Institute. Variant associations with T2D, fasting glucose levels, and HbA1c levels were analyzed. The top results included both previously known and novel variants, with only a single variant reaching genome-wide significance: T2D association of the variant rs7903146, within the well-established T2D risk gene TCF7L2. Now that these results are available in the T2D Knowledge Portal, the ability to analyze them further in the context of all other available T2D association data may lead to additional insights.

The BioMe AMP T2D GWAS data set currently has the “Early Access Phase 1” status that is assigned to new data. This status denotes that although analysis and quality control checks have been performed, the data are not yet considered to be in their final state. During the early access period, users may analyze the data but may not submit the results of these analyses for publication. Find the full details about the different phases of data release on our Policies page. More information about the data set, along with links to download more detailed reports on its quality control and analysis, may be found in the BioMe AMP T2D GWAS section of our Data page.

BioMe AMP T2D GWAS data are available at these locations in the Portal:

  • On Gene Pages (see an example) in the Variants & Associations table and the Minor allele frequencies across data sets table
  • On Variant Pages (see an example) in the Associations at a glance section and in the Association statistics across traits table
  • Via the Variant Finder tool, for these phenotypes: type 2 diabetes; fasting glucose adjusted for age and sex; HbA1c adjusted for age and sex; and HbA1c adjusted for age, sex, and body mass index
  • A "Manhattan plot" of associations across the genome may be seen by selecting one of the phenotypes above in the View full genetic association results for a phenotype scroll box on the Portal home page, and then selecting the BioMe AMP T2D GWAS data set.

As always, please contact us with any questions, comments, or suggestions.

Thursday, November 17, 2016

Collaborate with us!

One of the goals of the Type 2 Diabetes Knowledge Portal project is to bring together the world-wide T2D and genetics research communities to share data, knowledge, methods, and tools. In keeping with that goal, we welcome contributions of data to the Portal and we are also open to collaboration as we develop new and better ways to analyze and display data.

We’ve added a new page to the Portal, "Collaborate," that answers frequently asked questions about how to get involved. It includes links to our Data Submitter’s Guide and Data Transfer Agreement, gives an overview of the kinds of data we’re looking for, and tells you how to get in touch with our team.

The “Collaborate” page also links to information about funding opportunities offered by the Foundation for the NIH. Check this out if you’re interested in starting a new project to generate data for the Portal!

Monday, November 7, 2016

New MGH Cardiology and Metabolic Patient Cohort data in the T2D Knowledge Portal

We are pleased to announce a new data set in the T2D Knowledge Portal, from the MGH Cardiology and Metabolic Patient Cohort (CAMP). These data were contributed by Pfizer, Inc. as part of a public-private partnership to generate genotype data for a cardiometabolic and prediabetic cohort. This data set adds individual-level genetic association data for type 2 diabetes (T2D), fasting glucose levels, and fasting insulin levels from more than 3,500 samples to the Portal knowledgebase. Association data for additional phenotypes from this cohort will be incorporated in the future.

The inclusion of this data set in the T2D Knowledge Portal illustrates the uniqueness of the Accelerating Medicines Partnership, which brings together pharmaceutical companies and non-profit institutions with the goal of speeding up the discovery of new targets for treatment of T2D. The pharmaceutical partners in this collaboration have committed not only to providing funding, but also to sharing the data they generate. The CAMP data set contributed by Pfizer is the first set from a pharmaceutical partner to be made available in the Portal.

Another unique aspect of this data set is that it is the first to be included in the Portal with “Early Access Phase 1” status, which is assigned to new data. This status denotes that although analysis and quality control checks have been performed, the data are not yet considered to be in their final state. During the early access period, users may analyze the data but may not submit the results of these analyses for publication. Find the full details about the different phases of data release on our Policies page.

The CAMP cohort consists of 3,857 subjects who were recruited at the Massachusetts General Hospital Heart Center between 2008 and 2012. In addition to genotyping, the subjects had either vascular reactivity measurements (for T2D patients) or an oral glucose tolerance test (for patients not known to have T2D), and samples of their plasma and serum were analyzed. Most of the subjects were of European ancestry; about 10% were African American.

The analysis and quality control processes for this data set were performed by the Analysis Team of the Accelerating Medicines Partnership Data Coordinating Center (AMP-DCC) at the Broad Institute, and are completely transparent and fully documented. The experiment design and analysis are summarized on our Data page, and detailed reports are available for download. Going forward, all new data sets added to the Portal will be fully documented in this manner.

One intriguing—and somewhat puzzling—result from the analysis highlights the utility of incorporating data sets like this one into the Portal. The variant most strongly associated with T2D (at genome-wide significance) in this set is located in the major histocompatibility complex region near the HLA-C gene.

Known associations of genes in this region with type 1 diabetes, along with a low local recombination rate (and thus extensive linkage disequilibrium), make it challenging to interpret the meaning of this association. However, it certainly merits further investigation because of its genome-wide significance. The inclusion of this data set in the Portal, in the context of all other available data about T2D associations in the region, greatly facilitates the further analysis of this and other associations in the set.

The CAMP data may be accessed via multiple interfaces in the Portal. They are shown in tables of summary statistics and accessible in variant searches using the Variant Finder. Importantly, since the data are individual-level, samples may be filtered by various parameters and used for custom association analysis in our Genetic Association Interactive Tool (GAIT).

Find CAMP data at all of these locations in the Portal:

  • On Gene Pages (e.g., HLA-C) in the Variants & Associations table.
  • On Variant Pages (e.g., rs9468919) in the Associations at a glance section and in the Association statistics across traits table.
  • Via the Variant Finder tool, for the phenotypes T2D, fasting glucose, and fasting insulin.
  • Via the Genetic Association Interactive Tool (GAIT), which enables custom association analysis for either single variants (available on Variant Pages) or for the set of variants in and near a gene (Interactive burden test, available on Gene Pages).

Tuesday, November 1, 2016

View ASHG posters from the T2D Knowledge Portal team

Did you miss the American Society of Human Genetics 2016 Annual Meeting last month in Vancouver? Or did you attend, but weren’t able to get to our posters among the hundreds that were there?

Now you can catch up on everything you missed from the Portal team. We’ve uploaded our posters to the open access publishing platform F1000Research, where you can view or download them. The Portal team presented four posters:


1. Automated, scalable quality control of heterogeneous exome sequence data. This poster, presented by Ryan Koesterer, a member of the Analysis Team of the Accelerating Medicines Partnership Data Coordinating Center (AMP-DCC), describes a new, scalable method for quality control of exome sequence data, applied to data before they are incorporated into the Portal.

Citation: Koesterer R, von Grotthuss M, Flannick J et al. Automated, scalable quality control of heterogeneous exome sequence data [v1; not peer reviewed]. F1000Research 2016, 5:2609 (poster) (doi: 10.7490/f1000research.1113354.1)


2. The Type 2 Diabetes Knowledge Portal: a paradigm for the democratization of human genetic information. This poster, from Portal content and community manager Maria Costanzo, presents an introduction to the Portal: its purpose, its content, and what kinds of questions it allows you to ask.


Citation: Costanzo MC and Accelerating Medicines Partnership: Type 2 Diabetes. The type 2 diabetes knowledge portal: a paradigm for the democratization of human genetic information [v1; not peer reviewed]. F1000Research 2016, 5:2607 (poster) (doi: 10.7490/f1000research.1113352.1)


3. A software platform facilitating community analyses of genetic datasets for complex disease. This poster from Benjamin Alexander, on the Portal software engineering team, describes the tools in the Portal that allow you to do both forward and reverse genetic analysis and even perform custom association analysis.

Citation: Alexander B, Duby M, Sanders M et al. A software platform facilitating community analyses of genetic datasets for complex disease [v1; not peer reviewed]. F1000Research 2016, 5:2608 (poster) (doi: 10.7490/f1000research.1113353.1)


4. Mapping variants to amino-acid changes in three-dimensional protein space improves aggregate association test power and suggests mechanisms of action. This poster, presented by Portal computational biologist Marcin von Grotthuss, illustrates a new method for evaluating the significance of variants by considering the protein structural context of the amino acids they encode. A long-term goal is to incorporate this analysis into the Portal.


Citation: von Grotthuss M, Florez JC, Flannick J et al. Mapping variants to amino-acid changes in three-dimensional protein space improves aggregate association test power and suggests mechanisms of action [v1; not peer reviewed]. F1000Research 2016, 5:2610 (poster) (doi: 10.7490/f1000research.1113355.1)


We hope you find these posters informative! Please let us know if you have any questions or suggestions.