Thursday, November 17, 2016

Collaborate with us!

One of the goals of the Type 2 Diabetes Knowledge Portal project is to bring together the world-wide T2D and genetics research communities to share data, knowledge, methods, and tools. In keeping with that goal, we welcome contributions of data to the Portal and we are also open to collaboration as we develop new and better ways to analyze and display data.

We’ve added a new page to the Portal, "Collaborate," that answers frequently asked questions about how to get involved. It includes links to our Data Submitter’s Guide and Data Transfer Agreement, gives an overview of the kinds of data we’re looking for, and tells you how to get in touch with our team.

The “Collaborate” page also links to information about funding opportunities offered by the Foundation for the NIH. Check this out if you’re interested in starting a new project to generate data for the Portal!

Monday, November 7, 2016

New MGH Cardiology and Metabolic Patient Cohort data in the T2D Knowledge Portal

We are pleased to announce a new data set in the T2D Knowledge Portal, from the MGH Cardiology and Metabolic Patient Cohort (CAMP). These data were contributed by Pfizer, Inc. as part of a public-private partnership to generate genotype data for a cardiometabolic and prediabetic cohort. This data set adds individual-level genetic association data for type 2 diabetes (T2D), fasting glucose levels, and fasting insulin levels from more than 3,500 samples to the Portal knowledgebase. Association data for additional phenotypes from this cohort will be incorporated in the future.

The inclusion of this data set in the T2D Knowledge Portal illustrates the uniqueness of the Accelerating Medicines Partnership, which brings together pharmaceutical companies and non-profit institutions with the goal of speeding up the discovery of new targets for treatment of T2D. The pharmaceutical partners in this collaboration have committed not only to providing funding, but also to sharing the data they generate. The CAMP data set contributed by Pfizer is the first set from a pharmaceutical partner to be made available in the Portal.

Another unique aspect of this data set is that it is the first to be included in the Portal with “Early Access Phase 1” status, which is assigned to new data. This status denotes that although analysis and quality control checks have been performed, the data are not yet considered to be in their final state. During the early access period, users may analyze the data but may not submit the results of these analyses for publication. Find the full details about the different phases of data release on our Policies page.

The CAMP cohort consists of 3,857 subjects who were recruited at the Massachusetts General Hospital Heart Center between 2008 and 2012. In addition to genotyping, the subjects had either vascular reactivity measurements (for T2D patients) or an oral glucose tolerance test (for patients not known to have T2D), and samples of their plasma and serum were analyzed. Most of the subjects were of European ancestry; about 10% were African American.

The analysis and quality control processes for this data set were performed by the Analysis Team of the Accelerating Medicines Partnership Data Coordinating Center (AMP-DCC) at the Broad Institute, and are completely transparent and fully documented. The experiment design and analysis are summarized on our Data page, and detailed reports are available for download. Going forward, all new data sets added to the Portal will be fully documented in this manner.

One intriguing—and somewhat puzzling—result from the analysis highlights the utility of incorporating data sets like this one into the Portal. The variant most strongly associated with T2D (at genome-wide significance) in this set is located in the major histocompatibility complex region near the HLA-C gene.

Known associations of genes in this region with type 1 diabetes, along with a high local recombination rate, make it challenging to interpret the meaning of this association. However, it certainly merits further investigation because of its genome-wide significance. The inclusion of this data set in the Portal, in the context of all other available data about T2D associations in the region, greatly facilitates the further analysis of this and other associations in the set.

The CAMP data may be accessed via multiple interfaces in the Portal. They are shown in tables of summary statistics and accessible in variant searches using the Variant Finder. Importantly, since the data are individual-level, samples may be filtered by various parameters and used for custom association analysis in our Genetic Association Interactive Tool (GAIT).

Find CAMP data at all of these locations in the Portal:

  • On Gene Pages (e.g., HLA-C) in the Variants & Associations table.
  • On Variant Pages (e.g., rs9468919) in the Associations at a glance section and in the Association statistics across traits table.
  • Via the Variant Finder tool, for the phenotypes T2D, fasting glucose, and fasting insulin.
  • Via the Genetic Association Interactive Tool (GAIT), which enables custom association analysis either for single variants (available on Variant Pages) or for the set of variants in and near a gene (Interactive burden test, available on Gene Pages).

Tuesday, November 1, 2016

View ASHG posters from the T2D Knowledge Portal team

Did you miss the American Society of Human Genetics 2016 Annual Meeting last month in Vancouver? Or did you attend, but weren’t able to get to our posters among the hundreds that were there? 


Now you can catch up on everything you missed from the Portal team. We’ve uploaded our posters to the open access publishing platform F1000Research, where you can view or download them. The Portal team presented four posters:


1. Automated, scalable quality control of heterogeneous exome sequence data. This poster, presented by Ryan Koesterer, a member of the Analysis Team of the Accelerating Medicines Partnership Data Coordinating Center (AMP-DCC), describes a new, scalable quality control method for exome sequence data, which is applied to data before they are incorporated into the Portal.

Citation: Koesterer R, von Grotthuss M, Flannick J et al. Automated, scalable quality control of heterogeneous exome sequence data [v1; not peer reviewed]. F1000Research 2016, 5:2609 (poster) (doi: 10.7490/f1000research.1113354.1)


2. The Type 2 Diabetes Knowledge Portal: a paradigm for the democratization of human genetic information. This poster, from Portal content and community manager Maria Costanzo, presents an introduction to the Portal: its purpose, its content, and what kinds of questions it allows you to ask.


Citation: Costanzo MC and Accelerating Medicines Partnership: Type 2 Diabetes. The type 2 diabetes knowledge portal: a paradigm for the democratization of human genetic information [v1; not peer reviewed]. F1000Research 2016, 5:2607 (poster) (doi: 10.7490/f1000research.1113352.1)


3. A software platform facilitating community analyses of genetic datasets for complex disease. This poster from Benjamin Alexander, on the Portal software engineering team, describes the tools in the Portal that allow you to do both forward and reverse genetic analysis and even perform custom association analysis.

Citation: Alexander B, Duby M, Sanders M et al. A software platform facilitating community analyses of genetic datasets for complex disease [v1; not peer reviewed]. F1000Research 2016, 5:2608 (poster) (doi: 10.7490/f1000research.1113353.1)


4. Mapping variants to amino-acid changes in three-dimensional protein space improves aggregate association test power and suggests mechanisms of action. This poster, presented by Portal computational biologist Marcin von Grotthuss, illustrates a new method for evaluating the significance of variants by considering the protein structural context of the amino acids they encode. A long-term goal is to incorporate this analysis into the Portal.


Citation: von Grotthuss M, Florez JC, Flannick J et al. Mapping variants to amino-acid changes in three-dimensional protein space improves aggregate association test power and suggests mechanisms of action [v1; not peer reviewed]. F1000Research 2016, 5:2610 (poster) (doi: 10.7490/f1000research.1113355.1)


We hope you find these posters informative! Please let us know if you have any questions or suggestions.

Tuesday, October 25, 2016

Design your own association analysis with our Genetic Association Interactive Tool (GAIT)

Genetic association analysis—identifying polymorphisms in the human genome that are correlated with altered risk of disease—is a powerful method for discovering disease mechanisms. These polymorphisms can indicate what goes wrong at the cellular level in the disease process, knowledge that is critically important for developing better diagnostics and therapies.

The Type 2 Diabetes Knowledge Portal offers a wealth of pre-calculated information on genetic associations between variants and type 2 diabetes (T2D) or other related traits. These results are computed using broadly defined groups of samples: either an entire sample set from a project, or ancestry-specific cohorts. Although this approach generates very valuable results, it can mask effects that are detectable only in more narrowly defined groups: for example, individuals within a certain range of age, body mass index, or cholesterol level.

Until now, analysis of such fine-grained subsets of individual-level data has only been possible for expert geneticists with access to protected data. But our new Genetic Association Interactive Tool (GAIT) offers everyone an unprecedented amount of access to individual-level data along with an easy-to-use interface for analyzing genetic associations using custom subsets of samples and variants.

Two versions of GAIT are available in the Portal. One, on Variant pages (see an example), computes association statistics for the single variant featured on that page. The other, accessible on Gene pages (see an example), powers an interactive burden test that considers the collection of variants in or near a gene, or a selected subset of those variants.

Where to find GAIT on Gene pages (left) and Variant pages (right)


The GAIT interface offers incredible flexibility for designing custom analyses. In the interactive burden test, you can filter variants by their predicted effects, or pick and choose individual variants to include. When creating sample sets for either single-variant association analysis or a gene burden test, you can specify a gender, set ranges for the values for multiple phenotypes, and choose principal components or phenotypes to use as covariates. And all these parameters may be set differently for different ethnic groups.

The GAIT interface displays phenotype values within the sample set and allows you to filter samples by multiple criteria


Once you set parameters of your choice, GAIT computes associations on the fly, based on individual-level data. To protect patient confidentiality, GAIT will not display results from sample sets consisting of fewer than 100 individuals.
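To make the workflow above concrete, here is a minimal Python sketch of the general idea: filter samples by phenotype ranges, refuse to report on subsets smaller than 100 individuals, and otherwise compute an association. This is only an illustration and not GAIT's actual implementation. The `Sample` fields, the function names, and the use of a simple chi-square test on carrier status (rather than a covariate-adjusted model) are all assumptions made for the example; only the 100-sample threshold comes from the description above.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Threshold stated in the post; every other name in this sketch is hypothetical.
MIN_SAMPLES = 100


@dataclass
class Sample:
    is_case: bool   # True for a T2D case, False for a control
    carrier: bool   # True if the individual carries the variant of interest
    age: float
    bmi: float


def filter_samples(
    samples: List[Sample],
    min_age: Optional[float] = None,
    max_age: Optional[float] = None,
    min_bmi: Optional[float] = None,
    max_bmi: Optional[float] = None,
) -> List[Sample]:
    """Keep only samples whose age and BMI fall inside the given ranges."""

    def in_range(value: float, low: Optional[float], high: Optional[float]) -> bool:
        return (low is None or value >= low) and (high is None or value <= high)

    return [
        s
        for s in samples
        if in_range(s.age, min_age, max_age) and in_range(s.bmi, min_bmi, max_bmi)
    ]


def carrier_association(samples: List[Sample]) -> Optional[Tuple[float, float]]:
    """Return (odds_ratio, p_value), or None if the subset is too small to report.

    Uses a 1-degree-of-freedom chi-square test on the 2x2 table of
    case/control status versus carrier status. This is a stand-in for
    whatever statistical model a real tool would use.
    """
    if len(samples) < MIN_SAMPLES:
        return None  # suppress results for small subsets

    a = sum(1 for s in samples if s.is_case and s.carrier)          # carrier cases
    b = sum(1 for s in samples if s.is_case and not s.carrier)      # non-carrier cases
    c = sum(1 for s in samples if not s.is_case and s.carrier)      # carrier controls
    d = sum(1 for s in samples if not s.is_case and not s.carrier)  # non-carrier controls

    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("subset needs both cases and controls, carriers and non-carriers")

    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Adding 0.5 to each cell (Haldane correction) avoids division by zero.
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return odds_ratio, p_value
```

For instance, `carrier_association(filter_samples(samples, min_age=40, max_bmi=30))` would return a result only if at least 100 samples pass the filters. Returning `None` rather than raising an error for small subsets keeps "too few samples to show" distinct from a genuine failure.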

To help you get familiar with this versatile tool, we’ve created a User Guide (download PDF) that summarizes all the details of the interface. Please give GAIT a try and let us know what you think!



Tuesday, October 18, 2016

New and updated data in the T2D Knowledge Portal

As members of the T2D Knowledge Portal team arrive in Vancouver for the American Society of Human Genetics meeting, we are pleased to announce that we have added a new data set to the Portal and made extensive updates to existing data sets. 

The new data set, named “CAMP GWAS” in the Portal, comes from the MGH Cardiology and Metabolic Patient Cohort (CAMP). These data were contributed by Pfizer, Inc. as part of a public-private partnership to generate genotype data for a cardiometabolic and prediabetic cohort, and were analyzed by the Analysis Team of the Accelerating Medicines Partnership Data Coordinating Center (AMP-DCC) at the Broad Institute. The set adds individual-level genetic association data for type 2 diabetes (T2D), fasting glucose levels, and fasting insulin levels from nearly 3,500 samples to the Portal knowledgebase, and association data for more phenotypes will be added in the future.

CAMP data may be accessed on Gene and Variant pages in the Portal and via the Variant Finder, and may also be filtered and queried using the Genetic Association Interactive Tool (GAIT).

Several other data sets in the Portal have been updated and improved:
  • The size of the CARDIoGRAM GWAS data set has nearly doubled, now consisting of 184,305 samples, and the data analysis has been updated.
  • The size of the CKDGen GWAS data set has also nearly doubled, to 133,814 samples, and the data analysis has been updated. New subsets have been added that stratify serum creatinine associations by African American ancestry and stratify both serum creatinine and urinary albumin-to-creatinine ratio by the presence or absence of T2D.
  • The data set previously named “DIAGRAM GWAS” in the Portal has been updated and renamed “DIAGRAM Trans-ethnic meta-analysis”; its sample size has increased to 149,821. Several new subsets have been added, including gender-stratified, MetaboChip, and fine mapping data.
  • The GIANT GWAS data have been updated and European cohorts have been added for BMI and height traits.
  • The GLGC GWAS data set has increased in size to 188,577 samples and has been updated.
  • The number of samples in the MAGIC GWAS data set has more than doubled, to 133,010; the data have been updated, and associations with 2-hour glucose, fasting glucose, and fasting insulin have been added for MetaboChip data.
Full details about all of these data sets are available on our Data page.

Because of compatibility issues with the updated data, we have temporarily removed the “GWAS results summary” section from Gene pages of the Portal. This feature will be restored within the next week.

As always with major updates, issues or bugs may have been introduced and we may not have found all of them during our routine testing. We encourage you to let us know of any problems that you encounter in using the Portal, and we welcome your questions and suggestions.

Friday, October 14, 2016

See you at ASHG 2016!

Members of the Type 2 Diabetes Knowledge Portal team will be attending the American Society of Human Genetics meeting next week in Vancouver, BC. You can catch us nearly every day of the meeting:

Tuesday 10/18

3 PM: Noël Burtt will be one of the speakers in an informational session on the T2D Knowledge Portal and new funding opportunities offered by the Foundation for the NIH. Complimentary snacks, beer, and wine will be served! Please pre-register here.

Wednesday 10/19

10 AM - 4 PM: Find us in the exhibit hall at booth #428. We’ll be there to answer your questions and give tours and tutorials on the Portal.

Thursday 10/20

10 AM - 4 PM: We will again be in the exhibit hall at booth #428.

2 - 3 PM: Ryan Koesterer will present his poster on an automated, scalable quality control method for genetic association data that improves on current “gold-standard” methods (program #1943T).

2 - 3 PM: Maria Costanzo will present her poster giving an overview of data in the Portal and the global collaborative efforts behind its aggregation (program #329T).

Friday 10/21

10 AM - 4 PM: This is our last day in the exhibit hall at booth #428.

2 - 3 PM: Marcin von Grotthuss will present his poster on improving predictions of significant variants by taking protein structure into account (program #489F).

3 - 4 PM: Ben Alexander will present his poster on the software platform that powers the T2D Knowledge Portal user interface and custom analysis tools (program #1650F).

T2D Knowledge Portal staff attending ASHG

We look forward to meeting you at ASHG! If you have questions and cannot meet us at any of these times, or if you won’t be at ASHG, our mailbox is always open at help@type2diabetesgenetics.org.

Monday, October 3, 2016

Come to a T2D Knowledge Portal information session at ASHG

The American Society of Human Genetics meeting is happening in Vancouver, B.C., in a little over two weeks! The Portal team will be presenting and exhibiting at multiple venues at ASHG, and the first event will take place immediately before the conference starts: an information session including an overview of the Accelerating Medicines Partnership in Type 2 Diabetes, a progress update on T2D Knowledge Portal functionality, and information on new funding opportunities. Complimentary hors d'oeuvres, beer, and wine will be served!

Information session
Tuesday, October 18, 2016
3:00 pm - 4:00 pm PDT
Fairmont Waterfront
900 Canada Place Way
Vancouver, British Columbia


Please register here for this free event, hosted by FNIH. Contact Nicole Spear at Nspear@fnih.org with any questions.

Watch this space over the next two weeks for a complete listing of opportunities to learn about the Portal and talk with the Portal team at ASHG!