Monday, October 15, 2018

Connect with the Knowledge Portal Network team at ASHG!

This week, the human genetics research community will come together in San Diego for one of the most important conferences of the year: the annual American Society of Human Genetics meeting. The Knowledge Portal Network team will be there, and in addition to presenting all the new data and features in the Type 2 Diabetes, Cerebrovascular Disease, and Cardiovascular Disease Knowledge Portals (KPs), we're launching an entirely new Portal: the Sleep Disorder Knowledge Portal, for the genetics of sleep and circadian traits.

We'll also present an interactive workshop on Friday that will go over the basics of navigating the Knowledge Portal Network. Download the flyer here, and find more details below.

Here's the schedule of events for the week:

Tuesday, October 16
2:05-2:30 pm: Jason Flannick will present a talk, "Infrastructure for analyzing and disseminating large-scale genetic data for type 2 diabetes and other complex diseases," in the ASHG/IGES/ISCB Joint Symposium.
Room 6C - Upper Level/San Diego Convention Center

Wednesday, October 17
The Knowledge Portal team will be at our booth, #219, in the exhibit hall from 10am-4:30pm.
We'll also be at the Broad Institute Genomic Services booth, #1634, from 10:30-11:30am.
At 2:30pm, Richa Saxena, the P.I. for the Sleep Disorder Knowledge Portal, will be at our booth to talk about the SDKP.

Thursday, October 18
The team will again be at our booth, #219, in the exhibit hall from 10am-4:30pm.

Friday, October 19
We'll again be at our booth, #219, in the exhibit hall from 10am-4:30pm, but today the booth will be closed around lunchtime so that we can present a special tutorial session on the Knowledge Portals. See details and sign up below. After the session, we'll be back at our booth until 4:30pm and will also be at the Broad Institute Genomic Services booth, #1634, from 2:30-3:30pm.

At lunchtime on Friday, grab your laptop and come to a workshop on the Knowledge Portals:

Navigating complex disease genetics: using the Knowledge Portal Network to move from SNPs to functional insights
Room 28C, Upper Level, San Diego Convention Center

We'll go over some basics, illustrate workflows, and answer questions about how you can use KPs to investigate SNPs, genes, or regions of interest and turn genetic data into insights about complex diseases.

Please sign up so we can plan for refreshments. We'll send you a reminder a few days beforehand. We look forward to seeing you there! Please contact us with any questions or suggestions for topics you'd like to discuss.

Monday, October 8, 2018

DIAMANTE GWAS dataset adds close to a million samples along with fine-mapping to the T2DKP

In a groundbreaking paper published today, Anubha Mahajan and colleagues (Mahajan et al., Nature Genetics 2018) report on a meta-analysis of unprecedented size for genetic associations with type 2 diabetes (T2D) along with fine-mapping analyses to identify causal variants that can suggest new therapeutic targets. We are pleased to provide access to the summary results as well as the results of the fine-mapping today in the T2D Knowledge Portal (T2DKP).

Working as part of the DIAGRAM (DIAbetes Genetics Replication And Meta-analysis) and DIAMANTE (DIAbetes Meta-ANalysis of Trans-Ethnic association studies) consortia, the researchers aggregated and meta-analyzed genome-wide association studies for about 900,000 individuals of European ancestry (about 74,000 T2D cases and 824,000 controls). The studies were imputed using the most comprehensive reference panels possible, and in all, the analysis considered about 27 million genotyped or imputed variants.

Association analysis for T2D (both unadjusted and adjusted for body mass index) identified 243 loci associated with T2D at genome-wide significance (p-value for association ≤ 5 x 10^-8). Of these, 135 were novel, that is, not detected in any previous T2D association analysis.
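To make the significance cutoff concrete, here is a minimal sketch of applying the threshold to a list of association results. The variant names and p-values are hypothetical placeholders, not results from the study, and the function name is ours rather than anything in the T2DKP.

```python
# Conventional genome-wide significance cutoff, as quoted in the post.
GENOME_WIDE_SIGNIFICANCE = 5e-8


def is_genome_wide_significant(p_value, threshold=GENOME_WIDE_SIGNIFICANCE):
    """Return True when a p-value is at or below the threshold."""
    return p_value <= threshold


# Hypothetical association records, for illustration only.
associations = [
    {"variant": "variant_A", "p_value": 3.2e-12},
    {"variant": "variant_B", "p_value": 5e-8},
    {"variant": "variant_C", "p_value": 1.4e-5},
]

# Keep only the variants that reach genome-wide significance.
significant = [
    record["variant"]
    for record in associations
    if is_genome_wide_significant(record["p_value"])
]
print(significant)
```

Because the comparison is "less than or equal to", a p-value of exactly 5 x 10^-8 counts as significant in this sketch.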

Within these loci, each of which included multiple significantly associated variants, the researchers performed approximate conditional analysis to determine whether the associations were independent of each other. They found surprising complexity within some loci; for example, the well-known TCF7L2 locus appears to include as many as 8 distinct association signals!

All of the T2D associations from this study may be viewed in the T2DKP. They are represented in two datasets, named "DIAMANTE (European) T2D GWAS" and "UK Biobank T2D GWAS (DIAMANTE-Europeans Sept 2018)." Manhattan plots showing the distribution of the associations across the genome may be seen by selecting either the "Type 2 diabetes" or "Type 2 diabetes adj BMI" phenotypes from the phenotype selection menu on the T2DKP home page. On Gene pages of the T2DKP, the results may be viewed in tables of variant associations and in the interactive LocusZoom visualization (see below). Results from this study are also displayed on Variant pages of the T2DKP.

LocusZoom plot on the PPARG Gene page

The credible set analysis performed in this study is also incorporated into the T2DKP. On the "Credible sets" tab of Gene pages, you may choose to visualize any of the credible sets available for the region. Epigenomic annotations that overlap the positions of the variants in the credible set are presented in an interactive display that allows you to select particular chromatin states or tissues to view. In the example shown below, one of the credible sets in the TCF7L2 region includes just two variants, and the one with the highest posterior probability overlaps active enhancer regions in adipose and liver tissue--both of which are important for T2D.

Detail of the Credible sets tab of the TCF7L2 Gene page
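For readers unfamiliar with credible sets, the sketch below shows one common way to build one from per-variant posterior probabilities: rank the variants and add them until their cumulative probability reaches a chosen coverage. This is an illustration under our own assumptions, not a description of the method used in the paper; the 99% coverage, the function name, and the example probabilities are all hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest set of top-ranked variants whose posterior
    probabilities sum to at least `coverage`.

    `posteriors` maps variant name -> posterior probability of being causal.
    """
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected = []
    cumulative = 0.0
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected


# Hypothetical posterior probabilities for one association signal.
example = {
    "variant_1": 0.90,
    "variant_2": 0.095,
    "variant_3": 0.004,
    "variant_4": 0.001,
}
print(credible_set(example))
```

A small credible set, like the two-variant example in this sketch, means that most of the posterior probability is concentrated on very few candidate variants.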

The multiple causal variants identified in this study support previous investigations on the biological mechanisms behind T2D and suggest new hypotheses that will likely lead to therapeutic insights. After reading the paper and a blog post from the authors, we invite you to explore the results in the T2DKP and to contact us with any suggestions or questions!

Wednesday, September 26, 2018

New datasets and many new phenotypes in the T2DKP

Today we release several new datasets, including associations for many new phenotypes and individual-level data for secure interactive analysis, to the Type 2 Diabetes Knowledge Portal.

The AAGILE GWAS dataset, from the African American Glucose and Insulin Genetic Epidemiology (AAGILE) Consortium, brings more diversity of ancestry to the T2DKP, with meta-analysis of fasting glucose and BMI-adjusted fasting insulin associations from over 20,000 African American individuals. These results were combined with associations for over 57,000 individuals of European ancestry from the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) in a trans-ethnic meta-analysis.

This release also adds two new diabetic kidney disease datasets from the SUMMIT (SUrrogate markers for Micro- and Macro-vascular hard endpoints for Innovative diabetes Tools) consortium. All of the more than 40,000 subjects in the "Diabetic Kidney Disease GWAS: subjects with T1D or T2D" dataset had either type 1 or type 2 diabetes. The study measured seven different renal phenotypes in these subjects, including four that are new to the T2DKP. Summary association results are available for the entire group and for sub-cohorts that separate T1D from T2D and European from Asian ancestry. A separate dataset from SUMMIT, "Diabetic Kidney Disease GWAS: subjects with T1D or T2D, ESRD vs. controls," comprises more than 5,600 diabetics, nearly 1,200 of whom had end-stage renal disease. These two datasets greatly expand the range of diabetic complications for which genetic association data are available in the T2DKP.

The T2DKP is federated, meaning that in addition to the Data Coordinating Center at the Broad Institute, some results are drawn from a sister site at the European Bioinformatics Institute (EMBL-EBI). This system allows data that may not leave Europe to be represented in the T2DKP. Six of the new datasets in this release are housed at the T2DKP Federated Node at EMBL-EBI.

The Hoorn Diabetes Care System (DCS) dataset includes associations for 12 different anthropometric, blood lipid, blood pressure, and liver and kidney function measures for a cohort of over 3,400 type 2 diabetics in the Netherlands.

The GoDarts project (Genetics of Diabetes Audit and Research in Tayside Scotland) recruits type 2 diabetics and matching controls in the Tayside region of Scotland. This release includes five new datasets from GoDarts, representing experiments performed using different arrays. Each experiment determined genetic associations for a wide variety of phenotypes, including two that are new to the T2DKP: levels of adiponectin and leptin, hormones that are associated with risk of T2D and obesity.

Results from all of these datasets may be searched using the Variant Finder tool and may be browsed:

• On Gene Pages in the Common variants and High-impact variants tables and in LocusZoom plots;

• On Variant Pages in the Associations at a glance section, the Associations across all datasets section, and in LocusZoom plots;

• From the View full genetic association results for a phenotype search on the home page: first select a phenotype, then select a dataset on the resulting page.

Individual-level data from the Hoorn DCS and GoDarts datasets also power secure interactive analyses using the Genetic Association Interactive Tool (GAIT) on Variant Pages. With the new additional data, nearly 61,000 individual-level samples are now available for custom association analysis.

Please take a look at the new results and contact us any time with questions or suggestions!

Tuesday, August 14, 2018

Sign up for a hands-on tutorial session on the Knowledge Portals

Are you attending the American Society of Human Genetics meeting in October? If so, save your Friday lunch break for a tutorial session on the Knowledge Portals!

Navigating complex disease genetics: using the Knowledge Portal Network to move from SNPs to functional insights
12:30pm - 1:45pm
Friday, October 19
San Diego Convention Center
Room 28C, Upper Level

Bring your laptop and your questions about the Type 2 Diabetes, Cerebrovascular Disease, or Cardiovascular Disease Knowledge Portals (KPs). We'll go over some basics, illustrate workflows, and answer questions about how you can use KPs to investigate SNPs, genes, or regions of interest and turn genetic data into insights about complex diseases.

Please sign up so we can plan for refreshments. We'll send you a reminder a few days beforehand. We look forward to seeing you there! Please contact us with any questions or suggestions for topics you'd like to discuss.

Friday, June 22, 2018

New data release brings new phenotypes and huge sample sizes to the T2DKP

Progressing towards the goal of the Accelerating Medicines Partnership in Type 2 Diabetes (AMP T2D) to aggregate, analyze, and present comprehensive genetic data relevant to T2D in order to speed up the validation of new drug targets, today we release 10 new datasets to the Type 2 Diabetes Knowledge Portal. These datasets contain variant associations for 17 phenotypes, including 7 that are new to the T2DKP, from over 1.4 million samples.

Four of the new datasets were generated by collaborators in AMP T2D, the parent organization of the T2DKP. AMP T2D is a pre-competitive partnership among the National Institutes of Health, industry, and not-for-profit organizations, managed by the Foundation for the National Institutes of Health. It supports the generation of genetic association data and many other kinds of genomic data, and it provides access to these data in the T2DKP to facilitate their translation into biological knowledge about T2D.

For all four of these datasets, quality control and association analysis were performed by the Analysis Team of the AMP Data Coordinating Center (AMP DCC) at the Broad Institute, using standard, state-of-the-art methods. These processes are completely transparent and fully documented: the experimental design and analysis are summarized on our Data page, and detailed reports are available for download. In this first phase of analysis, associations were determined for type 2 diabetes, fasting glucose levels, and fasting insulin levels--both unadjusted, and adjusted for body mass index. Future analyses will add more phenotypes.

One of these datasets, Diabetic Cohort - Singapore Prospective Study GWAS, was contributed by collaborators at the National University of Singapore. Consisting of 3,864 samples, it is a T2D case-control study to identify genetic and environmental risk factors for diabetes in Singapore Chinese. The other three new sets that were analyzed at the AMP DCC, contributed by collaborators at the University of Michigan, are from the Finland-United States Investigation of NIDDM Genetics (FUSION) Study, which seeks to map and identify genetic variants that predispose to type 2 diabetes or affect variability in diabetes-related traits. The three FUSION datasets are FUSION GWAS, with 1,681 samples; FUSION Metabochip, with 2,163 samples; and FUSION exome chip analysis, with 3,485 samples.

All four of these datasets now have “Early Access Phase 1” status, which is assigned to new data. This status denotes that although analysis and quality control checks have been performed, the data are not yet considered to be in their final state. During the early access period, users may analyze the data but may not submit the results of these analyses for publication. Find the full details about the different phases of data release on our Policies page.

In addition to the datasets from AMP T2D partners, we have also added or updated 6 new sets of publicly available association summary statistics for phenotypes relevant to T2D:

  • The previous CKDGen GWAS dataset for chronic kidney disease has been replaced with a newer study from the CKDGen consortium, imputed to the 1000 Genomes reference set (Gorski et al., 2017), with 110,517 samples;
  • Early Growth Genetics Consortium GWAS associations for childhood obesity (Bradfield et al., 2012), with 13,848 samples;
  • Body fat distribution associations (Shungin et al., 2015), with 245,749 samples, have been added to the existing GIANT GWAS dataset;

Results from all the new datasets may be viewed at these locations in the T2D Knowledge Portal:

• On Gene Pages (e.g., GCKR) in the Common variants and High-impact variants tables and in LocusZoom plots;

• On Variant Pages (e.g., rs1260326) in the Associations at a glance section, the Association statistics across traits table, and in LocusZoom static plots;

• From the View full genetic association results for a phenotype search on the home page: select a phenotype and view the top variants in a Manhattan plot and table;

• Using the Variant Finder tool: specify multiple criteria and retrieve the set of variants meeting those criteria from any of these datasets.

Additionally, individual-level data from the Diabetic Cohort - Singapore Prospective Study GWAS and FUSION GWAS datasets are available for secure custom interactive analyses using these tools in the T2DKP:

• Using the Genetic Association Interactive Tool (GAIT) on Variant Pages, you may choose a phenotype for association analysis, choose custom covariates, filter the sample pool by specifying a range of values for one or more phenotypes, then run on-the-fly analysis.

• Dynamic LocusZoom plots on Gene and Variant pages allow you to run association analysis using one or more variants of your choice as covariates, in order to test whether associations are independent.

With today's release, the T2DKP includes genetic associations for 68 phenotypes from a total of 35 datasets. We welcome submissions of new datasets for incorporation into the T2DKP. Find information about collaboration here, and please contact us with questions.

Monday, June 18, 2018

See you at ADA!

The 78th Scientific Sessions of the American Diabetes Association are coming up in just a few days, and the T2D Knowledge Portal team will be there!

As usual, we'll have a booth in the exhibit hall. We'll be at booth #1075 from 10am to 4pm on Saturday and Sunday 6/23-24, and from 10am to 2pm on Monday 6/25. Come say hello, get a demonstration of the T2D, Cardiovascular Disease, or Cerebrovascular Disease Knowledge Portals, and pick up some of the T2DKP sticky notes that we'll be giving away!

Here's who you might find at the booth when you stop by:

There will also be presentations from several members of our group on Saturday, June 23:
  • Jason Flannick, PhD will give a talk on "The Type 2 Diabetes Knowledge Portal" at 11:30am.
Session: Quantifying Diabetes: Genomics, Electronic Health Records, and Automated Control
Location: W312
  • Jose C. Florez, MD, PhD, will moderate an interactive poster session, "Delving into Type 2 Diabetes Genetics", at 12:30 pm.
Location: Poster hall
  • Miriam Udler, MD, PhD will present "Genetic testing for Monogenic Diabetes--Whom to Test, What and How to Order?" at 2:15pm.
Session: Monogenic Diabetes Testing is Ready for Prime Time--Integrating Genetics into Your Practice
Location: W304E-H

We hope to meet you in Orlando!

Friday, June 1, 2018

New T2DKP features help distill knowledge from data

We are pleased to announce four new features in the Type 2 Diabetes Knowledge Portal that simplify the interpretation of genetic association data, making it easier to pinpoint variants and datasets that are informative for a disease or phenotype of interest.

"Clumping" variants by linkage disequilibrium

The first step in getting an overview of the results of a particular experiment is typically to plot variant associations vs. chromosomal location, in a so-called "Manhattan plot." These plots are available from the T2DKP home page after choosing a phenotype from the list:

After selecting a phenotype, you may select a dataset, and the Manhattan plot is displayed above a table of the top variants:

Now, in addition to selecting a dataset to view associations, you may select a threshold for linkage disequilibrium (LD) in order to reduce the number of linked variants that represent a single association signal. For example, without "clumping" variants by LD (r2 = 1), when viewing the DIAGRAM 1000G GWAS dataset there are 70 significantly associated variants in the IGFBP2 gene; but setting the most stringent LD threshold (r2 = 0.1) reduces that number to just 8 variants by displaying only the most significant associations after clumping variants by LD. Intermediate LD thresholds of r2 = 0.2, 0.4, 0.6, or 0.8 may also be set, allowing more versatility in this analysis.
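The idea behind clumping can be sketched as a simple greedy procedure: walk through variants from most to least significant, and keep a variant only if it is not in strong LD with one already kept. The code below is an illustrative sketch under our own assumptions, not the Portal's implementation. The function name, the variant names, and the r2 values are hypothetical; missing pairs are treated as unlinked; and we use a strict "greater than" comparison so that a threshold of 1 removes nothing, matching the "no clumping" setting described above.

```python
def clump_by_ld(p_values, r2, threshold):
    """Greedy LD clumping.

    p_values:  dict mapping variant -> association p-value
    r2:        dict mapping (variant, variant) pairs -> pairwise r2
    threshold: variants with r2 above this value to a kept variant are dropped
    """
    def ld(a, b):
        # Look up the pair in either order; unknown pairs count as unlinked.
        return r2.get((a, b), r2.get((b, a), 0.0))

    index_variants = []
    # Visit variants from most to least significant.
    for variant in sorted(p_values, key=p_values.get):
        if all(ld(variant, kept) <= threshold for kept in index_variants):
            index_variants.append(variant)
    return index_variants


# Hypothetical variants in one region.
p_values = {"v1": 1e-20, "v2": 1e-15, "v3": 1e-10, "v4": 1e-9}
r2 = {("v1", "v2"): 0.9, ("v1", "v3"): 0.3, ("v3", "v4"): 0.05}

print(clump_by_ld(p_values, r2, threshold=1.0))  # no clumping
print(clump_by_ld(p_values, r2, threshold=0.8))
print(clump_by_ld(p_values, r2, threshold=0.1))  # most stringent
```

Lower thresholds merge more variants into each signal, which is why the most stringent setting leaves the shortest list.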

New Region page

The Gene page of the T2DKP (see an example) integrates and summarizes information about the associations of variants across the region of a gene. Now, you can see this integration and summation for any region of the genome, not just the areas surrounding protein-coding genes. Simply enter a chromosome and coordinates in the home page search box:

The resulting page resembles a Gene page. The traffic light integrates all associations across the region to give you an immediate indication of whether there are significant associations found in any of the datasets in the T2DKP. Further down the page, tools and displays let you drill down to the specifics for a phenotype or variant of interest. This new Region page provides a way to explore any part of the genome in great detail.

PheWAS graphic on the Variant page

Previously, the Variant page of the T2DKP displayed significant associations for each variant in a graphic that showed a color-coded box for each phenotype-dataset combination. But the rapidly increasing number of phenotypes becoming available from biobank studies has made this view unsustainably large. In its place, we have incorporated a phenome-wide association study (PheWAS) visualization developed at the University of Michigan. The graphic shows at a glance which phenotype associations are most significant for a particular variant. Mouse over a point to see more details.

All Associations graphic on the Variant page

The PheWAS graphic distills variant associations in order to highlight the most significant ones. But suppose you want to drill down to the details and explore associations in every dataset, viewing parameters like sample size, odds ratio, and more? There's a graphic for that too: our new All Associations interactive graphic, located in the "Associations across all datasets" section of the Variant page. Start by using keywords to filter phenotypes. Filtering allows you to view one specific phenotype, several related phenotypes, or phenotypes in a broad category, such as glycemic phenotypes; both the graphic and the table below it change in response to phenotype filtering. There are also options to filter by setting ranges of p-values and/or sample sizes.

The graph plots p-value (vertical axis) vs. dataset sample size (horizontal axis) for each association. Points in the graph are triangular; whether the triangle points up or down indicates a positive or negative direction of effect, respectively. Mousing over a point shows you more details about the association and the dataset. This graphic can help you evaluate whether an association is likely to be real. As shown in the illustration below, a genuine signal should increase in significance (i.e., decrease in p-value) with increasing sample size.
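The same reasoning can be expressed as a rough numeric check. The sketch below compares every pair of datasets and reports the fraction of pairs in which the larger dataset has the smaller p-value. It is only a heuristic of our own for illustration, not a statistical test and not a feature of the T2DKP; the function name and the example numbers are hypothetical.

```python
from itertools import combinations


def concordant_fraction(associations):
    """Fraction of dataset pairs where the larger sample has the smaller p-value.

    associations: list of (sample_size, p_value) tuples for one variant
                  and one phenotype across several datasets.
    Returns None when no pair of datasets differs in sample size.
    """
    concordant = 0
    total = 0
    for (n1, p1), (n2, p2) in combinations(associations, 2):
        if n1 == n2:
            continue  # equal sample sizes carry no information here
        total += 1
        p_large_n, p_small_n = (p1, p2) if n1 > n2 else (p2, p1)
        if p_large_n < p_small_n:
            concordant += 1
    return concordant / total if total else None


# Hypothetical examples: (sample size, p-value) per dataset.
consistent_signal = [(5_000, 1e-3), (20_000, 1e-6), (100_000, 1e-15)]
inconsistent_signal = [(5_000, 1e-6), (20_000, 1e-2), (100_000, 0.3)]

print(concordant_fraction(consistent_signal))
print(concordant_fraction(inconsistent_signal))
```

A value near 1 matches the pattern expected of a genuine signal, while a value near 0 suggests that the association weakens as more samples are added.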

Stay in touch!

Like the rest of the T2DKP, these features are under continuous development. Please give them a try and let us know what you think.