Tuesday, October 8, 2019

Connect with the Knowledge Portal Network team at #ASHG19

Attending the American Society of Human Genetics Annual Meeting next week? We are too, and we look forward to connecting with you in multiple venues:

Wednesday, October 16
  • Visit our booth (#131) in the exhibit hall from 10am-4:30pm
  • Attend our Ancillary session, “Translating Variant Associations to Functional Insights Using the Knowledge Portal Network,” 12:45-2:00 pm, Marriott Marquis Houston, Tanglewood room. Jesse Engreitz and Jason Flannick will speak, with an introduction from Noël Burtt and follow-up from Maria Costanzo.
  • Attend our presentation at the Broad genomics booth (#714) from 3-4pm
  • Attend the talk by Lokendra Thakur, “Calculating principled gene priors for genetic association analysis.” 4:45-5pm, Room 317A, Level 3, Convention Center

Thursday, October 17
  • Visit our booth (#131) in the exhibit hall from 10am-4:30pm
  • Visit the poster (#1657/T) by Ben Alexander, “Systematic comparison of different evidence sources for predicting GWAS effector genes” from 2-3pm
  • Visit the poster (#1402/T) by Dylan Spalding, “Federating association analysis in type 2 diabetes to protect participants’ privacy” from 3-4pm

Friday, October 18
  • Visit our booth (#131) in the exhibit hall from 10am-3:30pm
  • Attend our presentation at the Broad genomics booth (#714) from 10-11am
  • Visit the poster (#2804/F) by Peter Dornbos, “The functional impacts of rare coding variants in 46,000 individuals on 23 quantitative phenotypes” from 2-3pm

Saturday, October 19
  • Attend the talk by Marcin von Grotthuss, “Public programmatic access to GWAS summary statistics and analytical methods.” 8:45-9am, Room 310A, Level 3, Convention Center

Tuesday, September 24, 2019

Learn about the Diabetes Epigenome Atlas at this week's webinar

Because integration with other data types can bring more meaning to genetic association data and spark insights into disease, we are working to develop connections between the Type 2 Diabetes Knowledge Portal (T2DKP) and the Diabetes Epigenome Atlas (DGA).  Learn more about this effort in our upcoming webinar at noon EDT on Thursday, September 26. We’ll demonstrate the current and planned connections between the T2D Knowledge Portal and DGA, and our guest speakers Kyle Gaulton and Parul Kudtarkar from DGA will provide an overview of the data and tools in the DGA resource.

This session may be attended as an online webinar (connection information below) or in person at the Broad Institute in the Cascades meeting room (11031) on the 11th floor of the 75 Ames St. building. We'll record the session and make it available on the T2DKP and on the Broad Institute YouTube channel for future viewing.

We hope you will attend and bring your questions and suggestions!

Future dates for our T2DKP Webinar Series:

Thursday, November 14, 2019
Thursday, January 16, 2020
Thursday, March 12, 2020

All events will take place at 12 noon Eastern time. We will send more details about each webinar as it approaches.

Connection Information:

Join Zoom Meeting

One tap mobile
+16468769923,,642344149# US (New York)
+16465588656,,642344149# US (New York)

Dial by your location
        +1 646 876 9923 US (New York)
        +1 646 558 8656 US (New York)

Meeting ID: 642 344 149

Wednesday, September 11, 2019

Learn about the T2D Knowledge Portal at #EASD2019

The Type 2 Diabetes Knowledge Portal team will be exhibiting next week at the European Association for the Study of Diabetes (EASD) conference in Barcelona. Please stop by our booth (M07) to get a hands-on tutorial and let us know which data and features would be most useful to your research.

We'll have team members there both from the T2DKP Federated Node at the European Bioinformatics Institute (EBI) and from the AMP T2D Data Coordinating Center (DCC) at the Broad Institute.  The Federated Node allows data that may not leave Europe due to privacy regulations to be queried remotely and securely via T2DKP tools and interfaces. Researchers anywhere in the world may browse and query data from either location, without even needing to know where the datasets reside. Stop by the booth and talk with us about adding your results to the T2DKP!

Thursday, September 5, 2019

T2DKP Newsletter available

A new edition of our periodic newsletter is now available. Download it here for the latest news about T2DKP data and features!

Wednesday, August 28, 2019

New T2DKP release brings new datasets and interfaces

Today's release of the Type 2 Diabetes Knowledge Portal includes many improvements:

  • updated and augmented results for 7 datasets, generated by re-analysis of individual-level data; 
  • multiple new datasets; 
  • a new interface displaying predicted trait- and disease-relevant tissues; 
  • new functionality in the custom burden test; 
  • and new video resources.

New genetic association results

Using the LoamStream genomic analysis pipeline, developed by the T2DKP team at the Accelerating Medicines Partnership in Type 2 Diabetes (AMP T2D) Data Coordinating Center (DCC) at the Broad Institute, we have re-analyzed several sets of individual-level genetic association data that were generated by AMP T2D collaborators:
  • BioMe AMP T2D GWAS
  • Diabetic Cohort - Singapore Prospective Study GWAS
  • FUSION exome chip analysis
  • FUSION Metabochip
The LoamStream software allowed T2DKP analysts to run these analyses rapidly, determining associations for many more phenotypes than were analyzed previously, and to use standard, state-of-the-art methods so that all of the results are comparable across datasets. Each dataset was analyzed for associations with glycemic, lipid, renal, anthropometric, and blood pressure phenotypes, and in addition, the type 2 diabetes cases in the BioMe AMP T2D GWAS set were analyzed for associations with three diabetic complications: chronic kidney disease, end-stage renal disease, and neuropathy. The LoamStream pipeline generates detailed Quality Control and Analysis reports, which may be downloaded from the dataset-specific sections of the T2DKP Data page.

Results from these re-analyses are integrated into the T2DKP and may be viewed on Gene and Variant pages and in Manhattan plots. They may be searched using the Variant Finder, and the individual-level data from the GWAS sets may be securely accessed for custom association analysis using the Genetic Association Interactive Tool (GAIT) on Variant pages.

We have also incorporated summary statistics from several studies into the T2DKP, including:
  • IVGTT-based Insulin Secretion GWAS (Wood et al. 2017): genetic associations for first-phase insulin secretion, as measured by intravenous glucose tolerance tests in over 5,500 multi-ethnic non-diabetic individuals. Several of the phenotypes measured are new to the T2DKP: acute insulin response, insulin secretion rate, and peak insulin response.
  • GIANT exome chip analysis (Turcot et al. 2018; Justice et al. 2019): genetic associations for BMI, height, and waist/hip ratio adjusted for BMI. BMI and height associations were determined in over 718,000 individuals, and waist/hip ratio in over 344,000.
  • Global Lipids Genetics Consortium exome chip analysis (Liu et al. 2017), with associations for plasma lipid levels in over 347,000 participants.
  • Chronic Inflammation GWAS (Ligthart et al. 2018): associations with plasma C-reactive protein, a measure of chronic inflammation, in more than 312,000 individuals.
  • COGENT-Kidney Consortium eGFR GWAS (Morris et al. 2019): associations with estimated glomerular filtration rate (eGFR) in over 204,000 subjects.

FOCUS on tissue enrichments

Although the genetic association results in the T2DKP identify sequence variants that are associated with the risk of developing T2D, it is rarely straightforward to identify the genes that are responsible for these associations, and in which tissues they act. Making connections between variants, effector genes, and tissues is essential for a better understanding of disease genes and pathways and for the development of new therapeutics.

To help researchers make these connections, we are assembling a toolkit of cutting-edge computational methods and applying them across all of the data in the T2DKP. The methods integrate GWAS data with transcriptomic data, tissue-specific gene expression results, eQTL data, and more to predict the probability of associations between variants, genes, phenotypes, and tissues. We present these results in interactive FOCUS (Find Orthogonal Computational Support) tables.

We previously added to the Gene page a Gene FOCUS table that presents results to help researchers evaluate candidate causal genes around a genetic association signal (read our blog post describing this interface). Now, we have added a Tissue FOCUS table presenting results that can suggest which tissues or cell types may be relevant for a disease or trait of interest. 

The table is currently accessible via a link on the home page.

To use the table, choose a phenotype of interest to see p-values for different tissues, denoting the significance with which variants associated with that phenotype are enriched in each tissue. The methods used for these predictions are DEPICT, GREGOR, and LD score regression (LDSR). Find complete details about the table and methods in our downloadable documentation.

Improvements to the custom burden test

Some sequence variants are multiallelic: that is, the reference nucleotide may be substituted by two or more different nucleotides or indels. Previously, in the custom burden test (found on T2DKP Gene pages), these variants were treated as a single allele. Now, the software underlying the custom burden test has been updated so that it treats multiple alleles separately, offering the ability to choose whether each allele of a multiallelic variant should be included in or excluded from the custom burden test. Stay tuned for other major improvements to the burden test, including the ability to use several different aggregation test methods, coming to the T2DKP in the near future!
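As a rough illustration of per-allele treatment, the sketch below splits one multiallelic site into a record per alternate allele and counts, for each sample, only the alleles chosen for inclusion. The `Variant` record, the allele keys, the sample data, and the simple carrier count are all assumptions made for illustration; this is not the T2DKP implementation, and a real burden test uses a proper aggregation statistic rather than a raw count.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Hypothetical key identifying one alternate allele: (chrom, pos, ref, alt)
AlleleKey = Tuple[str, int, str, str]


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one site; alts may hold several alternate alleles."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]


def split_alleles(variant: Variant) -> List[AlleleKey]:
    """Return one key per alternate allele, so each can be selected separately."""
    return [(variant.chrom, variant.pos, variant.ref, alt) for alt in variant.alts]


def burden_per_sample(
    carried: Dict[str, Set[AlleleKey]],
    included: Set[AlleleKey],
) -> Dict[str, int]:
    """Count, for each sample, how many of the included alleles it carries."""
    return {sample: len(alleles & included) for sample, alleles in carried.items()}


# Invented example: one site with two alternate alleles (C>T and C>A).
site = Variant("1", 1000, "C", ("T", "A"))
allele_t, allele_a = split_alleles(site)

# Invented carrier data: which alleles each sample carries.
carried = {"s1": {allele_t}, "s2": {allele_a}, "s3": set()}

# Include only the C>T allele; the C>A allele is excluded from the count.
print(burden_per_sample(carried, included={allele_t}))
# {'s1': 1, 's2': 0, 's3': 0}
```

Treating the site as a single allele would count both s1 and s2 as carriers; splitting it lets the two alleles be judged independently.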

New videos

Continuing with our series of short videos documenting various features of the T2DKP, we are releasing a new video that describes the interactive table of predicted T2D effector genes. The recording of our most recent webinar, covering gene-specific resources in the T2DKP, is also now available. Links to both videos can be found on the T2DKP home page and Resources page, as well as on the Broad Institute YouTube channel.

Upcoming webinar

Join us for our next webinar on Thursday, September 26 at noon EDT! We'll announce the agenda and connection information in this space in the coming weeks.

Thursday, July 11, 2019

T2DKP webinar Thursday, July 18

Join us at noon EDT on Thursday, July 18 for an interactive workshop featuring gene-specific resources in the T2DKP. We’ll first cover two new types of information on T2D gene associations: predictions of T2D effector genes, and gene-level T2D association scores. Then we'll delve into the Gene page with its comprehensive information for T2D and many other phenotypes, focusing on how the T2DKP can help researchers prioritize genes within a GWAS locus for further investigation. See below for the agenda.

This session may be attended as an online webinar (connection information below) or in person at the Broad Institute in the 415 Main St Board room (mezzanine level), where lunch will be provided.

We hope you will attend and bring your questions and suggestions!


Introduction - Noël Burtt

Gene-specific resources in the T2DKP - Maria Costanzo

Preview of upcoming features - Ben Alexander

Q & A - the T2DKP team

Connection Information:

Join Zoom Meeting

One tap mobile
+16468769923,,619080603# US (New York)
+16465588656,,619080603# US (New York)

Dial by your location
        +1 646 876 9923 US (New York)
        +1 646 558 8656 US (New York)
Meeting ID: 619 080 603

Thursday, June 6, 2019

New T2DKP release features potential T2D effector genes

Today, in a new release of the Type 2 Diabetes Knowledge Portal, we present a distillation of many years of work from the global T2D research community: a list of the genes most likely to represent effectors for the development of T2D, based on a heuristic developed by Anubha Mahajan and Mark McCarthy that takes into account a variety of genetic and genomic evidence.

Identifying such candidate effectors is the goal of the Accelerating Medicines Partnership in Type 2 Diabetes (AMP T2D), established in 2014. AMP T2D brought together stakeholders from government, academia, and industry in order to speed up translation of genetic data into insights about disease mechanisms and drug targets. The generation, aggregation, and analysis of unprecedented amounts of data in this collaborative effort have spurred the development of methods for the systematic integration of data (see for example Fernandez-Tajes et al., 2019).

Now, by prioritizing and integrating multiple sources of evidence, Mahajan and McCarthy have classified genes according to the likelihood that they are involved in development of T2D.  The sources of evidence that they consider include genetic association data; functional genomic data such as eQTLs and chromatin conformation; mutant phenotype evidence from model organisms and knockdown screens in human cells; and other evidence gathered from the literature. The heuristic is described in detail in downloadable documentation.

Today's release of the T2DKP includes an interactive table that displays these classifications and allows you to view and explore all of the evidence underlying them.

Section of the Predicted T2D effector gene table. Columns are sortable, and columns containing combined evidence expand to show the individual evidence types comprising that classification.

When viewing this list, several caveats should be remembered. These are predictions only, and the strength of the predictions varies considerably among genes in the list. Also, every heuristic has limits, especially one developed without a clear "gold-standard" set, as this one was. Still, we hope that this list will be a valuable resource that can help suggest or support experimental directions for T2D researchers. We welcome feedback on the heuristic and the interface. Over the next year we plan to develop software to facilitate the generation and updating of these results.

Today's release of the T2DKP also includes 8 new datasets:

  • BioBank Japan GWAS (an overall set plus sex-stratified sets) bring to the T2DKP genetic associations for a wide range of phenotypes from over 190,000 individuals of East Asian ancestry. Phenotypes in these sets include many clinical measures as well as disease status for T2D, atrial fibrillation, and open-angle glaucoma.
  • Singapore Chinese Eye Study (SCES) GWAS, Singapore Malay Eye Study (SiMES) GWAS, and Singapore Indian Eye Study (SINDI) GWAS provide T2D associations for individuals of East Asian and South Asian ancestry.
  • Singapore Living Biobank GWAS datasets include associations with anthropometric and lipid traits for Chinese and Malay populations. 
All of these datasets are described fully on the T2DKP Data page.

Another new feature of today's release is that a link to standalone versions of our custom association analysis tools, the Genetic Association Interactive Tool (GAIT) and the Custom burden test, is now available on the Analysis Modules page. Both of these tools securely access individual-level data to compute on-the-fly genetic associations using custom parameters. GAIT, for single-variant association analysis, was previously only accessible on Variant pages; the Custom burden test for computing the disease burden across a gene was previously accessible only on the High-impact Variants tab of Gene pages. 

Finally, today's release includes a new instructional video that leads you through the features of the T2DKP Variant page. The video is listed on, and linked from, the T2DKP Resources page.

Check out our latest newsletter for more details about these and other recent additions to the T2DKP.

Monday, June 3, 2019

See you at ADA next weekend!

The Type 2 Diabetes Knowledge Portal team will once again be hosting an exhibit booth at the 79th Scientific Sessions of the American Diabetes Association in San Francisco next weekend. This year, we're excited to be exhibiting along with our collaborators from the Diabetes Epigenome Atlas (DGA).

Stop by the booth (#2306) to get a personal, hands-on demonstration of the new tools and features, or just to say hello and let us know what new data and features you’d like to see in the T2DKP or DGA.

We’ll be there during all the exhibit hall hours:

Saturday, June 8:     10am-4pm
Sunday, June 9:       10am-4pm
Monday, June 10:    10am-2pm

Please email us if you would like to schedule a 1:1 tutorial session at a particular time, or just stop by our booth. We hope to see you there!

Wednesday, May 22, 2019

T2DKP now offers a T2D-specific exome sequence collection of unprecedented size

The largest known exome sequence analysis specific to a complex disease was published today in Nature, and all of the results are now freely available in the Type 2 Diabetes Knowledge Portal (T2DKP) to support researchers worldwide as they make decisions about how to prioritize potential T2D drug targets for investigation. The paper, “Exome sequencing of 20,791 cases of type 2 diabetes and 24,440 controls” (Flannick et al.), describes a multi-ancestry analysis of both variant-level and gene-level genetic associations for type 2 diabetes.

The paper is the culmination of years of work from a global collaboration to generate exome sequences across five ancestry groups. The project began as an effort by the Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES) consortium to perform exome sequencing and T2D association analysis for about 13,000 samples, and evolved into a consortium of consortia—about 30 international sites in all, including the GoT2D, ESP, SIGMA, LuCAMP, and ProDIGY consortia—that partnered to design a study including as many exomes as possible. The Accelerating Medicines Partnership in Type 2 Diabetes grew out of this effort, and today supports a wide range of genetic association and other studies aimed at elucidating the mechanisms behind T2D, as well as supporting the T2DKP to serve these results to the world.

The study included participants of African American, East Asian, European, Hispanic/Latino, and South Asian ancestry. The researchers sequenced exomes (the protein-coding regions of the genome) from these participants and performed gene-level association analysis in order to detect rare variants and uncover allelic series within genes. They also performed single-variant association analysis for a subset of the samples using genome-wide arrays and imputation. A comparison of the two methods confirmed that the strength of exome sequencing is its ability to identify informative, often rare, alleles that may yield clues to disease mechanisms, while array-based GWAS provides a more comprehensive picture of strongly associated loci.

The researchers found exome-wide significant gene-level T2D associations for three genes (MC4R, PAM, and SLC30A8). Replication of the gene-level associations in a meta-analysis of three independent exome sequencing datasets confirmed the significance of these associations and found exome-wide significance for a fourth gene, UBE2NL. The variant alleles uncovered in these genes are effectively “experiments of nature” that may subtly alter the structure, function, or stability of the gene products and could be very helpful in suggesting further research directions to discover the roles of these proteins in T2D risk.

But what of the other genes whose gene-level associations didn’t meet exome-wide significance? Suspecting that these associations could still provide valuable information, the authors decided to test whether these association scores were meaningful. They created sets of genes that were known or likely to have a role in T2D risk: for example, genes known to be T2D drug targets, genes in which mutations cause maturity onset diabetes of the young (MODY), or genes whose mouse homologs confer glycemic phenotypes when knocked out. In each set, the genes had more significant gene-level T2D associations than would be expected by chance, suggesting that their scores were meaningful despite relatively low statistical significance. Analysis of additional sets of genes, for example those located in strongly T2D-associated GWAS loci, supported this conclusion.

Thus, although future studies with larger sample sizes will be needed to uncover strongly significant gene-level associations, the associations generated from this study can still provide evidence to support prioritization of research effort and resources. For example, the gene-level scores could help suggest which gene in a T2D-associated locus is most likely to be relevant to T2D. The series of variant alleles in individual genes that were identified in this study could help indicate whether it is gain or loss of protein function that affects T2D risk, an important piece of information for drug development.

With agreement from all of the authors, the results were made available in the T2DKP when the preprint of the paper was posted to bioRxiv, so that researchers worldwide could benefit from them. “A main message of the paper is that rare variants potentially provide a much more valuable resource for drug development than previously thought,” said Jason Flannick, first author on the paper. “We can actually detect evidence of their disease association in many genes that could be targeted by new medications or studied to understand the fundamental processes underlying disease. But because there is so much more information than just the variants in the genes cited in the paper, making all of the results available to everybody is critical for them to have the largest impact.”

In the T2DKP, this dataset is termed the AMP T2D-GENES exome sequence analysis set and is described on the Data page. The single-variant T2D associations may be browsed and searched throughout the T2DKP: on Gene and Variant pages, in Interactive Manhattan plots, and via the Variant Finder tool. The Genetic Association Interactive Tool (GAIT) for single variants and the custom burden test for genes provide secure interaction with the individual-level data from this set, allowing the user to filter samples and set custom parameters before performing on-the-fly association analysis.

The gene-level association scores are displayed in the T2DKP via two avenues. A new page lists genes with their association scores and other information such as the number of variants used to calculate the score. The variants comprising the scores may be filtered by any of 7 different categories, and the results of two different aggregation test methods are also available. Gene-level scores are also shown in the Gene Prioritization Toolkit on Gene pages. See our recent blog post for a description of this interface.

In addition to the sheer volume of these exome sequencing results, their open availability in the T2DKP is a remarkable milestone for the diabetes genetics research community. "I believe the T2D genetics community is setting examples both for human genetics, in data aggregation and joint analysis, and in its commitment to sharing of these results on an open platform enabling non-experts to make direct use of the results," says Noël Burtt, Director of Operations and Development for Knowledge Portals and Diabetes at the Broad Institute. The T2DKP team is proud to be a part of this collaborative effort.

Read the press release

Tuesday, April 23, 2019

New Gene Prioritization Toolkit adds value to GWAS results

Genetic association data from genome-wide association studies (GWAS) are foundational for our understanding of type 2 diabetes and other complex diseases. But in order to apply these results to diagnosis, drug development, and treatment, we need to identify the effector genes that explain those genetic associations. This is rarely straightforward: most SNPs associated with disease are located outside of coding regions of the genome, so that their impact on genes is not obvious; and even a variant located in a protein-coding gene may actually affect a different gene. And to complicate things further, a variant that is strongly associated with disease may not have a direct impact on a gene, but may rather be "along for the ride" with a tightly linked causal variant.

Today we have released a prototype, experimental version of an interactive tool in the Type 2 Diabetes Knowledge Portal that can help bridge the gap between genetic association results and the effector genes that are directly involved in disease. We are aggregating additional data types—for example, transcriptional regulation, tissue specificity, curated biological annotations, and more—and integrating them using cutting-edge computational methods in order to mine insights from GWAS data. The new Gene Prioritization Toolkit presents these data types and results to help researchers evaluate candidate causal genes around a genetic association signal.

As a first step in developing this tool, we needed to find a way to store many different connections between variants, genes, tissues, phenotypes, and biological annotations. We decided to use a Neo4j graph database, which holds data nodes and their relationships with each other and can support complex, scientifically meaningful queries.

Neo4j graph showing variants on chromosome 8 that are associated with glycemic phenotypes. Orange circles represent variants; pink, p-values; blue, phenotypes; red, phenotype group; green and brown, variant annotations.
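To make the node-and-relationship idea concrete, here is a small in-memory sketch in Python. It does not use the graph database itself or its query language, and it says nothing about the actual T2DKP schema. The node labels, relationship names, and variant and phenotype identifiers are all hypothetical, chosen only to mirror the kinds of connections described above.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class TinyGraph:
    """Minimal property graph: labelled nodes plus typed, directed edges."""

    def __init__(self) -> None:
        self.labels: Dict[str, str] = {}
        self.edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add_node(self, node_id: str, label: str) -> None:
        self.labels[node_id] = label

    def relate(self, source: str, rel_type: str, target: str) -> None:
        self.edges[(source, rel_type)].add(target)

    def neighbors(self, source: str, rel_type: str) -> Set[str]:
        # .get avoids creating empty entries in the defaultdict on lookup
        return set(self.edges.get((source, rel_type), set()))


def variants_for_group(g: TinyGraph, group: str) -> List[str]:
    """Two-hop query: variants associated with any phenotype in a group."""
    hits = []
    for node_id, label in g.labels.items():
        if label != "Variant":
            continue
        phenotypes = g.neighbors(node_id, "ASSOCIATED_WITH")
        if any(group in g.neighbors(p, "IN_GROUP") for p in phenotypes):
            hits.append(node_id)
    return sorted(hits)


# Invented nodes and relationships, for illustration only.
graph = TinyGraph()
for node_id, label in [
    ("var_A", "Variant"),
    ("var_B", "Variant"),
    ("fasting_glucose", "Phenotype"),
    ("hba1c", "Phenotype"),
    ("glycemic", "PhenotypeGroup"),
]:
    graph.add_node(node_id, label)

graph.relate("var_A", "ASSOCIATED_WITH", "fasting_glucose")
graph.relate("var_B", "ASSOCIATED_WITH", "hba1c")
graph.relate("fasting_glucose", "IN_GROUP", "glycemic")
graph.relate("hba1c", "IN_GROUP", "glycemic")

print(variants_for_group(graph, "glycemic"))
# ['var_A', 'var_B']
```

A query like this follows relationships from node to node instead of joining tables, which is what makes a graph store a natural fit for multi-hop questions across variants, phenotypes, and annotations.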

We have also created pipelines to apply computational methods to the genetic association data in the T2DKP. In brief, we are currently running:
  • MetaXcan, which integrates tissue-specific expression data from GTEx and genetic association data to predict the potential that a gene is causal for a phenotype in a given tissue;
  • DEPICT, which integrates multiple data sources including transcriptional co-regulation, Gene Ontology annotations,  model organism phenotypes, and more to predict membership of a gene in a pathway and the probability of its association with a given phenotype;
  • eCAVIAR and COLOC, two methods that quantify the probability that a variant is causal in both genetic association and eQTL studies.
We present the results of these methods in an interactive table on a new tab of the Gene page (see an example), "Genes in region". 

In addition to the results of the methods listed above, the table includes gene-level T2D associations generated by two types of burden test (Firth and SKAT) from an analysis of nearly 50,000 exome sequences by Jason Flannick and colleagues, as well as the phenotypes of knockout mice that are mutant for homologs of the human genes in the region, from the Mouse Genome Database. All of these methods and data types are described in more detail in our downloadable help documentation for the new interface.

The table shows all of these data types for each gene across the region. It has two alternative views: the Significance view, in which table cells are color-coded by significance, and the Records view, in which shading indicates the number of records in each cell. This visual summary allows you to compare genes quickly across methods. Clicking on a cell opens a window listing full details of the results.

The table also supports versatile sorting. Columns may be dragged and dropped in order to group comparable genes, as shown below:

Default view of the Gene Prioritization table. Columns represent genes and rows represent methods or data types. Cell color denotes significance, with darker shades indicating higher significance.

The same table after custom re-ordering of columns to group three genes that all have significant eCAVIAR and COLOC scores.

In addition, the table may be transposed so that the columns represent methods and the rows represent genes. This allows sorting by significance within a method, so that the gene with the most significant result for each method is easily identified.

This entire system, from data storage through the computational pipelines through the user interface, has been designed to be flexible and modular so that in the future we will be able to add new methods and data types easily and rapidly. As we actively develop the system, we are very interested in feedback from researchers about how to improve it. Please try it out and let us know what you think!

Thursday, April 18, 2019

GPS information for BMI and obesity now available in the CVDKP

Genome-wide polygenic scores (GPS) have great potential for helping to advance research on complex diseases and traits. Not only can they help predict individual genetic risk, but they can also help us understand the physiology of disease, by identifying groups at the extremes of risk whose clinical profiles can be studied or who may be enrolled in clinical trials.

Following up on their previous work that generated GPSs for five complex diseases, co-lead authors Amit Khera and Mark Chaffin, along with senior author Sekar Kathiresan and colleagues, have now developed a GPS for body mass index (BMI) and obesity, published today in Cell. To help promote obesity research, the authors have provided an open-access file listing the variants and weights that comprise the GPS. That file is now available for download from the Downloads page of our sister Knowledge Portal, the Cardiovascular Disease Knowledge Portal.

To generate this GPS, Khera and colleagues started with a large, recently published genome-wide association study (GWAS) for BMI in more than 300,000 participants (Locke et al., 2015) and applied an algorithm that assigned a weight to each of 2.1 million variants, also taking into account factors such as the proportion of variants with non-zero effect size and the degree of correlation between a variant and its neighbors. They validated the GPS by applying it to nearly 120,000 UK Biobank participants, finding that the score was strongly correlated with measured BMI, and then applied it to four independent testing datasets.
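Once the weights exist, a polygenic score of this kind is typically computed for each person as a weighted sum of allele counts. The Python sketch below shows only that scoring step, not the algorithm that derives the weights. It assumes a hypothetical weights table and per-person effect-allele dosages (0, 1, or 2); the variant IDs and weights are invented for illustration and are not taken from the published file.

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Sum weight * effect-allele dosage over variants present in both inputs.

    Variants missing from the dosages are skipped, which is a simplifying
    assumption; real pipelines usually impute or otherwise handle missing data.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Invented weights and genotypes, for illustration only.
weights = {"variant_1": 0.5, "variant_2": -0.25, "variant_3": 0.125}
person = {"variant_1": 2, "variant_2": 1}  # variant_3 not genotyped

print(polygenic_score(weights, person))
# 0.75
```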

We don't have space here to cover the many interesting details uncovered by the researchers, but overall, this work shows that a high GPS strongly predicts increased risk of severe obesity, cardiometabolic disease, and all-cause mortality. Those with the very highest GPS had a level of risk for obesity similar to that conferred by a rare monogenic mutation in the MC4R gene.

The GPS has the potential to be a powerful tool for people struggling with overweight and obesity. "Importantly, we are in the early days of identifying how we can best inform and empower patients to overcome health risks in their genetic background," said Khera in a press release from the Broad Institute. "We are incredibly excited about the potential to improve health outcomes."

We invite you to read the paper, take a look at the file of variants and weights freely available from the CVDKP Downloads page, and contact us with any questions!

Wednesday, April 3, 2019

New Hoorn DCS dataset available in the T2DKP via federation

A new dataset, "Hoorn DCS 2019," is now available in the Type 2 Diabetes Knowledge Portal via the T2DKP Federated node at the European Bioinformatics Institute (EBI). The Hoorn Diabetes Care System (DCS) cohort is a prospective cohort of type 2 diabetics in the West Friesland region of the Netherlands, for whom clinical measurements are collected annually. Association analysis was performed at EBI across 1,997 samples for 16 phenotypes, including glycemic, anthropometric, cardiovascular, and renal traits. The Hoorn DCS 2019 dataset is described in detail on the T2DKP Data page.

This new dataset is housed at the EBI Federated node of the T2DKP, which enables researchers to interact with results that may not be transferred to the AMP T2D Data Coordinating Center (DCC) at the Broad Institute because of institutional, regional, or national regulations. Data at the EBI node are stored in such a way that their specific privacy requirements are met, but they are available for secure remote queries via T2DKP tools and interfaces. Results from such queries are served up alongside results from all of the datasets housed at the AMP T2D DCC, such that researchers may browse and query data from any location without even needing to know where the data reside. This federation mechanism represents both an important technical advance in handling and protecting data, and a significant step forward in democratizing and improving access to genetic association results. Results at the EBI Federated node now comprise 9 datasets, nearly 40,000 samples, and associations for a wide variety of phenotypes.

Summary results from all of these datasets are integrated into Gene and Variant pages in the T2DKP, and may also be viewed in interactive Manhattan plots or queried using the Variant Finder tool. The individual-level data behind the datasets are accessible for custom association analysis in our Genetic Association Interactive Tool (GAIT) on Variant pages. Using this tool, researchers can filter samples to create a custom subset with defined characteristics such as age, gender, BMI, and other measures, and then run on-the-fly association analysis within that sample subset.
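The filter-then-analyze idea can be illustrated with a small Python sketch. This does not reflect how GAIT is implemented: the function name `subset_association`, the sample fields (`age`, `bmi`, `dosage`, `fasting_glucose`), and the data are hypothetical, and the sketch fits only a simple least-squares slope with no covariates or p-value.

```python
def subset_association(samples, phenotype, keep):
    """Filter samples with `keep`, then regress a phenotype on genotype dosage.

    Returns (n, beta): the subset size and the least-squares slope, i.e. the
    estimated change in the phenotype per copy of the effect allele.
    """
    subset = [s for s in samples if keep(s)]
    n = len(subset)
    if n < 2:
        raise ValueError("need at least two samples after filtering")
    x = [s["dosage"] for s in subset]
    y = [s[phenotype] for s in subset]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("no genotype variation in the selected subset")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return n, sxy / sxx


# Invented toy data: four samples, one variant.
samples = [
    {"age": 62, "bmi": 31.0, "dosage": 0, "fasting_glucose": 5.0},
    {"age": 58, "bmi": 29.5, "dosage": 1, "fasting_glucose": 5.5},
    {"age": 70, "bmi": 33.0, "dosage": 2, "fasting_glucose": 6.0},
    {"age": 34, "bmi": 24.0, "dosage": 2, "fasting_glucose": 4.0},
]

# Restrict the analysis to samples aged 50 or older before fitting.
n, beta = subset_association(samples, "fasting_glucose", keep=lambda s: s["age"] >= 50)
print(n, beta)
```

Changing the `keep` function changes which samples enter the fit, which is the point of building a custom subset before running the analysis.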

Please take a look at the new dataset and contact us with any questions or comments!

Wednesday, March 27, 2019

Get more help using the T2DKP: videos and webinars

Although we try hard to make all of the contents and interfaces of the Type 2 Diabetes Knowledge Portal clear and user-friendly, they can still be difficult to understand—especially for scientists who are not experts in human genetics. It can be hard to know where to get started among the dozens of complex datasets, several data types, multiple phenotypes, and custom analysis tools included in the Portal. So we're starting to produce two different kinds of video content that will complement our written documentation and satisfy all you auditory learners out there.

First, we're creating short videos (5 minutes or less) that focus on particular aspects of the data and features. Our first one (view on YouTube) is an overall introduction to the T2DKP and the Knowledge Portal architecture. Stay tuned for more in the coming months! And if you would like to see a video on a particular subject, please let us know.

Second, we're planning regular webinars that you can join online. The first of these hour-long sessions gave an overview of the project that created the T2DKP, the data it contains, the major entry points to the results, and a preview of future directions. You can view a recording of the webinar here and download the slides here. If you would like to be notified in advance of these sessions, be sure to sign up for our email list.

Please check out these videos and let us know what you think!

Tuesday, March 19, 2019

Exciting new exome collection now in the T2DKP, along with new datasets and features

The latest release of the Type 2 Diabetes Knowledge Portal includes seven new or updated datasets and several new features.

The highlight of this release is the largest collection of disease-specific exomes to date: exome sequences from 20,791 T2D cases and 24,440 controls, from the AMP T2D-GENES collaboration (Flannick et al., 2019). As well as generating T2D association statistics for individual variants, the analysis produced gene-level T2D association scores that can help prioritize research into new drug targets. These associations are available on a new page of the T2DKP, accessible from the home page. Select "View gene associations" to view a table listing gene-level association scores.

The table may be filtered by using any of seven different filters or two aggregation tests. Full details of the filters and methods are described in the preprint. Individual-level data from the AMP T2D-GENES exome sequence analysis dataset are also available for secure custom analysis in the Custom burden test (accessible on Gene pages) and Genetic Association Interactive Tool (GAIT; available on Variant pages).

Many other new and updated datasets are included in this release:

  • We've done further analysis of the BioMe AMP T2D GWAS dataset to generate associations for three complications of T2D: chronic kidney disease in diabetics, end-stage renal disease in diabetics, and diabetic neuropathy.
  • The FinnMetSeq exome sequence analysis dataset (Locke et al., 2019) includes associations for more than 60 phenotypes, many of them new to the T2DKP, from nearly 20,000 samples.
  • Another dataset is a meta-analysis of T2D associations in over 23,000 individuals of African American ancestry (Ng et al., 2014).
  • VATGen GWAS (Chu et al., 2017), which includes associations for several fat distribution phenotypes, has been updated with sex-stratified cohorts.
  • Liver function GWAS (Chambers et al., 2011) analyzed more than 61,000 samples for associations with four liver enzymes.
  • Genetic Factors for Osteoporosis Consortium GWAS (Morris et al., 2019) includes associations for estimated bone mineral density and fracture for more than 426,000 UK Biobank participants.

Find more information about each of these datasets on the T2DKP Data page. Results from each of these studies may be viewed in Interactive Manhattan plots and on Gene and Variant pages, and may be searched using the Variant Finder tool.

This release also includes several new features for the T2DKP. In addition to the new page of gene-level results for the AMP T2D-GENES exome sequence analysis dataset, described above, we've added:

  • Bottom-line p-values for variant associations across different phenotypes, available as an additional option in the PheWAS plot on the Variant page. These associations are derived from a new analysis method, based on METAL, that meta-analyzes all datasets in the T2DKP while taking into account the overlaps between sample sets.
  • A new Analysis modules page, linked from the T2DKP home page, that provides quick access to tools. Currently, the tools available from this page are the Interactive Manhattan Plot, the Variant Finder, and the Genetic Risk Score tool; more will be added in the near future.
  • In response to feedback from T2DKP users, we've changed the Common variants tab on the Gene page (which only displayed variants whose minor allele frequency was greater than 5%) to a Top variants tab, showing the most significantly associated variants regardless of MAF.
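To give a feel for what meta-analyzing a variant across datasets involves, here is a minimal Python sketch of a fixed-effect, inverse-variance weighted meta-analysis, one of the weighting approaches METAL supports. It is not the bottom-line method itself: in particular, it assumes the datasets share no samples, whereas the bottom-line analysis accounts for overlaps between sample sets. The function name and the numbers are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Combine per-dataset effect estimates for one variant and phenotype.

    estimates: list of (beta, standard_error) pairs, one per dataset.
    Assumes independent (non-overlapping) samples across datasets.
    Returns (beta, se, z, p) for the combined estimate.
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Invented effect sizes and standard errors from two datasets.
beta, se, z, p = inverse_variance_meta([(0.10, 0.05), (0.20, 0.05)])
print(beta, se, z, p)
```

Datasets with smaller standard errors (typically the larger ones) get more weight, so the combined estimate is pulled towards the most precise results.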

Please contact us with any questions or comments about this new release!

Friday, March 15, 2019

New funding opportunity from FNIH

The Foundation for the National Institutes of Health is announcing a new Request for Proposals (RFP) to augment and expand the Type 2 Diabetes Knowledge Portal.

RFP 16, due April 15, aims to facilitate prioritization of potential causal variants or genes in risk loci or credible sets by:

a. Developing preliminary functional annotation data to support strength of evidence for candidate genes for diabetes and complications (cardiovascular, obesity, diabetic and chronic kidney disease, NASH and liver function, diabetic retinopathy, glycemic traits); and/or

b. Generating mechanistic hypotheses for causal genes from cell-based genome-wide functional or perturbation studies (e.g. RNA-Seq, ATAC-Seq, ChIP-Seq, others); and/or

c. Developing and employing novel tools, such as CRISPR tiling combined with readouts such as RNA-Seq, to enable prioritization of potential causal variants or genes.

Find full details and contact information here.

Wednesday, March 13, 2019

Upcoming T2DKP webinar March 21

Do you need a little help navigating the tools and interfaces of the Type 2 Diabetes Knowledge Portal or other portals in the Knowledge Portal Network? Attend our webinar on Thursday, March 21 at noon EDT! We'll introduce the motivation for creating the Portals, cover the basics of using them to accelerate your research on complex diseases, give you a preview of what's coming next, and answer any questions you have. Join the webinar here:

Dial by your location
        +1 646 876 9923 US (New York)
        +1 646 558 8656 US (New York)
Meeting ID: 469 259 629
Find your local number: https://zoom.us/u/acotfwN2We

If you're in the Boston area and would like to attend in person (and join us for lunch), please let us know here.

Friday, March 1, 2019

Faster access to tools from the T2DKP home page

We've rearranged some of the links on the Type 2 Diabetes Knowledge Portal home page, as a first step towards offering a central location for analysis tools. The previous link to the Variant Finder tool has been replaced by a link to the new Analysis modules page:

The new page, shown below, offers access to three analysis tools.

  • The Interactive Manhattan plot allows you to choose a phenotype and view variant associations across the genome for that phenotype. We've added phenotype selection options to both the Analysis modules page and the Manhattan plot page, making it easier to switch your view between phenotypes. The default view on the Manhattan plot page shows the largest dataset for a phenotype, but when multiple datasets exist, you can select any one to display. For many datasets, LD clumping is available at several r² thresholds. Clumping reduces redundancy due to association signals from linked variants, pinpointing the most strongly associated variant in a group.
  • The Variant Finder is a versatile tool that allows you to set multiple criteria (phenotype, p-value, size and direction of effect, and more) and retrieve the set of variants meeting those criteria.
  • The Genetic Risk Score (GRS) module starts from a set of 243 variants associated with T2D at genome-wide significance in the DIAMANTE (European) study (Mahajan et al., Nat Genet. 2018 Nov;50(11):1505-1513). Users select a phenotype and dataset, and the module computes the p-value for association of the variant set with the selected phenotype. The interface offers two customizing options: the ability to edit the variants in the set, and the ability to filter the samples by multiple phenotypic criteria before running the analysis. The GRS module, which will be augmented and improved in the future, can potentially reveal relationships between phenotypes.
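For those curious about how LD clumping works, here is a minimal Python sketch of one common greedy approach. It is an illustration under our own assumptions, not the code behind the Manhattan plot: the function name, variant IDs, p-values, and r² values are invented, and real clumping tools typically also apply a physical distance window and p-value cutoffs.

```python
def clump(pvalues, r2, threshold=0.5):
    """Greedy LD clumping.

    pvalues: dict mapping variant ID -> association p-value
    r2: dict mapping frozenset({a, b}) -> r-squared between variants a and b
        (pairs that are absent are treated as unlinked, r2 = 0)
    threshold: variants with r2 >= threshold to an index variant join its clump
    Returns a dict mapping each index variant -> list of variants clumped under it.
    """
    ordered = sorted(pvalues, key=pvalues.get)  # most significant first
    assigned = set()
    clumps = {}
    for index in ordered:
        if index in assigned:
            continue  # already absorbed into a stronger variant's clump
        assigned.add(index)
        members = [
            v for v in ordered
            if v not in assigned and r2.get(frozenset((index, v)), 0.0) >= threshold
        ]
        assigned.update(members)
        clumps[index] = members
    return clumps


# Invented toy data: two association signals, each with one linked neighbor.
pvalues = {"rsA": 1e-12, "rsB": 1e-9, "rsC": 1e-8, "rsD": 1e-6}
r2 = {
    frozenset(("rsA", "rsB")): 0.9,
    frozenset(("rsC", "rsD")): 0.7,
    frozenset(("rsA", "rsC")): 0.1,
}

clumps = clump(pvalues, r2, threshold=0.5)
print(clumps)
```

Raising the threshold produces more, smaller clumps, while lowering it merges more linked variants under each index variant.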

The new Analysis modules page will be the central access point for new analysis tools as they are developed, so check back often for updates!

Wednesday, February 20, 2019

T2DKP Winter Newsletter

The latest issue of our periodic newsletter is now available. Download it here and get the latest!

Friday, February 15, 2019

New AMP T2D funding opportunities

The Foundation for the National Institutes of Health is announcing two new Requests for Proposals (RFPs) to augment and expand the Type 2 Diabetes Knowledge Portal.

RFP 8b, "Evidence-based target lists: portal visualization and multi-algorithm development", has these objectives:
1. Develop algorithms including genetic (noncoding and coding), epigenetic, and genomic data to prioritize and rank evidence for causal genes conferring T2D or other complications.
2. In collaboration with the Knowledge Portal (KP) team at the Broad Institute, build tools for the KP to visualize these data.
3. Build processes to regularly update (i.e. every quarter) highly ranked causal genes with underlying algorithm annotations.
4. Compare and validate algorithms to advance AMP T2D KP analytic tools and data visualizations.

The objective of RFP 10, "Deposition of Available Diabetes Complications Data", is to deposit, harmonize and publicly display GWAS, exome, and whole genome data within the KP for any of the following traits:
• Chronic Kidney Disease and/or Diabetic Kidney Disease related traits
• NASH and liver disease related traits
• Cardiovascular and lipid related traits, including heart failure
• Obesity related traits
• Diabetic retinopathy
• Other complications

Proposals are due on March 15, 2019. Find full details and contact information here.