Monday, June 20, 2016

Report from New Orleans: 76th Scientific Sessions of the American Diabetes Association


Members of the T2D Knowledge Portal team braved extreme heat and humidity, as well as icy air conditioning, to attend the American Diabetes Association conference in New Orleans, LA. Our booth in the conference exhibit hall was a great way to interact personally with conference attendees and showcase the Portal. Many genetics researchers stopped by for one-on-one tutorials on our new tools and features. And clinicians and diabetes patients, even if they had no immediate use for genetic information, were happy to hear the goals of the project—to accelerate the identification of genes involved in T2D and, ultimately, to find new treatments and better understand the disease mechanism. 

We were pleased to welcome some special visitors to our booth: National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Director Griffin Rodgers and Deputy Division Director Philip Smith. NIDDK is a major supporter of the T2D Knowledge Portal project.

Drs. Philip Smith (left) and Griffin Rodgers visit the Portal booth


Dr. Smith also made a video statement as part of the media coverage at ADA, eloquently explaining the rationale behind the Portal and the needs that it can address.





If you missed us at ADA, come visit us at our booth at the American Society of Human Genetics meeting next October! And if you can’t meet us in person, please feel free to email us at any time. We’re happy to answer questions or provide help in understanding the Portal data and tools.

Type 2 Diabetes Knowledge Portal team at ADA

Friday, June 10, 2016

Come meet the Portal team at ADA, booth #1762!

Today’s news comes to you from the Big Easy: New Orleans, LA, where the 76th Scientific Sessions of the American Diabetes Association are in full swing this weekend. Members of the Knowledge Portal team have traveled here to talk to researchers about how the Portal can become even more useful in helping to generate hypotheses that spark insights into the mechanism of T2D and the development of new therapies. Starting at 10am on Saturday, June 11, we’ll be at booth #1762 in the exhibit hall, ready to hear your suggestions and give you an individual tutorial on the Portal’s tools and features. There just might be a gift waiting for you, too!

We’ve been working hard and we have an incredible number of new features to show off at #2016ADA. We’ll be featuring them individually in this space in the coming weeks, with an in-depth explanation of each. To list some of the highlights:

  • a collaborative project between software engineers at the University of Michigan and the Broad Institute has come to fruition with the integration of LocusZoom into the Portal. This interactive visualization looks, superficially, like a Manhattan plot—but it’s so much more. It shows the significance of variant associations with any of several phenotypes and also displays linkage disequilibrium among nearby variants, and you can choose to do conditional analysis based on any variant.
  • engineers at the Broad Institute have developed a completely new tool, called Genetic Association Interactive Tool (GAIT), that offers a multitude of options allowing you to compute custom association statistics for a variant. You can specify the phenotype to test for association, stratify samples by ancestry, choose a subset of samples to analyze based on specific phenotypic criteria, and control for specific covariates. 
  • we’ve also redesigned and augmented many of the displays of pre-computed information that are available in the Portal
  • finally, we’ve added a lot of new, informative content: a Data page with a complete description of each data set in the Portal, more background about the AMP-T2D project that supports the Portal, and more help text to guide you as you use the Portal’s interfaces



Come to the booth and let us give you a tour of these new features—or, if you're not at ADA, take a look and let us know what you think. And take a look at this great press release from NIH about the project!

Wednesday, May 18, 2016

Expanding the landscape of human genetic variation data in the Type 2 Diabetes Knowledge Portal

With the addition of four new sequence data sets to our database, the number of variants and associations accessible via the Portal pages and tools has increased by millions.

Two of the new data sets are from projects that have obtained sequence data from a wide range of individuals. The ExAC data set, comprising exome sequences collected and harmonized by the Exome Aggregation Consortium, includes sequence data from 60,706 unrelated people of multiple ancestries. The 1000 Genomes data set, from the International Genome Sample Resource project (IGSR), is composed of whole-genome sequences from 2,504 individuals in four different ethnic groups. 


The allele frequencies of variants in the different ethnic groups surveyed in the 1000 Genomes data set can be seen in the “How common is…?” section on the Variant pages (view an example). And both the ExAC and 1000 Genomes data sets can be queried using the Variant Finder tool. You can select them via a new tab on the interface, “Additional search options”, where you can choose these data sets and also add more criteria to your search. 

The Data set pull-down menu on the "Additional Search Options" tab of the Variant Finder lets you specify 1000 Genomes or ExAC data.

Available selections in the Data set pull-down menu.


The other two new data sets in the Portal were both generated by the GoT2D consortium. A whole-genome sequence data set (GoT2D WGS) adds data from 2,657 individuals, including the associations of noncoding variants that were not present in the previous whole-exome sequence data set from the GoT2D project. This new data set brings T2D association data across 30 million variants to the Portal. The GoT2D WGS + replication data set adds imputation to that set, bringing the sample size to over 47,000 and including most low-frequency and common variants.  

The new GoT2D data can be seen in multiple sections of the Portal’s Gene and Variant pages, and may also be accessed by selecting these data sets in the Variant Finder.

Beyond these major additions, today’s data release also includes some bug fixes and data harmonization.

Get out there and explore the new data landscape in the Portal, and let us know what you think!

Monday, May 9, 2016

Better summaries of variant information convey the most important information at a glance

We’ve made significant improvements to the information we display on the Variant pages of the T2D Knowledge Portal. The summary at the top of each Variant page (view an example) now shows the reference nucleotide and the variant nucleotide at that position. Transcripts covering the variant are listed, along with several important details for each transcript: the change caused by the variant in the encoded protein sequence (if applicable); the Sequence Ontology term describing the consequence of the variation (for example, “missense variant”); and the expected effect of the variant on protein function, as predicted by the PolyPhen and SIFT algorithms.


Summary section of the Variant page

Just below the summary on the Variant page, we’ve also improved the graphic showing the association of the variant with T2D and related traits. We’ve renamed this section “associations at a glance” because it immediately shows the most important information about these associations.


At-a-glance section of the Variant page


The boxes in this graphic represent the associations of this variant with T2D (at the top) and with other traits (below, in an expandable section). Under the hood, the software is now pulling up information more quickly so that the display is more responsive. We’ve also made it more pleasant to look at, tidying up the shape of the boxes and the alignment of the information they contain.

But beyond the style improvements, we’ve added a lot of substance. Where available, each association now includes the odds ratio (for dichotomous traits) or the effect size (for continuous traits) and the direction of effect. Positive effects are shown in blue, and negative effects in purple. 
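For readers who like to see the logic spelled out, here is a minimal Python sketch of how a direction of effect can be read from an effect estimate. It assumes the usual conventions: an odds ratio above 1, or an effect size above 0, counts as a positive effect. The function name and the color lookup are hypothetical illustrations based on the description above, not the Portal’s actual code.

```python
def direction_of_effect(value, dichotomous):
    """Return 'positive', 'negative', or 'none' for an effect estimate.

    value: an odds ratio (dichotomous trait) or an effect size (continuous trait).
    Assumption: the neutral point is 1 for odds ratios and 0 for effect sizes.
    """
    neutral = 1.0 if dichotomous else 0.0
    if value > neutral:
        return "positive"
    if value < neutral:
        return "negative"
    return "none"


# Colors as described in the post: blue for positive, purple for negative.
DISPLAY_COLOR = {"positive": "blue", "negative": "purple"}

# Made-up example values, for illustration only.
print(direction_of_effect(1.25, dichotomous=True))
print(direction_of_effect(-0.08, dichotomous=False))
```

The two trait types need different neutral points because an odds ratio is a ratio (no effect is 1), while an effect size for a continuous trait is a difference (no effect is 0).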

We’ve also added the sample size, in black text in the bottom left corner of the box, for each data set. This indicates the total number of individuals involved in the study. And if available, the frequency and count of the variant in the data set are shown in red and blue text at the bottom middle and bottom right corner of the box, respectively. The count indicates the number of haplotypes in the set that contain the variant, while the frequency indicates the occurrence of the variant allele in the sampled population.
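To make the relationship between these numbers concrete, here is a small Python sketch. It assumes an autosomal variant in diploid individuals with no missing genotypes, so that each individual contributes two haplotypes; the function name and the example numbers are made up for illustration, and real data sets may compute frequency differently (for example, after excluding samples with missing calls).

```python
def allele_frequency(variant_count, sample_size):
    """Estimate allele frequency from a haplotype count and a sample size.

    Assumption: every individual is diploid at this site and was genotyped,
    so the total number of haplotypes is 2 * sample_size.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    total_haplotypes = 2 * sample_size
    if not 0 <= variant_count <= total_haplotypes:
        raise ValueError("variant_count must be between 0 and 2 * sample_size")
    return variant_count / total_haplotypes


# Made-up example: 130 variant haplotypes among 6,500 individuals.
print(allele_frequency(130, 6500))
```

Under these assumptions, 130 variant haplotypes among 6,500 individuals means 130 out of 13,000 haplotypes, or a frequency of 1%.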

This additional information can help you evaluate the significance of associations. The sample size and variant count largely determine the power of the data set to detect an association. The higher the power, the more precise the estimate of the variant’s effect tends to be.

Finally, when a variant is associated with other traits in addition to T2D, those traits in the same category are labeled with the same color. For example, in the display above, proinsulin levels, fasting glucose, HOMA-B, and two-hour glucose—all glycemic phenotypes—are labeled in orange, while triglycerides, LDL cholesterol, and cholesterol—lipid phenotypes—are labeled in red. This lets you see easily when a variant is linked to multiple traits that could reflect a common process or pathway, possibly offering a clue to the mechanism by which it affects physiology.

So this improved graphic now gives you an idea, literally at a single glance, of how strongly a variant is associated with T2D, how significant that association is, and whether it is also associated with other traits. 

We made these improvements in response to suggestions from scientists who use the T2D Knowledge Portal. We hope to hear your feedback too!

Friday, May 6, 2016

T2D Knowledge Portal in the news

The poster that we presented at the Biocuration 2016 conference was selected by F1000Research as the featured poster or slide of the month! As an organization promoting open access to publications and data, they were particularly interested in the challenge we face at the Portal in designing tools that allow researchers to gain valuable insights from the data while still protecting confidential patient information. Read their take on it in their blog post.


Thursday, April 28, 2016

Variant Finder results may be saved, shared, and bookmarked

You may have noticed that our Variant Finder tool has a cleaner look and clearer instructions. But did you know that you can also save your search parameters, to re-create your search later or share it with a colleague?

First, construct your search. Here’s an example:


After you click “Submit search request” you’ll be taken to the results page:



And here’s the URL of the results page for this example search:


It isn’t pretty, but it encodes the search. You can bookmark it, save it, or email it and you’ll get back the same result next time you enter it in a browser.

There’s one small caveat here. On the results page, you can modify the results table by clicking on the + signs in the table header to see options for adding more data to the table. But if you do this, those changes will not be encoded in the URL (we plan to enable this in the future); only the original search is encoded.
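For the curious, here is a rough Python sketch of the general idea of encoding a search in a URL, using only the standard library. The base address and the parameter names below are hypothetical placeholders chosen for illustration; they are not the Portal’s actual URL scheme.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base address and parameter names, for illustration only.
BASE_URL = "https://example.org/variantSearch/results"


def build_search_url(criteria):
    """Encode search criteria (a dict of strings) into a shareable URL."""
    return BASE_URL + "?" + urlencode(sorted(criteria.items()))


def read_search_url(url):
    """Recover the search criteria from a URL built by build_search_url."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# Made-up search criteria.
criteria = {"phenotype": "T2D", "dataset": "ExAC", "pvalue_max": "0.0005"}
url = build_search_url(criteria)
print(url)
print(read_search_url(url))
```

Because everything needed to repeat the search lives in the query string, the link can be bookmarked or emailed. By the same logic, anything that is not written into the URL, such as later changes to the results table, cannot be recovered from it.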

Let us know how you like this feature and what other features might be useful to you. And check out our mini-tutorial on the Variant Finder to see full instructions on how to use this tool. 

Thursday, April 21, 2016

Type 2 Diabetes Knowledge Portal represented at Biocuration 2016 conference

Last week, the International Society for Biocuration held its 9th annual conference in Geneva, Switzerland. You might ask, “What is biocuration, anyway?” In a nutshell, it’s all about organizing biological data and making it accessible and understandable. It can be as small-scale as capturing the fine details about the function and role of a particular protein, or as large-scale as designing interfaces to analyze and explore genomes or huge genomic data sets. (See this article if you’re interested in the nitty-gritty details about what it’s like to be a biocurator.) The conference covered major topics in biocuration such as the visualization and integration of data, controlled vocabularies and ontologies, functional annotation, community curation, text mining, and more.

Our Manager of Content and Community attended the conference, since many of these issues are relevant to the T2D Knowledge Portal. As we tackle a relatively new challenge in biocuration—the integration of human genetic association data sets—it’s important for our project to be part of the biocuration community, to get feedback and become aware of others’ work in this area. And as we consider adding more biological information about human genes to the Portal, it’s important that whatever we do is consistent with ongoing efforts in the biocuration of human genes; we don’t want to reinvent the wheel or duplicate work.

Attending a fascinating and energizing conference in a beautiful setting was a pleasure in itself, and the icing on the cake was that our poster on the Portal received one of five “Best poster” awards! We’re honored and pleased that our project had such a warm welcome into the biocuration community.

