The Accelerating Medicines Partnership in Type 2 Diabetes (AMP T2D) is a pre-competitive partnership among the National Institutes of Health, industry, and not-for-profit organizations, managed by the Foundation for the National Institutes of Health. Its mission is to make genetic association data accessible to the worldwide biomedical research community via the Type 2 Diabetes Knowledge Portal, in order to facilitate discovery of new targets for T2D treatment. Aggregating genetic data, however, can be a challenge. The privacy of the individuals who contributed their health status and genomic sequences must always be protected, and there are many layers of regulation to ensure this. Restrictions at the institutional, regional, and national levels determine how data are handled and whether they can be transferred.
Until now, all of the results displayed in the Portal have been derived from data housed at the AMP T2D Data Coordinating Center (DCC) at the Broad Institute, where the Portal website resides. But some of the valuable data generated outside the U.S. cannot be transferred to the DCC. To address this issue, AMP T2D funded the development of a mechanism that enables researchers to interact with all of the data: federation.
Federation means that data are housed at a site (a “federated node”) that meets their specific privacy requirements, but are made available for remote queries via the Portal. Results from such queries are served up alongside results from all of the datasets housed in the AMP T2D DCC. Researchers may browse and query data from any location without even needing to know where they reside.
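For readers who think in code, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Portal's actual implementation: the names `Association`, `make_node`, and `federated_query` are hypothetical, and each node is simulated in memory rather than contacted over a network. The point it shows is that the same query goes to every node, each node answers from the data it holds locally, and the caller receives one merged result list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Association:
    """One association result (hypothetical record layout)."""
    dataset: str
    variant: str
    trait: str
    p_value: float


# A node is modeled as a function that answers a trait query.
# In a real deployment this would be a remote service call (assumption).
NodeQuery = Callable[[str], List[Association]]


def make_node(records: List[Association]) -> NodeQuery:
    """Build a stand-in node that keeps its records local to itself."""
    def query(trait: str) -> List[Association]:
        # Only matching results are returned; the records stay in the node.
        return [r for r in records if r.trait == trait]
    return query


def federated_query(nodes: Dict[str, NodeQuery], trait: str) -> List[Association]:
    """Send the same query to every node and merge the answers."""
    merged: List[Association] = []
    for node_query in nodes.values():
        merged.extend(node_query(trait))
    # Present one combined list, ordered by p-value, regardless of origin.
    return sorted(merged, key=lambda r: r.p_value)
```

A caller would pass a dictionary such as `{"central": central_node, "remote": remote_node}` and a trait name, and would get back a single list without needing to know which node supplied which row.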
A federated node has now been created at the European Bioinformatics Institute (EBI) and may be accessed via the T2D Knowledge Portal. Today, Portal tools and interfaces can query both data housed at the AMP T2D DCC at the Broad Institute and data at the EBI federated node.
According to Paul Flicek, a Senior Scientist and Team Leader of Vertebrate Genomics at EMBL-EBI, “A key mission of EMBL-EBI is to make data available to the widest possible community. Seamlessly accessing data stored in multiple locations via a single portal helps ensure that the data we store from many projects are maximally useful for additional research.”
The first dataset to be incorporated into the Portal via the EBI federated node is the Oxford BioBank exome chip analysis dataset, which contains association data for glycemic, lipid, and blood pressure traits from over 7,100 healthy subjects in Oxfordshire, U.K. The dataset is described on our Data page. Portal users can interact with this dataset in the same way (and with the same speed) as with other datasets.
“Diabetes is a global problem, and it will take research and innovation on a global scale if we are to tackle it effectively,” says Mark McCarthy, Robert Turner Professor of Diabetic Medicine at the University of Oxford. “The success of our research on the genetics of diabetes depends on access to data generated by groups around the world. The federated portal provides an additional set of tools that will allow us to jointly analyse those data sets wherever they happen to be based.”
Federation represents both an important technical advance in handling and protecting data, and a significant step forward in democratizing and improving access to genetic association results. And because it is generally applicable to any kind of genetic association data, it has the potential to have an impact beyond T2D research, facilitating the study of other complex diseases and traits.