Broad Institute's Genome Analysis Toolkit, or GATK, will be offered as a service on the Google Cloud Platform
The Broad Institute of MIT and Harvard is teaming up with Google Genomics to develop computing infrastructure that will help store and process enormous datasets, and to create tools for analyzing such data in biomedical research.
As a first step, the Broad Institute's Genome Analysis Toolkit, or GATK, will be offered as a service on the Google Cloud Platform, as part of Google Genomics. The goal is to enable any genomic researcher to upload, store, and analyze data in a cloud-based environment that combines the Broad Institute's best-in-class genomic analysis tools with the scale and computing power of Google.
GATK is a software package developed at the Broad Institute to analyze high-throughput genomic sequencing data. GATK offers a wide variety of analysis tools, with a primary focus on genetic variant discovery and genotyping as well as a strong emphasis on data quality assurance. Its robust architecture, powerful processing engine, and high-performance computing features make it capable of taking on projects of any size.
GATK is already available for download at no cost to academic and non-profit users. In addition, business users can license GATK from the Broad. To date, more than 20,000 users have processed genomic data using GATK.
The Google Genomics service will give researchers a powerful additional way to use GATK. Researchers will be able to upload genetic data and run GATK-powered analyses on Google Cloud Platform, and may use GATK to analyze genetic data already available for research via Google Genomics. GATK as a service will make best-practice genomic analysis readily available to researchers who don't have access to the dedicated compute infrastructure and engineering teams required for analyzing genomic data at scale. An initial alpha release of the GATK service will be made available to a limited set of users.