The National Cancer Institute Division of Cancer Treatment and Diagnosis appointed the Frederick National Laboratory to build the Integrated Canine Data Commons. The data commons provides a publicly available cloud-based repository of data on spontaneously arising canine cancer to advance research on human cancers by enabling comparative analysis between human and canine cancers.  

The Center for Technical Operations Support Group within the Bioinformatics and Computational Sciences directorate built the portal to curate, harmonize, and ingest canine cancer data into an extensible and integrated data model. 

The data model enables users to build data-driven cohorts that facilitate downstream analysis in the Seven Bridges Cancer Genomics Cloud, the partnering cloud resource platform.

The Integrated Canine Data Commons brings together data from multiple programs and studies, with a particular focus on data generated from pet dogs. Naturally occurring cancers in dogs show clinical and biological similarities to human cancers, and dogs with cancer receive treatment and medical care similar to those of human patients. Approximately 70,000 cancer cases occur in companion dogs each year, enough to support canine clinical trials that evaluate the large number of novel drugs and drug combinations.

Key data types include whole-exome sequencing, whole-genome sequencing, RNA-Seq, and DNA methylation files. The ICDC Data Governance Advisory Board helps to identify and evaluate canine data sets that have the potential to drive discovery and provide value to the broader community. 

Recent accomplishments and recognitions include: 

  • Mentioned in a Washington Post article, “Dog cancer research advances pursuit of drugs for humans and canines.”

  • Featured in an episode of the TV program 60 Minutes, “Dogs may hold key to treating cancer in humans.”  

  • Updated the model viewer, Data Model Navigator, which now lets users download data loading templates, ICDC controlled vocabularies, and an example set of data loading files.

  • Expanded the functionality of the JBrowse sequence viewer so that users can view multiple sequence analysis files side by side.

  • Onboarded new data, resulting in 11 data sets representing more than 670 dogs, nearly 2,000 case files, and more than 35 terabytes of data.