The Cancer Imaging Archive, managed by FNL's Cancer Imaging Informatics Laboratory, supports a research community that seeks to connect cancer phenotypes to genotypes. To accomplish this, the archive hosts data sets that connect clinical images with patient genomic data and proteomic data.   

The archive is part of National Cancer Institute programs that are collecting medical and pathological images matched to proteomic, genomic, clinical, and pathological data.

Clinical Proteomic Tumor Analysis Consortium

The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics. Data (genomics, proteomics, imaging), assays, and reagents are made available to the public as a Community Resource to accelerate cancer research and advance patient care.

The Cancer Imaging Archive has partnered with CPTAC to host both the radiology and pathology imaging data generated by the project. We have collected and host histopathology images from more than 1,600 patients and radiology images from 500 patients, and we coordinate a special interest group to support cross-disciplinary research across imaging and omic data. In addition, four CPTAC collections have been annotated and segmented through the NCI Cancer Imaging Program’s annotation initiative.

Cancer Moonshot APOLLO Network

The Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) Network provides genomic and proteomic screening to target therapies for up to 8,000 active service military personnel and veterans as part of a collaboration among the U.S. Department of Veterans Affairs, Department of Defense and National Cancer Institute.

The Cancer Imaging Archive has extended the network's data curation and collection capacity, integrating the data into the VA’s research Precision Oncology Data Repository and into the DOD Defense Health Agency data cloud (in progress).

We have successfully supported multiple pilot data collection activities and serve as the imaging data collection and characterization portal for the APOLLO 5 prospective data collection.

The archive is establishing first-of-its-kind enterprise clinical imaging de-identification and sharing systems, including digital pathology data sharing, within and between the network's collaborators. Imaging from the Department of Defense, Veterans Affairs, and participating civilian sites is posted to the Cancer Imaging Archive. The archive currently hosts data from more than 330 APOLLO-enrolled patients; these data will be mined for clinically relevant information in combination with APOLLO proteogenomic findings.

Cancer Moonshot Biobank

The Cancer Moonshot Biobank collects biospecimens and associated data across at least 10 cancer types from at least 1,000 patients who represent the demographic diversity of the U.S. and are receiving standard-of-care cancer treatment at multiple NCI Community Oncology Research Program sites.

We are making de-identified radiology and histopathology images collected from Biobank patients available on the Cancer Imaging Archive. Associated genomic, phenotypic, and clinical data will be hosted by the database of Genotypes and Phenotypes (dbGaP) and other NCI databases.

Cancer Genome Atlas  

The archive also collected and hosts radiological imaging from The Cancer Genome Atlas (TCGA), along with the results of characterization and analysis work done by the collaboration's imaging phenotype research groups. 

Quantitative Imaging Network (QIN) Support 

We facilitate data sharing among the NCI Cancer Imaging Program’s Quantitative Imaging Network (QIN).

Eleven QIN collections are currently hosted on the Cancer Imaging Archive, and that number is expected to grow as the network's activities continue. In several instances, this data sharing supports cross-institutional algorithm validation, either bilaterally or as part of pilot challenges.

National Lung Screening Trial (NLST) Data Portal 

The Cancer Imaging Archive (TCIA) has also been used to absorb and join images from both arms of the NLST from the American College of Radiology Imaging Network (ACRIN) and the Lung Screening Study group. TCIA hosts the full NLST image set with clinical metadata and pathology images, along with a specially developed query tool that supports filtering on associated clinical data parameters.

NCI National Clinical Trials Network (NCTN) Support 

NCTN clinical trials help to establish new standards of care, set the stage for approval of new therapies by the Food and Drug Administration, test new treatment approaches, and validate new biomarkers. TCIA has published 24 datasets from NCI clinical trials, most with links to extensive clinical data. Half of those trials were funded through an NCI initiative begun in 2019 to expand TCIA's data collection services to support the NCTN. Imaging data associated with NCTN trials were centralized and de-identified under a subcontract with the Imaging Radiation Oncology Core (IROC). The TCIA team documented its curation procedures and trained IROC staff to apply them. Trial image datasets are hosted on TCIA for final review and linked with clinical data hosted in the NCTN/NCORP Data Archive.

Childhood Cancer Data Initiative (CCDI) 

TCIA has taken proactive steps to support CCDI’s goal to “build a community of pediatric cancer researchers, advocates, families, hospitals, and networks committed to sharing data to improve treatments, quality of life, and survivorship of every child with cancer.” TCIA hosts seven clinical trial datasets from the NCTN Children's Oncology Group, and tumor annotations for each trial are being generated and published. TCIA is engaged with the CCDI Data Catalog team to ensure TCIA datasets are discoverable as they’re added.

Preclinical Imaging Support 

TCIA also hosts high-value preclinical image collections. These include studies of specialized phantoms (devices that permit standardization of quantitative imaging parameters across instruments and sites), imaging studies of patient-derived xenograft models from the NCI Patient-Derived Models Repository, and canine clinical trial data from the Canine Immunotherapy Trials Network.

Annotated Data for AI and Machine Learning

NCI recently launched an initiative to annotate NCI trials with tumor segmentation labels and seed points to stimulate development of AI segmentation models. Supervised machine learning (ML) algorithms require labeled data for algorithm training and validation. The NCI Cancer Imaging Program (CIP) continues to improve its support for AI- and ML-based research in two ways: by prioritizing new TCIA data collections that provide annotations and labels that can be leveraged for algorithm development, and by funding the expert annotation of existing TCIA data. Currently, 69 TCIA data collections include detailed segmentations, of which 17 were generated through CIP annotation funding.

COVID-19 

To support the urgent public health need to make COVID-19 image data for all disease stages freely available to caregivers and the research community as soon as possible, the NCI Cancer Imaging Program (CIP) provided TCIA as a resource for making image sets available to the public. TCIA was uniquely ready to carry out a short-term effort to collect, curate, and host COVID-19 patient images for immediate reference by the community. TCIA hosts six comprehensive COVID-19 datasets in its public archive.