The Cancer Imaging Informatics Laboratory is a support team that manages the Cancer Imaging Archive. Our objectives are to increase public availability of high-quality cancer imaging data sets for research, support National Institutes of Health data sharing requirements for the cancer imaging community, enhance reproducibility in research, and create a culture of open data sharing and collaboration among cancer imaging researchers.

Our laboratory also supports the development of new technologies and methodologies, such as clinical imaging data de-identification and curation, radiomics and image characterization, artificial intelligence (AI) and deep learning, and integrative, multidisciplinary data analysis (e.g., radiogenomics).

 

Scan of patient with colorectal liver metastases
Researcher Resource

Access the Cancer Imaging Archive

The Cancer Imaging Archive de-identifies and hosts a substantial archive of medical images of cancer, which are accessible for download.

Managing the Cancer Imaging Archive 

Cancer imaging research requires access to large, standardized, purpose-built imaging collections. Since 2010, the NCI Cancer Imaging Program has relied on our Cancer Imaging Informatics Laboratory to develop, manage, and support the Cancer Imaging Archive (TCIA), which fills the unmet need of cross-disciplinary imaging researchers for open access to clinical images.

We provide project management, data curation, outreach to data submitters and the wider community, and subcontract management.

Each month, over 20,000 unique users visit the archive, where they find more than 200 datasets of computed tomography, magnetic resonance imaging, positron emission tomography, x-ray mammography, digitized histopathology slides, and radiation therapy planning imaging studies.

At least 1,800 peer-reviewed publications have been based on TCIA-hosted data, and the true number is likely higher because most of the collections are open and available for public use.

In addition to supporting the imaging components of major National Cancer Institute data collection initiatives, we lead an advisory group that prioritizes researcher-initiated proposals for curation and publication. Proposals are prioritized by how well the data sets fill data gaps in support of critical current research for a clinical need, novel or unique datasets, research reproducibility, and investigation of biological hypotheses or other proposed discoveries about the pathophysiological basis of cancer.

The Cancer Imaging Archive by the numbers:
  • 200+ datasets
  • 1,800 publication citations
  • 20,000 unique users per month
Submission and de-identification 

The Cancer Imaging Archive provides full research-focused de-identification services and makes its tools and knowledge base available to the scientific community. Because the Cancer Imaging Archive contains a large repository of open-access clinical imaging data, it is critical to protect protected health information (PHI) while preserving the scientific utility of the data.

We have developed robust tools and extensive procedures to transmit, de-identify, and assess the quality of the medical images submitted to the archive, and our curation experts review and publish the submitted images. We routinely refine and test advanced, standards-based tools to enable more efficient de-identification of medical image data for public consumption.
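As a rough illustration of what de-identification involves, the sketch below strips direct identifiers from an image header, replaces the patient ID with a salted pseudonym, and shifts the study date by a fixed offset. It is not the archive's actual tooling: the field names, the `DIRECT_IDENTIFIERS` set, the `pseudonym` and `deidentify` helpers, the salt, and the sample record are all hypothetical, and real DICOM de-identification covers far more attributes.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical field names for illustration only; real DICOM headers
# use standardized attribute tags and many more identifying fields.
DIRECT_IDENTIFIERS = {"PatientName", "PatientAddress", "ReferringPhysicianName"}


def pseudonym(patient_id: str, salt: str) -> str:
    """Derive a stable pseudonym from a patient ID and a secret salt."""
    digest = hashlib.sha256((salt + patient_id).encode("utf-8")).hexdigest()
    return "ANON-" + digest[:12]


def deidentify(header: dict, salt: str, day_shift: int) -> dict:
    """Return a cleaned copy of `header`; the input is left unchanged."""
    # Drop fields that directly identify the patient.
    cleaned = {k: v for k, v in header.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the real ID with a pseudonym so studies can still be linked.
    cleaned["PatientID"] = pseudonym(header["PatientID"], salt)
    # Shift the date by a constant offset so intervals between studies survive.
    cleaned["StudyDate"] = header["StudyDate"] + timedelta(days=day_shift)
    return cleaned


# Made-up sample record.
record = {
    "PatientName": "DOE^JANE",
    "PatientID": "MRN-0042",
    "StudyDate": date(2020, 3, 15),
    "Modality": "CT",
}
result = deidentify(record, salt="example-salt", day_shift=-30)
```

Using the same salt for every study of a patient keeps that patient's images linked under one pseudonym, and a constant date shift preserves the time between scans, which is one way scientific utility can be retained while identifiers are removed.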

Crediting data generators for data and for data reuse 

We provide standards-based Digital Object Identifiers (DOIs) free of charge, both for each of the Cancer Imaging Archive’s data collections and to researchers who use customized data cohorts. These DOIs enhance research reproducibility and validation and encourage data submissions from academic researchers.

The DOIs are frequently used to reference data in peer-reviewed publications, support data-use tracking, and provide authorship citations for use in academic CVs. 
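To make the data-use tracking idea concrete, here is a minimal sketch that counts how many reference lists cite a given dataset DOI. The DOIs and reference strings are placeholders, the `count_dataset_citations` and `doi_url` helpers are hypothetical, and the regular expression is an assumed, commonly used pattern for modern DOIs rather than a complete validator.

```python
import re
from collections import Counter

# Assumed pattern for modern DOIs; it is permissive and not a full validator.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def doi_url(doi: str) -> str:
    """Build the resolver link for a DOI."""
    return "https://doi.org/" + doi


def count_dataset_citations(reference_lists, dataset_dois):
    """Count, per dataset DOI, how many reference lists mention it."""
    # DOIs are case-insensitive, so compare in lower case.
    wanted = {d.lower() for d in dataset_dois}
    counts = Counter()
    for references in reference_lists:
        # Trim trailing punctuation that the permissive pattern may capture.
        found = {m.rstrip(".,;").lower() for m in DOI_PATTERN.findall(references)}
        counts.update(found & wanted)
    return counts


# Placeholder reference text and DOIs, not real collections.
papers = [
    "Smith et al. Example Collection A. https://doi.org/10.1234/example.collection.a.",
    "Lee et al. 10.1234/EXAMPLE.COLLECTION.A; see also 10.1234/example.collection.b",
    "No dataset cited here.",
]
counts = count_dataset_citations(
    papers,
    ["10.1234/example.collection.a", "10.1234/example.collection.b"],
)
```

Each paper is counted at most once per dataset, so repeated mentions within one reference list do not inflate the tally.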

A resource for the global cancer imaging community 

The Cancer Imaging Archive has become a vital resource known throughout the global cancer imaging community, having collected data from over 112 institutions and having served over 1.1 million users from 224 countries and regions. On average, users download more than 2.5 petabytes of data annually.  

The archive is a data publisher and recommended repository for Nature, PLOS One, Medical Physics, Elsevier, and other leading journals and publishers, and over 1,900 peer-reviewed publications that leverage TCIA data have been indexed.

We provide regular updates on social networks and host a wide variety of TCIA-centric sessions during annual meetings of the Radiological Society of North America to stimulate interest and cross-fertilize ideas. We also publish a TCIA newsletter that is distributed to 8,000 recipients each month.

Imaging-proteogenomics research support 

The Cancer Imaging Archive supports a research community that seeks to connect cancer phenotypes to genotypes. To accomplish this, the archive hosts data sets that connect clinical images with patient genomic data and proteomic data.   

The archive is part of National Cancer Institute programs that are collecting medical and pathological images matched to proteomic, genomic, clinical, and pathological data.

We provide leadership, expertise, and imaging data support to National Institutes of Health program activities, including:

As we expand the Cancer Imaging Archive's offerings, we are also working to expand data-sharing capabilities in many of the initiatives on which we collaborate.

The archive has absorbed and joined images from both arms of the National Lung Screening Trial (NLST), from the American College of Radiology Imaging Network (ACRIN) and the Lung Screening Study group.

We are establishing first-of-their-kind enterprise clinical imaging de-identification and sharing systems, including digital pathology data sharing, within and between the APOLLO Network's collaborators. We are also participating in National Cancer Institute efforts to create a Cancer Research Data Commons infrastructure.

Our team also participated in the National Institutes of Health’s COVID-19 pandemic response by providing researchers with five SARS-CoV-2 datasets. 

Our capabilities and specializations

Supporting the imaging research community 

Our team ensures the research community has the tools and components to use the archive of medical images to its fullest. This includes adding labeled elements to imaging datasets, which scientists can use to develop automated image-analysis approaches.  

  • Design and implement analysis and annotation projects 

  • Promote best practices for sharing scientific data within the research community 

  • Support imaging data sharing in National Cancer Institute grant research networks 

Developing new technologies and methods 

While managing this publicly accessible resource, we support innovation to enhance the Cancer Imaging Archive and its uses for researchers.

  • Radiomics 

  • Image characterization 

  • Artificial intelligence and deep learning 

New collection

CODEX imaging of hepatocellular carcinoma

This high-dimensional data set enables studies of pathophysiological immune cell interactions in liver cancer.

Patient-derived xenograft of adenocarcinoma-pancreas
New collection

PDMR-Texture-Analysis

This collection provides imaging data from 175 mice that researchers can use to develop neural network algorithms.