Extreme-Scale Computing Project Aims to Advance Precision Oncology

Two government agencies and five national laboratories are collaborating to develop extremely high-performance computing capabilities that will analyze mountains of research and clinical data to improve scientific understanding of cancer, predict drug response, and improve treatments for patients.

The cross-agency collaboration includes the National Cancer Institute and the Department of Energy, along with five national labs—Argonne, Oak Ridge, Lawrence Livermore, Los Alamos, and Frederick—and is part of the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program.

The computing project, known as the Cancer Distributed Learning Environment (CANDLE), is creating the capability to use cancer data to build predictive models for drug response, provide better molecular understanding of disease growth, and support decisions on individualized treatment, all while furthering the Precision Medicine Initiative, National Strategic Computing Initiative, and the Beau Biden Cancer Moonshot℠.

CANDLE will be implemented as a widely accessible open-source computer environment that individual cancer researchers across the country will be able to install on their own systems. The easy-to-access program will encourage development of next-generation computer simulations that can be used to better understand biological processes in cancer and, ultimately, to predict which drugs would be most effective against which cancers.

“The early uses will likely be for studying the complexities of cancer, with potential for extending impact into support for precision oncology treatment decisions,” said Eric Stahlberg, Ph.D., director, Strategic and Data Science Initiatives in the Data Science and IT program at Frederick National Lab (FNL).

CANDLE is more than a platform for sharing. The system is programmed to “learn”: using an approach called deep learning, the software is designed to detect complex patterns in large data sets that may be invisible to researchers. The software will link this new information to known concepts, thereby extending the scientific understanding of processes involved in cancer.

Each cancer model’s parameters can be continually adjusted to bring the machine’s predicted responses closer to observed responses. The computer predictions will be validated by scientists to ensure accuracy. The ability to explore a broad range of models and data enables CANDLE to identify potential novel solutions and insights in unanticipated areas.
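
The idea of adjusting parameters until predictions approach observations can be illustrated with a deliberately tiny sketch. It is not CANDLE code: the one-feature linear model, the synthetic dose and response values, the learning rate, and every function name below are assumptions chosen only to show the principle, which deep learning applies to far larger models and data sets.

```python
# Toy illustration only: a one-feature linear model fitted by gradient descent.
# The data, learning rate, and all names are hypothetical; none come from CANDLE.

def predict(weight, bias, dose):
    """Predicted response for a given dose under the current parameters."""
    return weight * dose + bias

def mean_squared_error(weight, bias, doses, observed):
    """Average squared gap between predicted and observed responses."""
    errors = [(predict(weight, bias, d) - y) ** 2 for d, y in zip(doses, observed)]
    return sum(errors) / len(errors)

def fit(doses, observed, learning_rate=0.05, steps=2000):
    """Repeatedly nudge the parameters in the direction that reduces the error."""
    weight, bias = 0.0, 0.0
    n = len(doses)
    for _ in range(steps):
        residuals = [predict(weight, bias, d) - y for d, y in zip(doses, observed)]
        grad_w = sum(2 * r * d for r, d in zip(residuals, doses)) / n
        grad_b = sum(2 * r for r in residuals) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

# Synthetic observations generated from response = 1.0 - 0.2 * dose.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [1.0, 0.8, 0.6, 0.4, 0.2]

start_error = mean_squared_error(0.0, 0.0, doses, observed)
weight, bias = fit(doses, observed)
final_error = mean_squared_error(weight, bias, doses, observed)
print(f"error before: {start_error:.4f}, after: {final_error:.2e}")
```

Each pass through the loop measures how far the predictions are from the observations and moves the two parameters slightly to shrink that gap, which is the same feedback cycle the article describes at a much larger scale.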

“The research community has collected thousands of experiments with hundreds of thousands of data points characterizing tumors and their response to the drugs,” said Rick Stevens, an associate laboratory director at the DOE’s Argonne National Laboratory and professor of computer science at the University of Chicago. “By working with the national laboratories, the National Cancer Institute can now use the computing resources of the national labs to build scalable predictive models for the cancer problem.”

A significant milestone in the use of predictive models is evident in one of the largest precision medicine clinical trials in the nation, the NCI-Molecular Analysis for Therapy Choice (NCI-MATCH) led by ECOG-ACRIN (part of NCI’s National Clinical Trials Network). The trial incorporates expertly defined algorithms to help match the genetic mutations found in the tumors of individual patients with drugs available to target those mutations, and to do so accurately and rapidly.
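
In its simplest form, rule-based matching of this kind amounts to a lookup from a tumor's mutations to candidate drugs. The sketch below is an assumption-laden illustration, not the NCI-MATCH algorithm: the rule table, the placeholder mutation and drug names, and the function name are all hypothetical.

```python
# Hypothetical rule table with placeholder names only.
# These are not real genes, drugs, or NCI-MATCH rules.
TREATMENT_RULES = {
    "GENE_A_VARIANT_1": "DRUG_X",
    "GENE_B_VARIANT_2": "DRUG_Y",
}

def match_treatments(tumor_mutations):
    """Return (mutation, drug) pairs for every mutation that has a rule."""
    return [
        (mutation, TREATMENT_RULES[mutation])
        for mutation in sorted(tumor_mutations)
        if mutation in TREATMENT_RULES
    ]

# One mutation has a rule; the other does not and is skipped.
matches = match_treatments({"GENE_B_VARIANT_2", "GENE_C_VARIANT_9"})
print(matches)
```

A table like this can only return matches that experts have already written down, whereas a learned model of the kind CANDLE targets is meant to find patterns that no one has encoded in advance.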

The deep learning approach underlying CANDLE builds upon this important advance, enabling an even greater range of potential models and data to be incorporated into the development of predictive models. Deep learning has already enabled recent advances in identifying skin cancer, improving breast cancer diagnosis, and predicting mutation rates in prostate cancer, Stahlberg said.

CANDLE will also build upon emerging data resources being initiated by NCI and supported by the Frederick National Lab, such as the Genomic Data Commons, a genomic data repository at the University of Chicago that supports the NCI Cancer Moonshot.

CANDLE will ultimately enable scientists to analyze information from many sources to look for key cancer biomarkers (molecules in the bloodstream that indicate the presence of the disease) and other information that may predict treatment response, Stahlberg said. Scalability is an important aspect of the DOE design for CANDLE: the system is intended to accommodate the analysis of millions of clinical patient records, which can aid the development of databases of disease metastasis and recurrence.

The Frederick National Lab has installed and tested early versions of CANDLE. Planned incremental releases of the software will enable broad use of the technology and allow scientists and researchers to provide insight and feedback to guide future development of the software.

“The exciting potential behind the development of CANDLE is the emerging set of unique and creative ways in which the deep learning capability will be applied in and shared among the cancer research community,” Stahlberg said.

By Kaylee Towey, WHK Student Intern