Sequencing Facility uses cutting-edge technology to make the old new again

Just a few years ago, it was difficult to get any meaningful sequence data from formalin-fixed, paraffin-embedded (FFPE) patient samples.  

These samples are easy to make and store, so large numbers of them are available. But RNA and DNA degrade over time, and RNA degrades much more readily, making it difficult to produce meaningful data from old or poorly stored samples. Even FFPE samples stored under the best conditions are likely to become unusable for typical sequencing methods as they age.

Now, the Sequencing Facility, a dedicated service for the National Cancer Institute’s Center for Cancer Research headed by Bao Tran, has developed and shared a clear methodology and outlined quality control (QC) metrics for getting the most information out of such ubiquitous, highly degraded samples.  

Mining for meaningful data in available samples 

Two things came together that allowed the team to make these important developments. First, they had an opportunity to beta-test a new sequencing library preparation kit. Around the same time, NCI's Danielle Mercatante Carrick, Ph.D., approached them with interest in getting RNA sequence data from 67 FFPE samples (60 different samples, plus seven replicates) of ovarian cancer.  

Once they found a process that showed encouraging results, they processed the rest of the samples and provided usable information to the researcher. Later, the team published their methodology, along with a video, in JoVE (the Journal of Visualized Experiments), so others in the sequencing community can follow the same steps to produce data from such samples.

“Five or six years ago, we couldn’t even do anything with [degraded FFPE] samples because the technology and the protocols [were] not evolved enough. But since [then] … [the technology has improved enough so we’ve] been able to work with lower and lower inputs of DNA and RNA,” said Jyoti Shetty, the Sequencing Facility’s Illumina laboratory manager.  

Indeed, next-generation sequencing technology and modified protocols have enabled the Sequencing Facility to get meaningful information from samples that, in the past, would not have generated any information at all. And that’s good news, because DNA and RNA sequencing can provide genome-variation and gene-expression data that is essential for studying disease mechanisms.  

“Over the last few years, sequencing has really kind of exploded in the scientific community,” said Monika Mehta, the Sequencing Facility’s research and development (R&D) manager. She added that “the big advantage [of using FFPE samples for research] is the number of available samples. … The more patients’ tumors you look at, the [higher] the probability that you will identify the molecular mechanisms behind different cancers to identify new therapeutic targets or identify markers for quick diagnosis.” 

Work with what you have and adapt as you go 

From left to right: Monika Mehta, research and development manager; Jyoti Shetty, Illumina laboratory manager; and Yongmei Zhao, bioinformatics manager, from the Sequencing Facility. Image by Mary Ellen Hackett.

To develop their new methodology, the team altered their standard protocols to work for non-standard samples. While a standard methodology already existed, it had not been developed for use with such low-quality samples. Adapting it involved some trial and error, beginning with just a few samples from the group’s larger sample set.

“We kind of put things together. We developed our own metrics … at three different starting points: at the initial sample QC level, at the library QC level, and also … some data QC metrics,” said Shetty.

One major change was in the sequencing library preparation step. Instead of the typical method, they used a whole-transcriptome method that captures both coding and non-coding RNA, which makes it more likely that useful information will be detected.

“These are gene-expression studies, so all you need is a stretch of sequence that can unambiguously identify the transcript that they’re coming from. As long as you can generate that much sequence, you can identify which gene it came from, and that is good enough because you get the information you are looking for,” said Mehta.  

The project required extensive collaboration among all teams in the Sequencing Facility (R&D, Illumina sequencing, and bioinformatics), as every part of the sequencing process was tweaked to optimally sequence the degraded RNA samples. Fortunately, the team members are used to collaborating closely with one another, with other principal investigators, and even with sequencing cores outside Frederick National Laboratory.

“We call it a community effort and share the knowledge we have learned. And we also learn from each other in order to develop best practices in next-generation sequencing analysis,” said Yongmei Zhao, the Sequencing Facility’s bioinformatics manager.  

Innovating into the future 

The opportunity to beta-test a new kit and try something that hadn’t been done before at this scale has allowed the Sequencing Facility team to remain on the cutting edge of cancer research.  

“We thought it was an exciting opportunity … [for] many other investigators to expand their research. … This has opened the door to many more projects for our group,” said Shetty. 

And since the team took on this project and published its methodology paper, working with FFPE samples has become more routine. The team estimates that it has since processed hundreds of FFPE samples for DNA or RNA sequencing across a dozen or more projects.

But they haven’t stopped there. The group continues to improve sequencing and analysis methods and test new methods through the R&D team. In the future, Mehta hopes they can offer a method using Cas9, part of the CRISPR gene-editing tool, to target specific sequences. 

“We are trying to optimize that method for the platforms that we have available in the Sequencing Facility for long-read sequencing. It’s an ongoing project, but we are very hopeful … especially in cancer, you have a lot of gene rearrangements, and it’s not easy to detect them because it is expensive,” said Mehta. Targeting one specific part of the sequence could save investigators money that could be spent on sequencing more samples or put toward a different project.

Finding ways to support investigators is part of the group’s mission to enrich cancer research, so using the latest technologies and finding new applications for their capabilities is built into their workflow. 

“We never stop. We always … continue to evolve,” said Zhao.