Wistar Scientists Use Artificial Intelligence to Identify Viruses Related to Cancer
Some cancers are linked to viral infections. Studying viruses found in tumor cells can reveal information important to the development of more effective cancer treatments. Wistar researchers have developed a tool that uses artificial intelligence to study the expression of cancer-related viruses. In a recent paper published in Nature Communications, Noam Auslander, Ph.D., assistant professor, Molecular & Cellular Oncogenesis Program, Ellen and Ronald Caplan Cancer Center, and her lab describe viRNAtrap, a technology they created to identify viruses from human RNA sequences and rapidly characterize viruses expressed in tumors.
Wistar discussed viRNAtrap and its creation with Dr. Auslander to find out more about how this novel technology impacts research on cancer and other viral diseases.
Q: What inspired you to develop a new platform for analyzing viral expression linked to cancer? Is this a one-time study or part of a larger project?
A: I have always wanted to investigate viruses that cause cancer or correlate with cancer outcomes. As a trainee I worked in computational labs that studied cancer or viruses (but not both) and used different tools for these studies. In my lab I incorporate those tools, allowing the development of this framework. This is a major research direction in my lab, and we have follow-up projects that are looking into related questions.
Q: What is viRNAtrap? How did you and your team come up with this name?
A: My postdoc Dr. Abdurrahman Elbasir and I came up with the name. It combines vi- (for virus), RNA (for RNA sequences), and trap (because we “trap” viral RNA sequences that are difficult to identify).
Q: What can viRNAtrap do?
A: It’s software that identifies viruses from short RNA sequencing reads: it takes small fragments of the genome and assembles them into longer sequences of viruses that are expressed in a tissue.
Q: What were your methods in creating this framework? Were there any challenges that arose during the process?
A: As a postdoc I worked on AI software to identify viruses, but that platform was based on longer sequences coming from a different technology. The read length was and is a major bottleneck for viRNAtrap. Dr. Elbasir managed to train a deep learning model (a model built using neural networks) that can distinguish viral reads from human reads fairly well using reads as short as 48 bp. This model, and the proof of concept that it could be built, were critical for this research. Based on this model, we built the viRNAtrap framework, which identifies viral reads and assembles them into longer sequences (contigs) from which known and new viruses can be characterized.
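The two steps described in this answer, filtering reads that look viral and then assembling them into contigs, can be illustrated with a small Python sketch. This is not the viRNAtrap code: the function names, the 24-base overlap, the toy genome, and the toy_score oracle (which stands in for the trained neural network) are all assumptions made purely for illustration.

```python
import random
from typing import Callable, List

READ_LEN = 48   # shortest read length mentioned in the interview
OVERLAP = 24    # assumed overlap required to extend a contig (illustrative)


def keep_viral(reads: List[str], score: Callable[[str], float],
               threshold: float = 0.5) -> List[str]:
    """Keep reads whose viral score exceeds the threshold."""
    return [r for r in reads if score(r) > threshold]


def extend_right(seed: str, reads: List[str], k: int = OVERLAP,
                 max_len: int = 10_000) -> str:
    """Greedily grow a contig to the right, starting from a seed read."""
    contig = seed
    while len(contig) < max_len:
        tail = contig[-k:]
        extension = ""
        for r in reads:
            pos = r.find(tail)
            # A read helps only if it continues past the matched tail.
            if pos >= 0 and pos + k < len(r):
                extension = r[pos + k:]
                break
        if not extension:
            break  # no read extends the contig any further
        contig += extension
    return contig


# Toy data: a synthetic "viral" genome and overlapping 48 bp reads drawn from it.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(120))
viral_reads = [genome[i:i + READ_LEN]
               for i in range(0, len(genome) - READ_LEN + 1, 12)]
human_reads = ["ACGT" * 12, "TTAGGG" * 8]  # made-up stand-ins for host reads


def toy_score(read: str) -> float:
    """Hypothetical oracle replacing the trained model: 1.0 if the read is from the toy genome."""
    return 1.0 if read in genome else 0.0


candidates = keep_viral(human_reads + viral_reads, toy_score)
contig = extend_right(candidates[0], candidates)
print(len(candidates), contig == genome)
```

In a real pipeline the scoring function would be the trained deep learning model, and the assembly step would also need to extend contigs to the left and tolerate sequencing errors and mutations, which this sketch ignores.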
Q: How did you verify viRNAtrap works?
A: The model was validated and tested with an independent test dataset. The whole framework was verified using cases with known cancer viruses in The Cancer Genome Atlas (TCGA). We also had an experimental validation for one of the new viruses that we found in ovarian cancer, through a collaboration with Dr. Rugang Zhang’s lab, which verified that this virus is expressed in cell lines.
Q: Was there anything surprising that viRNAtrap detected?
A: There were a couple of very surprising viruses that viRNAtrap detected, including some plant and insect viruses that were found in tumor tissues. The most notable was an insect virus that we found in 25% of endometrial cancer samples. If this association is real and not due to some unidentified contamination of the TCGA samples, this could be a very important discovery.
Q: How can this tool be used in biomedical studies to help prevent/combat cancer and other diseases?
A: We all know that viruses are a major health concern, and that they contribute to many diseases. However, viruses are really difficult to study with current sequencing technologies because they evolve rapidly and accumulate many mutations. Using this tool, we can identify new viruses in disease tissues even if they are divergent and mutated. We can therefore find viruses that drive or modulate diseases, which can lead to new diagnostic, vaccination, and treatment strategies.