Wistar’s Dr. Avi Srivastava seamlessly integrates elements of computer science and traditional biology into his new computational biology research lab. By combining wet and dry lab approaches — experimental biology and computational data — he can be innovative in research derived from both worlds. Here’s how he does it.
What is computational biology?
Computational biology is different for different people. For me, the fine line between bioinformatics and computational biology is a live question, but we can think about computational methods intersecting with biology in two basic ways.
The first element is existing methods. These are open-source tools that can be downloaded from the internet and then applied to data to generate something useful. When scientists talk about “mining” datasets, they’re using tools like this on a particular dataset. This large area of research takes a good deal of resources, and many scientists interact with computational elements of research on this level.
Let’s define open source. Open-source software is freely accessible. An example might be an algorithm that can analyze RNA data in bulk to look for a particular pattern associated with something, maybe a disease biomarker. Scientists can simply download that algorithm, execute it on the dataset they’re interested in, and interpret the results. So, using these existing methods to answer questions about biological data is the first component of computational biology.
The second component would be the actual development of those tools. Someone has to develop them, right? And I relish developing new methods; that’s who I am. Every software method in the field of biology needs to be informed biologically, through experimentation. That’s how you make these tools better. It’s not just sitting in your room on your laptop coding for hours; it’s getting in the lab to understand how the biology behind the code works.
Once you understand the tools’ foundations and limitations, you can modify lab experiments and refine methods. The process reinforces itself through experimenting, collaborating, and refining. Computational biology loops from lab to code to lab, a virtuous circle that continues to improve because the field moves so fast.
How long has computational biology existed and when did it emerge as a field?
I think that computational biology grew out of computer science. Now, computer science has been around for ages; we can go all the way back to Turing machines, or even further. But I think that the Human Genome Project in the 1990s really opened a lot of scientists’ eyes to the power of combining computer science with biological research.
The Human Genome Project generated enormous datasets, and back then, the sequencing technologies weren’t advanced enough to sequence long segments of DNA. So, scientists began to ask themselves how they could chop up the human genome into segments of DNA for sequencing and then reassemble the human genome from those chunks. To do that, they turned to computational methods.
Think of it like file compression, when you email a picture and it loses some image quality: that’s what scientists did to the human genome, and I believe that’s the time computational biology came into its own. Since then, the field has matured and tech has improved, and our ability to “see” more of the genome has improved too, because we can process more data.
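The chop-and-reassemble idea described above can be illustrated with a toy example. The sketch below is not the method the Human Genome Project used; it is a minimal greedy overlap merger, written in Python, that assumes error-free reads from a single DNA strand with no repeated regions. The function names `overlap` and `greedy_assemble` and the short example sequence are hypothetical, chosen only for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that matches a prefix
    of `b`, or 0 if no match of at least `min_len` characters exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Assumption: reads are error-free and come from one strand. Real
    assemblers must also handle sequencing errors, repeats, and both strands.
    """
    # Remove exact duplicates and reads fully contained in another read.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]

    while len(reads) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps; leave the fragments separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three overlapping fragments of a made-up 15-letter sequence, out of order.
fragments = ["ACGTTAGC", "ATGGCGTA", "CGTACGTT"]
print(greedy_assemble(fragments))  # ['ATGGCGTACGTTAGC']
```

The greedy strategy is the simplest way to show how overlapping ends let short fragments be stitched back into a longer sequence; it can give wrong answers on repetitive sequences, which is one reason practical genome assembly is much harder than this sketch suggests.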
Software and code can change very quickly. How do you stay up-to-date on all the new developments in the field?
Staying current in the field is one of the million-dollar questions in computational biology, and I don’t know that anyone has cracked it. With software and open-source code, things do move fast, and scientists want to use the best, latest methods to answer their research questions.
In my experience, you have to orient your lab around reading papers efficiently. Rather than spending an hour on every paper and discussing it in depth, I follow a setup I learned from my previous boss: my lab discusses four papers in an hour when reviewing the literature. In general, we keep to 15 minutes per paper to get a broad sense of the method, what’s new, and what we can learn from it, and then we selectively discuss the relevant papers in depth. It’s not a perfect solution, but it helps you get a broader perspective on the way the field is growing.
The pace of the science makes computational biology exciting, in part because the changing tech is a challenge in its own right, and scientists like me love a good challenge.
What do you see as the Srivastava Lab’s role in such a dynamic research landscape?
Computational biology methods should be adjustable and easy to use, but I think the field needs better tools and better support for those tools. When I say “support,” I mean that if I download software and it doesn’t work except in one specific circumstance, then that tool has very limited use for the broad scientific community. It’s a big problem when some papers can’t even be replicated using the same methods because of a lack of support from the developers. With well-supported tools, researchers can utilize the method effectively, which is necessary for reproducing and verifying results.
Yes, we need to make sure programs and tools work properly, but providing support when building them also opens up diverse applications by letting scientists adapt these tools to their own research questions. When programs are tweaked and iterated upon, scientists can get creative and research flourishes, but that can only happen if those tools are built in a way that makes them easy to tweak.
I will support those kinds of innovative alterations both in the way I go about developing tools and by keeping the user in mind: providing tutorials, instructional PDFs, videos, and so on. That takes time, but if you’re invested in your methods, providing that support can make them even more impactful.
What excites you about the field and your work in it?
We talked about this computational biology “loop” — code feeds into the wet lab, which feeds into the code, and so on. I’m very excited to be at Wistar working on both sides of that loop; many scientists focus on one or the other, but my lab is focused on bringing both the lab and the code to the fore simultaneously.
That’s an exciting space to be in because we have so much room for interdisciplinary discovery and collaboration. I can work with computer scientists who want to learn more about biology, and I can work with biologists who want to learn more about coding. My lab is interested in the epigenome — how the genome is modified — because it’s important for so many different processes and disease states across cell types. By focusing on the computational and the biological, I think we have a tremendous opportunity to build tools that will give us a more detailed understanding of the epigenome’s complexities.