Research Projects

The ongoing studies of the Davuluri laboratory are summarized below.

Research Goal 1: Informatics platform for mammalian gene regulation at the isoform level

Recent advances in next-generation sequencing (NGS) have brought the field closer to the goal of studying gene expression and regulation at the isoform level by enabling massive-scale sequencing at much lower cost. NGS-based molecular technologies, such as ChIP-seq and RNA-seq, are enabling genome-wide identification of gene isoforms and of the target genes of transcription factors (TFs) in different cells, tissues and disease conditions. However, a number of new informatics challenges must be addressed to fulfill the promise of studying gene regulation and expression at the isoform level.

Our lab develops novel algorithms and statistically rigorous informatics platforms for processing NGS data, with the aim of understanding gene regulation at the isoform level (that is, at the level of alternative promoters or alternative transcripts). Using state-of-the-art pattern recognition and statistical inference methods, we have been developing accurate algorithms for predicting TF binding sites, Pol-II promoters, and transcriptional modules from ChIP-seq and ChIP-chip data, and for estimating isoform-level gene expression from RNA-seq data. These methodologies will be extended to develop novel algorithms for the integrative analysis of multiple NGS datasets currently being generated across different laboratories, including our own. These informatics methods and the accompanying user-friendly software will provide useful tools for better understanding gene regulatory mechanisms in mammalian cells at the isoform level and, more importantly, how dysregulation of these mechanisms leads to a variety of diseases.

Research Goal 2: Isoform-level gene regulatory networks in brain development and brain tumors

We have recently built a genome-wide inventory of non-coding and protein-coding transcripts and their variants (transcriptome), their promoters (promoterome) and histone modification states (epigenome) for the developing and adult mouse cerebellum, using an integrative approach that combines massively parallel sequencing with bioinformatics. The data comprise 61,525 (12,796 novel) distinct mRNAs transcribed from 29,589 (4,792 novel) promoters corresponding to 15,669 protein-coding and 7,624 non-coding genes. Importantly, we found that the transcript variants of a gene were predominantly generated by alternative transcriptional mechanisms rather than by splicing, highlighting alternative promoters and transcriptional termination as major sources of transcriptome diversity. Moreover, the majority of genes associated with neurological diseases expressed multiple transcripts through alternative promoters, and we demonstrated aberrant use of alternative promoters in medulloblastoma, a malignant brain cancer of the cerebellum. The transcriptome diversity of the developing and adult cerebellum emphasizes the importance of studying mammalian gene regulation and function at the isoform level rather than simply at the gene level. We hypothesize that isoform-level gene expression profiling will lead to significantly improved classification of molecular sub-types among apparently similar (patho)phenotypes, and to the identification of more specific gene regulatory modules and pathways that are disrupted in each disease sub-type.

Our goal in this study is to discover the interdependencies between molecular components at the gene-isoform level rather than at the gene level, and to identify the disruptions in isoform-level regulatory modules in gliomas, tumors of the brain, using a multi-disciplinary approach. We will (i) map the isoform-level transcriptome in glioma tumor samples by RNA-seq (in collaboration with Dr. Donald O'Rourke, Neurosurgeon, University of Pennsylvania), (ii) develop a knowledgebase of the mammalian brain transcriptome, and (iii) discover and validate the isoform-level gene regulatory network (GRN) modules and core pathways specific to the intrinsic molecular sub-types of heterogeneous gliomas by applying mathematical modeling tools.


Research Goal 3: Genomics-based informatics platform for personalized GBM therapy

Glioblastoma multiforme (GBM) is the most common malignant central nervous system tumor, accounting for nearly 80% of the 22,000 malignant brain tumors diagnosed annually in the United States. Patients diagnosed with GBM face a dismal prognosis; the average survival if left untreated is 3 months. Standard treatment increases median survival to 12 months, but the 2-year survival rate is less than 25%. A major challenge in the glioblastoma field remains predicting prognosis in these patients more effectively; such information is essential to devising an optimal treatment plan for this tumor. The hypothesis to be tested in this proposal is that isoform-level gene expression signatures, combined with microRNA, known-mutation and SNP profiles, will lead to significantly improved classification of the molecular sub-types of GBM. By identifying these expression, SNP and mutation signatures, we posit that we will also be able to delineate the specific gene regulatory modules and pathways that are critical in the etiology and progression of each GBM molecular sub-type.

We are currently refining and validating this multi-analyte molecular sub-type classification scheme in both retrospective and prospective cohorts of GBM patient samples, in collaboration with Dr. Donald O'Rourke, Associate Professor, Penn Neuro-oncology program. Our goal is to develop and test a multi-analyte Computerized Clinical Decision Support System, based upon data from the molecular signatures, for personalized brain tumor therapy.

Together, these synergistic research projects will not only open new ways to understand the molecular mechanisms involved in the pathogenesis of human diseases, but will also advance the effort to develop genomics-based clinical decision support systems for personalized medicine. Understanding the isoform-specific gene regulatory networks in stratified patient populations is central if we are to develop rational and mechanistic therapies for complex genetic diseases that arise from the accumulated contributions of many gene–gene interactions.