Biostatistics

The Biostatistics faculty include: Paul Auer, Youngjoo Cho, Chiang-Ching (Spencer) Huang, and Cheng Zheng. The Biostatistics program includes the Laboratory for Public Health Informatics and Genomics, for which Peter Tonellato is PI. Learn more about featured research projects by the Biostatistics faculty:

Detection of Colorectal Cancer Susceptibility Loci Using Genome-Wide Sequencing

Ulrike Peters, Li Hsu, Debbie Nickerson, Suzanne Leal, Goncalo Abecasis, Paul Auer

This multidisciplinary project aims to investigate whether different types of genetic variants, including rare and structural variants, influence colorectal cancer risk in humans. Specifically, we will examine variants across the entire genomes of colorectal cancer cases and controls to identify new genetic risk factors for colorectal cancer, and investigate whether known environmental risk factors for colorectal cancer modify genetic susceptibility to this disease.

Dynamic molecular network of immune system in cardiovascular diseases

Chiang-Ching Huang, Taura Bar, Reyna VanGilder

Atherosclerosis is the main cause of cardiovascular disease (CVD), the number one cause of death in the world. Increasing evidence shows that both the innate and adaptive immune systems tightly regulate atherogenesis. Several immune molecules have been suggested to play a critical role in the inflammatory process of atherosclerosis. However, the fundamental knowledge of dynamic immune regulation in atherosclerosis is far from complete. This project addresses this gap in knowledge by investigating the transcriptional network structure of two major immune pathways, toll-like receptor signaling (innate) and T-cell receptor signaling (adaptive), in atherosclerosis, myocardial infarction (MI), and ischemic stroke (IS). A parallel comparison of transcriptional patterns across these physiopathological conditions will shed light on how these two immune systems interact to influence disease progression and will help identify patients at a higher risk for developing MI or IS.

Metabolomics risk score for near-term CVD events in individuals with PAD

Chiang-Ching Huang, Mary McDermott, Kiang Liu, Jane Tseng

Compared to individuals without peripheral arterial disease (PAD), those with PAD have a nearly two-fold increased risk of all-cause mortality and a two- to three-fold increased rate of acute coronary syndrome (ACS), even after adjusting for cardiovascular disease (CVD) risk factors and comorbidities. To date, there is no robust classification system to discriminate high-risk (e.g., PAD) patients who are more likely to suffer near-term mortality or ACS events from those who are less likely. Since established risk factors discriminate near-term risk poorly, identifying novel pathways that may signal near-term ACS events is expected to improve our discrimination ability and understanding of the pathogenesis of ACS events. The objective of this project is to develop a multi-metabolite classification system for near-term ACS events in patients with PAD. This study will use highly sensitive metabolomic and lipidomic techniques to systematically identify metabolic pathways and metabolites associated with near-term ACS events.

The NHLBI Exome Sequencing Project

The ESP Consortium (Paul Auer, contributing member)

The goal of the Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein coding regions of the human genome across diverse, richly-phenotyped populations and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management and treatment of heart, lung and blood disorders.

Robust taxonomic development using 16S rRNA pyrosequencing fragments

Charles J Murphy, Ryan Newton, Sandra McLellan, Peter J. Tonellato

Next-generation sequencing technology, such as pyrosequencing, can generate large sequence datasets to estimate bacterial communities in biological samples. Pyrosequencing often uses specific genomic regions, such as the 16S rRNA gene, as stable taxonomic markers. The primary analysis is to estimate bacterial communities in pooled biological samples, but it is complicated by variable-length sequence reads. These pose the technical problem of correlating taxonomies between older technology data (shorter sequence reads) and newer technology data (longer sequence reads), where the longer and shorter sequences have overlapping regions. Methods to correlate bacterial communities between longer and shorter sequences are under active development. Presented here is the Hybrid Analysis (HA), which estimates bacterial communities in pooled samples containing variable-length sequencing fragments. Initial tests of the HA algorithm are promising; further testing is required.