The Biostatistics/Bioinformatics program at Boz uses either publicly accessible data or an original experimental dataset to introduce students to experimental design, biostatistics fundamentals, management of a large gene expression dataset, and basic bioinformatics concepts. The experimental data come from environmental assays performed on developing invertebrate embryos and from RNA-Seq gene expression data generated by an animal physiology/aging project. Fundamental biostatistics concepts covered during the first phase include experimental design, biological and technical replicates, statistical power, the central limit theorem, Bayes’ theorem, one-way analysis of variance (ANOVA), and pairwise t-tests. During the intermediate phase, students focus on the differential gene expression workflow, data normalization, and principal component analysis (PCA), and are introduced to bioinformatics databases and tools (GenBank, UCSC Genome Browser, DAVID, STRING) through RNA-Seq analysis (sequence alignment, gene finding, and differential gene expression analysis). During the final (third) phase, the team uses the Institute’s experimental and/or publicly available data to test hypotheses, determine trends, visualize data, and prepare and submit a manuscript for publication in a peer-reviewed scientific journal.

For more about this project, contact:


Flannery McLamb

Goran Bozinovic


Steven R. Head, Ph.D., The Scripps Research Institute – DNA Array Core and Next Generation Sequencing Core

W. Kelley Thomas, Ph.D., University of New Hampshire – Hubbard Center for Genome Studies