Biostatistics & Bioinformatics

The biostatistics/bioinformatics program at Boz uses either publicly accessible data or an original experimental dataset to introduce students to experimental design, biostatistics fundamentals, management of a large gene expression dataset, and basic bioinformatics concepts. Experimental data come from environmental assays performed on developing invertebrate embryos and from RNA sequencing (RNA-seq) gene expression data from the animal physiology/aging project. Fundamental biostatistics concepts covered during the first phase of the program include experimental design, biological and technical replicates, statistical power, the central limit theorem, Bayes’ theorem, one-way analysis of variance (ANOVA), and pairwise t-tests. During the intermediate phase, students focus on the differential gene expression workflow, data normalization, principal component analysis (PCA), and an introduction to bioinformatics databases and tools (GenBank, UCSC Genome Browser, DAVID, STRING) through RNA-seq analysis (sequence alignment, gene finding, and differential gene expression analysis). During the third and final phase, the team uses the Institute’s experimental and/or publicly available data to test hypotheses, identify trends, visualize data, and prepare and submit a manuscript for publication in a peer-reviewed scientific journal.
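As an illustration of the kind of first-phase exercise described above, the sketch below computes a one-way ANOVA F statistic by hand in Python. It is not taken from the program's materials: the function name `one_way_anova_f`, the group labels, and the measurement values are all hypothetical, chosen only to show how between-group and within-group variation are compared.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict mapping group name -> list of values.

    Hypothetical helper for illustration; assumes independent samples.
    """
    samples = list(groups.values())
    k = len(samples)                        # number of groups
    n_total = sum(len(s) for s in samples)  # total number of observations
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = mean([x for s in samples for x in s])
    group_means = [mean(s) for s in samples]

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, group_means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, group_means) for x in s)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up measurements for three hypothetical treatment groups (not real data).
example_groups = {
    "control": [10.1, 9.8, 10.4, 10.0],
    "low_dose": [9.5, 9.9, 9.2, 9.6],
    "high_dose": [8.7, 9.0, 8.4, 8.9],
}

f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value suggests that the group means differ by more than the within-group scatter would explain. In practice the statistic would be compared against an F distribution to obtain a p-value, and a significant result would typically be followed by pairwise comparisons with a correction for multiple testing.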

*For more about this project, contact:


Flannery McLamb

Goran Bozinovic


Steven R. Head, Ph.D., The Scripps Research Institute – DNA Array Core and Next Generation Sequencing Core