Statistics is a mathematical science comprising methods for collecting, organizing, and analyzing data so that meaningful conclusions can be drawn from them. The data are presented in tables and graphs, and their characteristics are described in simple terms.

Introduction: This course is for researchers who routinely work with data and need to analyze and visualize it in a statistics package because Excel does not give them that capability. It covers basic statistics and data plotting with the statistical programming package R. Attendees will learn how to implement summary statistics, hypothesis tests, and statistical techniques such as regression, PCA, t-tests, and ANOVA. Automated biological data analysis, algorithm development, and workflows will also be covered.

Why learn Statistics? Statistics is the language for summarizing and reporting on data. It offers a robust framework for giving guidance and advice based on the analysis of experimental data.

Logistics: This online course is divided into 8 sessions with pre-recorded videos, handouts, reference cards, examples, data, scripts, and quizzes. Enrollees can contact the instructor with questions and get help on the projects. The main topics are listed below, but the course covers nearly everything there is to know about using statistics for bioinformatics. Homework assignments will involve running commands learned in the live lectures.

Pre-requisites: Math skills will be useful in understanding statistics.

Price: \$2400 for Commercial/Government enrollees and \$1200 for Academic researchers and students. Both levels can be bought independently for half this price.

Instructor: Shailender Nagpal

Syllabus:

• The R platform for statistical analysis
• Descriptive statistics for summarizing vector and matrix data, such as mean, median, quantiles, and standard deviation
• Visualization of data to understand and interpret the underlying distributions
• Hypothesis testing using parametric and non-parametric tests: t-tests, Wilcoxon tests, rank tests, etc.
• Additional techniques such as F-tests, ANOVA and permutation tests, correlation, regression, t-test, odds ratio, specificity, sensitivity. Project involving gene expression data