Statistics is a mathematical science including methods of collecting, organizing and analyzing data in such a way that meaningful conclusions can be drawn from them. The data are presented in the form of tables and graphs. The characteristics of the data are described in simple terms.
Introduction: This course is for researchers who routinely work with data and need to analyze and visualize it in a statistics package (when Excel does not provide that capability). Basic statistics and plotting of data with the statistical programming package R are covered. Attendees will learn how to implement summary statistics, hypothesis tests and statistical techniques such as regression, PCA, t-tests and ANOVA. Automated biological data analysis, development of algorithms, and workflows will also be covered.
Why learn Statistics? Statistics is the language for summarizing and reporting on data. It provides a robust framework for giving guidance and advice based on the analysis of experimental data.
Logistics: This online course is divided into 8 sessions with pre-recorded videos, handouts, reference cards, examples, data, scripts and quizzes. Enrollees can contact the instructor with questions and get help on the projects. The main topics are listed below, but we teach nearly everything there is to know about using statistics for Bioinformatics. Homework assignments will involve running commands learned in the live lectures.
Pre-requisites: Math skills will be useful in understanding statistics.
Price: $2400 for Commercial/Government enrollees and $1200 for Academic researchers and students. Both levels can be bought independently for half this price.
Instructor: Shailender Nagpal
- The R platform for statistical analysis
- Descriptive statistics for summarizing vector and matrix data, such as mean, median, quantiles and standard deviation
- Visualization of data to understand and interpret the underlying distributions
- Hypothesis testing using parametric and non-parametric tests: t-tests, Wilcoxon tests, rank tests, etc.
- Additional techniques such as F-tests, ANOVA, permutation tests, correlation, regression, odds ratio, specificity and sensitivity
- Project involving gene expression data
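To give a feel for two of the syllabus topics, here is a minimal sketch of descriptive statistics and a permutation test. It is written in Python with only the standard library, which is an assumption made for illustration: the course itself is taught in R. The `control` and `treated` values are hypothetical, and `describe` and `permutation_test` are illustrative helper names, not course material.

```python
import random
import statistics


def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical expression measurements for two small groups.
control = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
treated = [6.0, 6.2, 5.9, 6.1, 6.3, 5.8]

summary = describe(control)
p_value = permutation_test(control, treated)
print(summary)
print(p_value)
```

A permutation test is used here because it needs no distribution tables, so the whole example runs without third-party packages. In R, the equivalent summaries come from built-in functions such as `mean`, `median`, `quantile` and `sd`.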
This course is for researchers who work with multivariate data and need to analyze it using Machine Learning algorithms. Attendees will learn to identify variables that explain outcomes in an experiment.
Why study Machine Learning? Machine Learning techniques rapidly answer questions about data, and can be used as predictors for new datasets.
Logistics: The course is divided into 4 sessions with pre-recorded videos, handouts, reference cards, examples, data, scripts and quizzes. Enrollees can contact the instructor with questions and get help on the projects. The main topics are listed below, but we teach nearly everything there is to know about using Machine Learning for Bioinformatics. Homework assignments will involve running commands learned in the live lectures. Math and statistical skills will be useful in understanding the content.
Price: $1200 for Commercial/Government enrollees and $900 for Academic researchers and students. Both levels can be bought independently for half this price. Discounts are available for multiple attendees from the same organization or for individuals taking multiple courses.
Syllabus:
- Introduction to the concept of Machine Learning
- Python implementations
- Support vector machines, k-nearest neighbors, k-means
- Decision trees, Naive Bayes and regression
- Choosing a machine learning algorithm
- Data collection, feature identification, training, bootstrapping and validation
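As an illustration of the training and validation steps listed in the syllabus, the following is a minimal sketch of a k-nearest neighbors classifier in plain Python. It is not taken from the course material. The data points, the labels "low" and "high", and the helper names `knn_predict` and `accuracy` are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbors.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, held_out, k=3):
    """Fraction of held-out samples whose predicted label matches the true one."""
    correct = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(held_out)


# Hypothetical two-feature samples, split into training and validation sets.
train = [
    ((1.0, 1.2), "low"), ((0.8, 1.0), "low"), ((1.1, 0.9), "low"),
    ((3.0, 3.2), "high"), ((3.1, 2.9), "high"), ((2.8, 3.0), "high"),
]
held_out = [((0.9, 1.1), "low"), ((3.2, 3.1), "high")]

print(knn_predict(train, (0.9, 1.1)))
print(accuracy(train, held_out))
```

Keeping the validation samples out of `train` is the point of the split: accuracy measured on data the classifier has already seen would overstate how well it predicts new datasets.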
Price: For scheduled live online courses, the fees are $1,000 (Commercial/Government enrollees) and $600 (Academic researchers and students). To take this as a self-paced training course, the price is $600 and $400, respectively. To register, please contact us or go to our affiliate website, Bioinformatician.net
Discounts: Deep discounts are available for multiple attendees from the same organization or for individuals taking multiple courses. Contact us for details.
Introduction: Probability distributions are mathematical models to depict the spread of data using parameters. Some well known ones include the Binomial, Normal, Poisson and Beta distributions. Modeling data from these distributions allows us to uncover the parameters and use some assumptions to perform statistical hypothesis tests. Essential descriptive statistics are reviewed and then used in various situations to calculate background, noise, normalization and thresholding. Data visualization using various graphs will also be reviewed. Armed with these techniques, students will be able to better deal with the challenges of data analysis and understand data at a more fundamental level.
Why learn about Distributions? Students will be able to better deal with the challenges of data analysis and understand data at a more fundamental level.
Logistics: This online course will be conducted in R, a free and open-source package for statistical computing that has become an essential part of Bioinformatics. This course focuses on the mathematical principles of statistical analysis and not on the syntax and functionality of R. The course is divided into 4 sessions. No experience is required, although prior experience with a programming language will be helpful. Math skills will also be useful.
Price: $1200 for Commercial/Government enrollees and $900 for Academic researchers and students.
Instructor: Shailender Nagpal
- Probability distributions and how to work with them
- Descriptive statistics for summarizing vector and matrix data
- Student's t-tests and Wilcoxon tests for analyzing one- and two-sample data
- Use of the techniques covered in the previous sessions to do a biological data analysis project
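The idea of fitting a distribution and then using its parameters for thresholding can be sketched as follows. The example uses Python's standard library rather than R, which is an assumption made for illustration. The `background` readings, the three-standard-deviation cutoff and the helper `binomial_pmf` are hypothetical choices, not part of the course.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical background readings from an instrument.
background = [10.2, 9.8, 10.5, 9.9, 10.1, 10.0, 9.7, 10.3]

# Estimate the Normal parameters (mean, standard deviation) from the sample.
model = NormalDist.from_samples(background)

# Assumed rule of thumb: call a reading "signal" if it exceeds mean + 3 sd.
threshold = model.mean + 3 * model.stdev

# Under the fitted model, the chance that pure background exceeds the threshold.
tail = 1 - model.cdf(threshold)

print(model.mean, model.stdev)
print(threshold, tail)
print(binomial_pmf(2, 4, 0.5))
```

Once the parameters are estimated, questions such as "how often would noise alone cross this cutoff?" reduce to evaluating the fitted distribution, which is the reasoning behind the hypothesis tests covered in the sessions.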