This is an introductory course on cluster computing and computing in the cloud. It introduces LSF and the Amazon EC2 cloud.

Introduction: This course teaches biologists how to implement High Performance Computing (HPC) strategies for data management and analysis tasks in biological research, and how to resolve difficult big-data problems. The data could be DNA or amino acid sequences, microarray or NGS data, images, mass spectrometry data, text articles or any other kind of biological information.

Why learn HPC? HPC techniques are increasingly relied upon for work with large datasets, both to reduce computation time and to make analysis feasible at all. With datasets of ever-increasing size, people without HPC skills will be left stranded in a sea of data with no way out.

Logistics: This online HPC course is divided into four sessions with slides, pre-recorded videos, reference cards, examples, data, scripts and quizzes. Enrollees can contact the instructor with questions and get help on the projects. The main topics are listed below. Homework assignments will involve running commands learned in the live lectures.

Pre-requisites: Basic Linux and shell scripting knowledge is required to take this course.

Price: $1200 for Commercial/Government enrollees and $600 for Academic researchers and students.


  • Basic architectures and methods for implementing HPC on powerful servers, clusters and the cloud
  • Multi-threaded applications for sequence search and read alignment
  • Usage of multiple cores simultaneously through Linux processes
  • Cloud computing and how to use the Amazon web service to request resources (instances) at will, add datasets and run applications
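
As a taste of the "multiple cores through processes" topic above, here is a minimal sketch in Python. It is not taken from the course materials. The function names (gc_fraction, gc_fractions) and the toy sequences are hypothetical, chosen only to illustrate the idea: an independent per-sequence calculation is handed to a pool of worker processes so that several cores can work at once.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in one DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_fractions(seqs, workers=None):
    """Compute gc_fraction for every sequence, optionally across processes.

    workers=1 runs serially in the current process; any other value
    hands the sequences to a pool of worker processes.
    """
    if workers == 1:
        return [gc_fraction(s) for s in seqs]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # chunksize batches sequences so each task is not a single tiny string.
        return list(pool.map(gc_fraction, seqs, chunksize=64))


if __name__ == "__main__":
    # Toy, made-up sequences for illustration only (hypothetical data).
    reads = ["ACGT", "GGCC", "ATAT", "GCGA"] * 1000
    results = gc_fractions(reads, workers=os.cpu_count())
    print(len(results), results[:4])
```

For a calculation this cheap, the cost of starting processes and moving data between them will likely outweigh any gain. The pattern tends to pay off when each item takes substantial time, such as aligning a read or searching a sequence database.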