Helen Hipperson, Nicola Nadeau & Alison Wright

Department of Animal and Plant Sciences, University of Sheffield

This practical is part of the module Advanced Data Analysis - Introduction to NGS data analysis. The aim is to learn how to call single nucleotide polymorphisms (SNPs) and genotypes, that is, to identify variable sites and determine the genotype of each individual at each site. We will use a dataset of whole-genome sequence data from 32 individuals of Heliconius melpomene. After calling SNPs, we will subset and filter the calls and carry out a few example analyses.

Originally written by Victor Soria-Carrasco

Table of contents

  1. Initial set up
  2. SNP and genotype calling with BCFtools
  3. VCF and BCF format
  4. SNP and genotype calling with GATK
  5. Operations with BCF files
  6. Practical application: Population structure with NGSadmix
  7. Practical application: PCA of genotypes with R

Extras:


Resources

References