Basics
Recommended Prerequisite Knowledge
Suggested Background Reading
Readers who are new to human genetics, population genetics, genomics, or statistics may benefit from reviewing some introductory materials before starting GWASTutorial.
Human genetics and genomics
- Thompson & Thompson Genetics and Genomics in Medicine
  A standard textbook for human genetics and genomics. It is a good choice for readers who need a structured introduction to human genetic variation, disease genetics, and modern genomics.
- Human Molecular Genetics (Strachan et al.)
  A more advanced textbook that provides deeper coverage of genes, genomes, genetic variation, and molecular mechanisms. It is best suited for readers who already know the basics and want a stronger conceptual foundation.
Population genetics
- A Primer of Population Genetics and Genomics (Daniel L. Hartl)
  A concise and accessible introduction to population genetics, covering core concepts such as allele frequencies, Hardy–Weinberg equilibrium, genetic drift, selection, and population structure. It is especially useful for learners preparing for GWAS, where understanding allele frequency, LD, and population stratification is essential.
- EPI 511, Advanced Population and Medical Genetics (Alkes Price, Harvard School of Public Health)
  A graduate-level course with lectures and materials on population genetics, statistical genetics, and related topics in human disease genetics.
Statistics
- Penn State STAT 501 course notes
  A free and practical introduction to regression and statistical inference. For many learners, this is sufficient background before beginning GWAS-focused study.
- StatQuest
  A beginner-friendly video resource that explains core statistical concepts with clear visual and intuitive examples. It is particularly useful for readers who need a more accessible introduction to topics that frequently appear in GWAS, such as linear regression, logistic regression, p-values, multiple testing, and principal component analysis. StatQuest is not a substitute for formal statistical training, but it is an excellent supplementary resource for building intuition before starting GWAS analysis.
Minimal Topics to Know Before Learning GWAS
Before starting GWASTutorial, readers should be familiar with the following core topics.
1. Molecular and genome basics
Basic concepts in molecular biology and genome organization, including:
- DNA
- RNA
- gene
- genome
- chromosome
- autosome
- sex chromosome
- X chromosome
- Y chromosome
- mitochondrial DNA
- genomic region
- locus
- genomic position
2. Genetic variation
Basic forms of genetic variation and how they are described, including:
- variant
- polymorphism
- SNP
- SNV
- indel
- structural variant
- mutation
- allele
- reference allele
- alternate allele
- rsID
3. Genotypes and inheritance
Basic concepts of genotype structure and inheritance, including:
- genotype
- haplotype
- diploid
- haploid
- homozygous
- heterozygous
- hemizygous
- recombination
- linkage
4. Traits and population genetics
Basic concepts for understanding how traits and variants behave in populations, including:
- phenotype
- trait
- binary trait
- quantitative trait
- allele frequency
- genotype frequency
- Hardy–Weinberg equilibrium (HWE)
- heritability
- penetrance
- ancestry
- population stratification
- relatedness
- principal component analysis (PCA)
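As a small illustration of how allele frequency, genotype frequency, and HWE relate to one another, the following Python sketch computes allele frequencies from genotype counts at a biallelic variant and compares the observed counts with those expected under HWE. The function name `hwe_summary` and the example counts are hypothetical and chosen only for illustration. The chi-square statistic is used here for simplicity; GWAS software often uses an exact test instead.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Summarize a biallelic variant from its three genotype counts.

    n_aa, n_ab, n_bb: counts of A/A, A/B, and B/B individuals.
    Returns (freq_a, freq_b, expected_counts, chi_square).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Each individual carries two alleles, so there are 2n alleles in total.
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    freq_b = 1.0 - freq_a

    # Under HWE, genotype frequencies are p^2, 2pq, and q^2.
    expected = (freq_a ** 2 * n, 2 * freq_a * freq_b * n, freq_b ** 2 * n)
    observed = (n_aa, n_ab, n_bb)

    # Pearson chi-square statistic; cells with zero expected count are skipped.
    chi_square = sum(
        (o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0
    )
    return freq_a, freq_b, expected, chi_square


# Hypothetical counts, for illustration only.
freq_a, freq_b, expected, chi_square = hwe_summary(360, 480, 160)
print(f"freq(A) = {freq_a:.3f}, freq(B) = {freq_b:.3f}")
print("expected counts:", [round(e, 1) for e in expected])
print(f"chi-square = {chi_square:.4f}")
```

In this example the observed counts match the HWE expectation, so the statistic is essentially zero. A large value would indicate a departure from HWE, which in GWAS quality control is often treated as a sign of possible genotyping error.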
5. Statistics and association analysis
Basic statistical concepts commonly used in GWAS, including:
- mean
- variance
- standard deviation
- probability distribution
- correlation
- linear regression
- logistic regression
- effect size
- standard error
- confidence interval
- hypothesis testing
- p-value
- multiple testing correction
For current standard GWAS and post-GWAS analysis, it is also helpful to have basic familiarity with:
- linear mixed models (LMM)
- generalized linear mixed models (GLMM)
- random effects
- variance components
- Bayesian statistics
- posterior probability
- credible sets
- fine-mapping concepts
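To make multiple testing correction, one of the basic statistical concepts in this section, more concrete, here is a minimal Python sketch of the Bonferroni correction. The function names and the example p-values are hypothetical. The sketch assumes a simple family-wise correction and does not cover other approaches such as false discovery rate control.

```python
def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# With about one million independent tests, alpha = 0.05 gives the
# conventional genome-wide significance threshold of 5e-8.
print(bonferroni_threshold(0.05, 1_000_000))

# Hypothetical p-values, for illustration only.
print(bonferroni_significant([0.001, 0.02, 0.04, 0.3]))
```

With four tests the corrected threshold is 0.05 / 4 = 0.0125, so only the first p-value in the example passes.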
6. Genotyping and sequencing technologies
Basic understanding of how genetic data are generated, including:
- SNP array
- genotyping
- imputation
- next-generation sequencing (NGS)
- whole-genome sequencing (WGS)
- whole-exome sequencing (WES)
7. Study design and data handling
Basic knowledge of how GWAS data are organized and analyzed, including:
- sample
- variant-level data
- covariate
- case-control study
- quantitative trait study
- quality control (QC)
- missingness
- minor allele frequency (MAF)
- genome build
- reference genome
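Two of the QC quantities listed here, missingness and minor allele frequency (MAF), can be computed directly from genotype calls. The Python sketch below assumes genotypes are coded as the number of alternate alleles (0, 1, or 2), with `None` marking a missing call. This coding, the function name `variant_qc`, and the example data are assumptions made for illustration, not the format of any particular tool.

```python
def variant_qc(genotypes):
    """Compute the missing rate and MAF for one variant.

    genotypes: list of alternate-allele counts (0, 1, or 2) per sample,
               with None for a missing call (assumed coding).
    Returns (missing_rate, maf); maf is None if no genotype was called.
    """
    n_samples = len(genotypes)
    if n_samples == 0:
        raise ValueError("at least one sample is required")

    called = [g for g in genotypes if g is not None]
    missing_rate = (n_samples - len(called)) / n_samples
    if not called:
        return missing_rate, None

    # Alternate allele frequency among called genotypes (two alleles each).
    alt_freq = sum(called) / (2 * len(called))

    # The minor allele is whichever allele is less common.
    maf = min(alt_freq, 1.0 - alt_freq)
    return missing_rate, maf


# Hypothetical genotype calls for ten samples, for illustration only.
missing_rate, maf = variant_qc([0, 1, 2, None, 1, 0, 0, 1, None, 0])
print(f"missing rate = {missing_rate:.2f}, MAF = {maf:.4f}")
```

Variants with a high missing rate or a very low MAF are commonly filtered out during QC; the appropriate cutoffs depend on the study.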
8. Genomic references and historical resources
Basic awareness of major reference resources and related concepts used in human genetics, including:
- HapMap
- 1000 Genomes Project (1KG)
- linkage disequilibrium (LD)
- reference panel
UMAP or other low-dimensional visualization methods may be useful in some contexts, but they are not essential prerequisite knowledge for learning standard GWAS.