PGEN

Definition
AI-generated

PGEN (`.pgen`) is the primary binary genotype format of PLINK 2. A `.pgen` file is accompanied by a `.pvar` file holding variant metadata and a `.psam` file holding sample metadata.

Why it matters in GWAS

Many current array and imputed-data pipelines use PGEN because it scales to large sample sizes and can store imputed dosages, so dosage-aware analysis does not require keeping everything in text-based VCF. Knowing what each companion file holds makes conversion, QC, and downstream association testing less error-prone.
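
One common source of errors is a fileset with a missing companion file. The following is a minimal sketch, not part of PLINK 2 itself, that checks whether all three files exist for a given prefix. The helper name `missing_pfile_parts` and the `PFILE_ROLES` mapping are hypothetical, and the sketch assumes uncompressed files that share one prefix.

```python
from pathlib import Path

# Extensions of a PLINK 2 fileset and what each file holds.
PFILE_ROLES = {
    ".pgen": "binary genotype data",
    ".pvar": "variant metadata",
    ".psam": "sample metadata",
}


def missing_pfile_parts(prefix):
    """Return the fileset paths under `prefix` that do not exist on disk.

    Hypothetical helper for illustration. It assumes all three files share
    one prefix and does not look for compressed variant files.
    """
    missing = []
    for ext in PFILE_ROLES:
        # Concatenate instead of using Path.with_suffix so that prefixes
        # containing dots (e.g. "chr1.imputed") are kept intact.
        path = Path(f"{prefix}{ext}")
        if not path.exists():
            missing.append(path)
    return missing
```

Running such a check before conversion or association testing would report, for example, a `.pvar` file that was left behind when a fileset was copied.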

Example usage

"Imputed dosages were imported to PGEN and analyzed with plink2 --glm after sample and variant QC."
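
The workflow in that sentence can be sketched as two `plink2` invocations: one that imports a VCF with dosages into a PGEN fileset, and one that runs `--glm` on it. The snippet below only builds the command lines and does not run them. All file names and prefixes are placeholders, the `DS` dosage field is an assumption about how the imputed VCF was written, and the QC step is omitted because its thresholds are study-specific.

```python
import shlex


def import_cmd(vcf, out_prefix, dosage_field="DS"):
    """Build a plink2 command that imports a VCF with dosages to PGEN.

    `dosage_field` is assumed to be the FORMAT field carrying dosages.
    """
    return ["plink2", "--vcf", vcf, f"dosage={dosage_field}",
            "--make-pgen", "--out", out_prefix]


def glm_cmd(pfile_prefix, pheno, covar, out_prefix):
    """Build a plink2 --glm command on an existing PGEN fileset."""
    return ["plink2", "--pfile", pfile_prefix,
            "--pheno", pheno, "--covar", covar,
            "--glm", "hide-covar", "--out", out_prefix]


# Placeholder file names, for illustration only.
commands = [
    import_cmd("imputed.vcf.gz", "cohort"),
    glm_cmd("cohort", "pheno.txt", "covar.txt", "cohort_assoc"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Building the commands as argument lists keeps them easy to pass to `subprocess.run` later, and printing them first gives a record of exactly what would be executed.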

References

  • Chang CC, et al. (2015). Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. https://doi.org/10.1186/s13742-015-0047-8
  • PLINK 2 file format reference: https://www.cog-genomics.org/plink/2.0/formats#pgen
