
GenomicsDB

Definition
AI-generated

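GenomicsDB is a storage library and on-disk format, built on the TileDB array engine, that holds many per-sample gVCFs (or equivalent chunked variant data) as a sparse array indexed by sample and genomic position, so that large cohorts can be jointly genotyped at scale.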

Why it matters in GWAS

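Large WGS studies that report GATK-style joint calling often cite GenomicsDB as the backing store for thousands of samples. Recognizing it clarifies how cohort call sets are assembled. It also explains why disk layout and import intervals matter for reproducibility: a workspace contains only the intervals it was imported with, so repeating a joint-genotyping run requires the same interval definitions.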

Example usage

"gVCFs were consolidated with GenomicsDBImport into a GenomicsDB store per chromosome, then jointly genotyped with GenotypeGVCFs."
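
The workflow in that sentence can be sketched as a small Python helper that builds one GenomicsDBImport and one GenotypeGVCFs command per chromosome. This is a minimal sketch under stated assumptions, not a pipeline from any particular study: the file names, the `genomicsdb_<chrom>` workspace naming, and the helper functions are hypothetical, and the code only constructs command lines rather than running GATK. The flags shown (`-V`, `-L`, `-R`, `-O`, `--genomicsdb-workspace-path`, and the `gendb://` input prefix) are standard GATK4 options.

```python
import shlex
from typing import List


def import_command(gvcfs: List[str], chrom: str, workspace: str) -> List[str]:
    """Build the argv for importing all gVCFs into one per-chromosome workspace."""
    cmd = [
        "gatk", "GenomicsDBImport",
        "--genomicsdb-workspace-path", workspace,
        "-L", chrom,  # the import interval; the workspace covers only this region
    ]
    # Sort inputs so the command is identical across reruns (an illustrative choice).
    for path in sorted(gvcfs):
        cmd += ["-V", path]
    return cmd


def genotype_command(reference: str, workspace: str, out_vcf: str) -> List[str]:
    """Build the argv for joint genotyping directly from a GenomicsDB workspace."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", reference,
        "-V", f"gendb://{workspace}",
        "-O", out_vcf,
    ]


def per_chromosome_plan(gvcfs: List[str], chroms: List[str], reference: str) -> List[List[str]]:
    """Return import and genotype commands, in order, for each chromosome."""
    plan = []
    for chrom in chroms:
        workspace = f"genomicsdb_{chrom}"  # hypothetical naming scheme
        plan.append(import_command(gvcfs, chrom, workspace))
        plan.append(genotype_command(reference, workspace, f"joint_{chrom}.vcf.gz"))
    return plan


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    commands = per_chromosome_plan(
        ["sampleB.g.vcf.gz", "sampleA.g.vcf.gz"], ["chr20", "chr21"], "ref.fasta"
    )
    for argv in commands:
        print(shlex.join(argv))
```

Keeping the generated command lines alongside the call set records which intervals and inputs each workspace was built from, which is the information needed to repeat the import.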

References

  • Poplin R, et al. (2018). Scaling accurate genetic variant discovery to tens of thousands of samples. bioRxiv 201178. https://doi.org/10.1101/201178
