
CRAM

Definition
AI-generated

CRAM is a reference-based compressed alignment format that stores sequencing read alignments more compactly than BAM by encoding differences relative to a reference sequence.
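The idea of encoding a read as differences from a reference can be illustrated with a toy sketch. This is not the CRAM codec: it handles substitutions only and ignores indels, quality scores, read names, and entropy coding. The function names `encode_read` and `decode_read` are hypothetical and exist only for this illustration.

```python
def encode_read(reference, pos, read):
    """Store a read as (position, length, substitutions) relative to the reference.

    Toy model: substitutions only, no indels or clipping.
    """
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the offsets where the read disagrees with the reference.
    diffs = [(i, b) for i, (r, b) in enumerate(zip(window, read)) if r != b]
    return pos, len(read), diffs


def decode_read(reference, record):
    """Rebuild the read by copying the reference window and applying the substitutions."""
    pos, length, diffs = record
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


if __name__ == "__main__":
    reference = "ACGTACGTACGT"
    record = encode_read(reference, 2, "GTACCT")
    print(record)
    print(decode_read(reference, record))
    # Decoding against a different reference silently yields a different read.
    print(decode_read("ACGTTTTTACGT", record))
```

A read that matches the reference exactly costs only a position and a length in this model, which is where the space saving comes from. The last call shows the flip side: the stored record is only meaningful together with the exact reference it was encoded against.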

Why it matters in GWAS

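Large sequencing cohorts often store alignments in CRAM to reduce storage cost. Because the reads are encoded relative to a reference, decoding requires the same reference sequence that was used for compression, so reproducible GWAS analysis depends on tracking the exact reference build alongside the CRAM files.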
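One way to check that a local reference matches the one a CRAM was written against is to compare MD5 checksums. The SAM specification defines an optional `M5` tag on `@SQ` header lines: the MD5 of the sequence after removing characters outside the printable ASCII range 33 to 126 and converting to uppercase. The sketch below assumes the header text has already been obtained (for example with `samtools view -H`) and that the FASTA is small enough to hold in memory. The function names are hypothetical, not part of any library.

```python
import hashlib


def fasta_md5s(fasta_text):
    """Return {contig name: MD5 hex digest} using the SAM M5 normalisation."""
    sums = {}
    name = None
    md5 = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if name is not None:
                sums[name] = md5.hexdigest()
            # Contig name is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            md5 = hashlib.md5()
        elif name is not None:
            # Drop anything outside ASCII 33..126, then uppercase.
            cleaned = "".join(c for c in line if 33 <= ord(c) <= 126).upper()
            md5.update(cleaned.encode("ascii"))
    if name is not None:
        sums[name] = md5.hexdigest()
    return sums


def header_md5s(sam_header_text):
    """Return {SN: M5} for every @SQ line that carries both tags."""
    sums = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "M5" in fields:
            sums[fields["SN"]] = fields["M5"].lower()
    return sums


def mismatched_contigs(sam_header_text, fasta_text):
    """List contigs whose header M5 is absent from, or differs from, the FASTA."""
    ref = fasta_md5s(fasta_text)
    hdr = header_md5s(sam_header_text)
    return sorted(name for name, m5 in hdr.items() if ref.get(name) != m5)
```

An empty result from `mismatched_contigs` means every contig with an `M5` tag in the header has a matching sequence in the FASTA. Contigs without an `M5` tag are not checked by this sketch.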

Example usage

"The biobank distributed aligned reads as CRAM with GRCh38 references for downstream variant calling."

References

  • Fritz MHY, et al. (2011). Efficient storage of high throughput DNA sequencing data using reference-based compression. Genome Research. https://doi.org/10.1101/gr.114819.110
