Duplicate marking
Definition
AI-generated
Duplicate marking is the step in a sequencing pipeline that flags reads or read pairs as duplicates, typically PCR or optical duplicates, so that downstream steps such as variant calling count each original DNA fragment only once.
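The idea can be illustrated with a minimal sketch, assuming a simplified model in which reads that share chromosome, 5' position, and strand come from the same fragment. The record fields (`name`, `chrom`, `pos`, `strand`, `qual_sum`) and the function `mark_duplicates` are hypothetical names chosen for illustration; production tools use more information (for example, both mates of a pair) and this is not their implementation.

```python
from collections import defaultdict


def mark_duplicates(reads):
    """Return copies of the reads with a boolean 'duplicate' field added.

    Simplifying assumption: reads sharing (chrom, pos, strand) are copies
    of one fragment. The field names are hypothetical, not a tool's schema.
    """
    # Group read indices by their apparent fragment signature.
    groups = defaultdict(list)
    for idx, read in enumerate(reads):
        groups[(read["chrom"], read["pos"], read["strand"])].append(idx)

    marked = [dict(read, duplicate=False) for read in reads]
    for indices in groups.values():
        # Keep the read with the highest summed base quality as the
        # representative; on ties, max() keeps the first one seen.
        best = max(indices, key=lambda i: reads[i]["qual_sum"])
        for i in indices:
            if i != best:
                marked[i]["duplicate"] = True
    return marked


reads = [
    {"name": "r1", "chrom": "chr1", "pos": 1000, "strand": "+", "qual_sum": 3500},
    {"name": "r2", "chrom": "chr1", "pos": 1000, "strand": "+", "qual_sum": 3720},
    {"name": "r3", "chrom": "chr1", "pos": 1000, "strand": "-", "qual_sum": 3600},
    {"name": "r4", "chrom": "chr1", "pos": 1042, "strand": "+", "qual_sum": 3400},
]
marked = mark_duplicates(reads)
duplicate_names = [r["name"] for r in marked if r["duplicate"]]
print(duplicate_names)
```

Here `r1` and `r2` share a signature, so the lower-quality `r1` is marked; `r3` is on the opposite strand and `r4` starts elsewhere, so both are kept as distinct observations. Note that reads are marked, not deleted, which leaves the decision to ignore them to downstream tools.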
Why it matters in GWAS
Duplicate marking keeps copies of the same fragment from inflating the apparent read support at a site and stabilizes depth-based QC; omitting it can bias the allele-balance and variant-quality statistics used to filter variants before association testing.
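A small worked example shows how unmarked duplicates can skew allele balance at a heterozygous site. The counts are hypothetical and chosen only to make the arithmetic easy to follow; `allele_balance` is an illustrative helper, not a function from any tool.

```python
def allele_balance(ref_reads, alt_reads):
    """Fraction of reads at a site that support the alternate allele."""
    return alt_reads / (ref_reads + alt_reads)


# Hypothetical site: 10 unique fragments, 5 reference and 5 alternate.
unique_ref, unique_alt = 5, 5

# Assume one alternate-allele fragment was amplified into 5 extra copies.
extra_alt_copies = 5

ab_deduplicated = allele_balance(unique_ref, unique_alt)
ab_with_duplicates = allele_balance(unique_ref, unique_alt + extra_alt_copies)

print(f"duplicates ignored:  {ab_deduplicated:.3f}")
print(f"duplicates counted:  {ab_with_duplicates:.3f}")
```

With duplicates ignored the site shows the balanced 5/10 expected of a heterozygote; counting the copies gives 10/15, and depth rises from 10 to 15 without any new independent evidence.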
Example usage
"Duplicate marking was performed with Picard MarkDuplicates; marked duplicates were excluded from variant discovery."
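In SAM/BAM output, a marked duplicate is identified by the FLAG bit 0x400 ("PCR or optical duplicate"). As a hedged sketch of what excluding marked duplicates means, the snippet below filters a few made-up, tab-separated SAM records by that bit; the records and the helper `is_duplicate` are illustrative, and a real pipeline would rely on its variant caller or a BAM library to do this.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: read is a PCR or optical duplicate


def is_duplicate(sam_line):
    """Return True if the SAM record's FLAG (second column) has 0x400 set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & DUPLICATE_FLAG)


# Made-up alignment records for illustration (header lines omitted).
sam_lines = [
    "r1\t99\tchr1\t1000\t60\t100M\t=\t1200\t300\t*\t*",
    "r2\t1123\tchr1\t1000\t60\t100M\t=\t1200\t300\t*\t*",  # 99 + 1024
    "r3\t163\tchr1\t1042\t60\t100M\t=\t1250\t308\t*\t*",
]

kept_names = [line.split("\t")[0] for line in sam_lines if not is_duplicate(line)]
print(kept_names)
```

Only `r2` carries the duplicate bit, so it is the one record skipped.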
References
- Picard toolkit: MarkDuplicates. Broad Institute.