Nearest gene vs Causal gene

Definition

In post-GWAS follow-up, nearest gene usually means a *heuristic* label: the gene whose transcription start site (TSS) or gene body lies closest to a lead SNP (or to the best-supported interval), reported for convenience. Causal gene (or gene at the locus) is a *mechanistic* claim: the gene whose expression or function is altered by the variant that actually drives the trait. It often remains unknown until fine-mapping, eQTL or splicing analysis, perturbation experiments, or other functional evidence identifies it.
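As a rough illustration of why the nearest-gene label is a heuristic, the sketch below assigns a lead SNP to the gene with the closest TSS. The gene names, coordinates, and the `Gene` and `nearest_gene` helpers are hypothetical and only show the distance rule. Real pipelines may also consider gene bodies, strand, or biotype filters.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int  # transcription start site on the chosen genome build


def nearest_gene(chrom: str, pos: int, genes: Sequence[Gene]) -> Optional[Gene]:
    """Return the gene on `chrom` whose TSS is closest to `pos`, or None."""
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return None
    # Break distance ties by gene name so the label is deterministic.
    return min(candidates, key=lambda g: (abs(g.tss - pos), g.name))


# Toy annotation (hypothetical genes and coordinates).
genes = [
    Gene("GENE_A", "chr1", 100_000),
    Gene("GENE_B", "chr1", 160_000),
    Gene("GENE_C", "chr2", 105_000),
]

label = nearest_gene("chr1", 120_000, genes)
print(label.name)  # GENE_A (20 kb away, versus 40 kb for GENE_B)
```

The function uses only coordinates, so its output says nothing about which gene the variant actually affects.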

How they differ

| | Nearest gene | Causal gene |
| --- | --- | --- |
| Basis | Genomic distance or annotation pipeline rules (e.g. closest TSS). | Biological evidence linking variant → gene → trait. |
| Certainty | Deterministic given a coordinate and build; can be wrong mechanistically. | Uncertain until supported; may be non–nearest-neighbor (e.g. long-range regulatory targets). |
| Typical next steps | Catalog display, hypothesis lists, pathway enrichment. | Fine-mapping, colocalization, TWAS/single-cell readouts, CRISPR or other perturbations. |

Rule of thumb: Treat “nearest gene” as a convenience label; treat “causal gene” as a hypothesis that requires evidence beyond proximity.
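One way to keep the two labels apart in practice is to store them in separate fields, as in the minimal sketch below. The `LocusAnnotation` record, the evidence labels, and the threshold of two independent lines of evidence are all illustrative assumptions, not an established standard.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class LocusAnnotation:
    lead_snp: str
    nearest_gene: str  # convenience label from proximity alone
    # Candidate gene -> kinds of functional evidence (hypothetical labels).
    evidence: Dict[str, Set[str]] = field(default_factory=dict)


def summarize(locus: LocusAnnotation, min_evidence: int = 2) -> str:
    """Report the nearest gene and any candidate with enough evidence."""
    # min_evidence is an arbitrary illustrative cutoff, not a field standard.
    supported = sorted(
        gene for gene, kinds in locus.evidence.items() if len(kinds) >= min_evidence
    )
    if not supported:
        return f"{locus.lead_snp}: nearest={locus.nearest_gene}; causal candidate=unresolved"
    note = "matches nearest" if supported == [locus.nearest_gene] else "differs from nearest"
    return (
        f"{locus.lead_snp}: nearest={locus.nearest_gene}; "
        f"causal candidate={','.join(supported)} ({note})"
    )


# Hypothetical loci: one with no functional evidence, one pointing elsewhere.
unresolved = LocusAnnotation("rs_example_1", "GENE_A")
distal = LocusAnnotation(
    "rs_example_2",
    "GENE_A",
    {"GENE_A": {"eqtl"}, "GENE_B": {"colocalization", "crispr_perturbation"}},
)

print(summarize(unresolved))
print(summarize(distal))
```

Because proximity never counts as evidence here, a locus stays "unresolved" until something beyond distance supports a candidate.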
