Beyond Genomics: Single-Cell Genomics
Single-cell genomics (scRNA-seq, scATAC-seq, spatial transcriptomics, and multiome assays) resolves molecular variation at the level of individual cells. When integrated with GWAS, these data enable a conceptual shift from "locus discovery" to "cellular mechanism inference", linking genetic risk to specific cell types, states, regulatory programs, and spatial contexts.
This section presents a refined framework for GWAS–single-cell integration, organized by the biological question being asked and the resolution of inference.
Why Integrate GWAS with Single-Cell Data?
GWAS robustly identifies trait-associated loci, but typically leaves some key questions unanswered:
- Which cell types mediate genetic risk?
- Which cellular states or programs are involved?
- Where in tissue architecture does risk manifest?
Single-cell datasets address these gaps by enabling:
- Cell-type and cell-state resolution of GWAS signals
- Gene and regulatory program prioritization in relevant cells
- Dissection of heterogeneous tissues (immune system, brain, tumor microenvironment)
- Cell-type-specific genetic architectures (e.g. sc-eQTL, scATAC-QTL, state-dependent effects)
Framework for GWAS–Single-Cell Methods
Methods can be organized along two orthogonal axes:
Genetic abstraction level
- Variant / heritability-based
- Gene-based
- Cell-based
- Spatial / tissue-context-based
Biological resolution
- Cell type
- Cell state / program
- Regulatory mechanism
- Spatial niche
Approaches
1. Cell-type heritability and gene-set enrichment
(Variant-level or gene-level; population-wide signal)
Representative methods:
- LDSC-seg: stratified LD Score Regression (GitHub)
- MAGMA: gene-property and gene-set analysis (Software)
Core question: "Are GWAS signals enriched in genes or annotations specific to certain cell types?"
Core idea:
- Derive cell-type-specific annotations from expression or chromatin data
- Partition GWAS heritability (LDSC-seg) or gene-level association (MAGMA)
- Test enrichment while accounting for LD and baseline genomic features
Typical inputs:
- GWAS summary statistics
- Cell-type aggregated expression or accessibility profiles
- LD reference panels
Typical outputs:
- Enrichment statistics per cell type or tissue
Strengths:
- Robust, LD-aware, interpretable
- Ideal as a first-pass prioritization step
Limitations:
- Limited resolution (cell type rather than individual cells)
- Sensitive to gene-to-SNP mapping choices
Workflow:
GWAS Summary Statistics
↓
LD Reference Panels
↓
Cell-type Aggregated Expression/Accessibility Profiles
↓
[LDSC-seg: Stratified LD Score Regression]
[MAGMA: Gene-property Analysis]
↓
Enrichment Statistics per Cell Type
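As a rough illustration of the first step above (deriving cell-type-specific annotations), the sketch below picks the most specifically expressed genes per cell type from an aggregated expression table. It is a simplified stand-in, not the LDSC-seg or MAGMA implementation: the function name `top_specific_genes`, the z-score specificity measure, and the toy gene and cell-type names are all assumptions made for the example.

```python
from statistics import mean, pstdev

def top_specific_genes(expr, top_frac=0.10):
    """Pick the most specifically expressed genes for each cell type.

    expr: {cell_type: {gene: aggregated expression}} (hypothetical input layout).
    Returns {cell_type: [genes]}, ordered from most to least specific.
    """
    cell_types = list(expr)
    genes = list(expr[cell_types[0]])
    scores = {ct: {} for ct in cell_types}
    for g in genes:
        values = [expr[ct][g] for ct in cell_types]
        mu, sd = mean(values), pstdev(values)
        for ct in cell_types:
            # z-score across cell types; genes with no variation get 0
            scores[ct][g] = (expr[ct][g] - mu) / sd if sd > 0 else 0.0
    n_top = max(1, round(top_frac * len(genes)))
    return {
        ct: sorted(genes, key=lambda g: -scores[ct][g])[:n_top]
        for ct in cell_types
    }

# Toy data: 10 genes, one marker gene per cell type
genes = [f"g{i}" for i in range(10)]
markers = {"neuron": "g2", "astrocyte": "g7", "microglia": "g4"}
expr = {
    ct: {g: (5.0 if g == m else 1.0) for g in genes}
    for ct, m in markers.items()
}
gene_sets = top_specific_genes(expr)
print(gene_sets)
```

The resulting gene sets would then be turned into SNP annotations (for heritability partitioning) or gene-level covariates (for gene-property analysis) by the respective tool.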
2. Per-cell and per-state disease relevance scoring
(Cell-level resolution; heterogeneity-aware)
Representative methods:
- scDRS (GitHub)
Core question: "Which individual cells or cellular states are most relevant to a given disease?"
Core idea:
- Convert GWAS summary statistics into gene-level disease scores
- Score each cell by coordinated expression of disease-associated genes
- Use matched control gene sets for calibration and statistical testing
Typical inputs:
- scRNA-seq data (cell-by-gene expression matrix)
- GWAS summary statistics or derived gene scores
Typical outputs:
- Disease relevance score per cell
- Cluster- or state-level summaries
Strengths:
- Captures within-cell-type heterogeneity
- Highlights rare, transient, or activated states
Limitations:
- Expression-based (does not directly model regulatory variants)
- Interpretation is correlational rather than causal
Workflow:
GWAS Summary Statistics
↓
Gene-level Disease Scores
↓
scRNA-seq Data (Cell × Gene Matrix)
↓
[scDRS: Score Each Cell]
↓
Disease Relevance Score per Cell
↓
Cluster/State-level Summaries
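The scoring idea in this approach can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the scDRS API: the function names are hypothetical, the data are synthetic, and control genes are drawn uniformly at random, whereas the method described above uses matched control gene sets.

```python
import random

def weighted_score(cell, gene_weights):
    """Weighted mean expression of a gene set in one cell."""
    total = sum(gene_weights.values())
    return sum(w * cell[g] for g, w in gene_weights.items()) / total

def disease_relevance(cells, disease_weights, n_ctrl=200, seed=0):
    """Return (raw scores, Monte Carlo p-values), one entry per cell.

    cells: list of {gene: normalized expression} dicts sharing the same genes.
    disease_weights: {gene: GWAS-derived weight} (hypothetical input layout).
    """
    rng = random.Random(seed)
    background = [g for g in cells[0] if g not in disease_weights]
    weights = list(disease_weights.values())
    raw = [weighted_score(c, disease_weights) for c in cells]
    exceed = [0] * len(cells)
    for _ in range(n_ctrl):
        # Simplification: control genes are sampled at random, not matched
        ctrl = dict(zip(rng.sample(background, len(weights)), weights))
        for i, cell in enumerate(cells):
            if weighted_score(cell, ctrl) >= raw[i]:
                exceed[i] += 1
    # Add-one correction keeps the empirical p-value strictly positive
    pvals = [(1 + e) / (1 + n_ctrl) for e in exceed]
    return raw, pvals

# Toy data: 6 cells x 30 genes; cells 0-2 strongly express the disease genes
disease_weights = {f"d{j}": 1.0 + j for j in range(5)}
cells = []
for i in range(6):
    cell = {f"b{j}": ((7 * i + 3 * j) % 5) / 4 for j in range(25)}  # in [0, 1]
    cell.update({g: (3.0 if i < 3 else 0.0) for g in disease_weights})
    cells.append(cell)

raw, pvals = disease_relevance(cells, disease_weights)
for i, (r, p) in enumerate(zip(raw, pvals)):
    print(f"cell {i}: score={r:.2f}, p={p:.3f}")
```

Per-cell p-values of this kind can then be summarized within clusters or states, which is where the heterogeneity within a cell type becomes visible.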
3. Variant-to-gene-to-cell-type linking
(Mechanistic and regulatory interpretation)
Representative methods:
- sc-linker (GitHub)
Core question: "Which genes mediate GWAS loci, and in which cell types do they act?"
Core idea:
- Integrate GWAS loci with single-cell chromatin accessibility and expression
- Link non-coding variants to regulatory elements
- Connect regulatory elements to target genes in a cell-type-specific manner
Typical inputs:
- GWAS summary statistics or fine-mapped loci
- scATAC-seq / multiome data
- Single-cell gene expression
Typical outputs:
- Prioritized causal genes per locus
- Cell-type-specific regulatory links
Strengths:
- Moves toward causal interpretation
- Explicitly models regulatory context
Limitations:
- Requires high-quality regulatory maps
- Peak-to-gene linking remains noisy
Workflow:
GWAS Summary Statistics / Fine-mapped Loci
↓
scATAC-seq / Multiome Data
↓
Single-cell Gene Expression
↓
[sc-linker: Link Variants → Regulatory Elements → Genes]
↓
Prioritized Causal Genes per Locus
↓
Cell-type-specific Regulatory Links
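The variant → regulatory element → gene chain can be illustrated with a small interval-overlap sketch. It is a generic example rather than the sc-linker code: the `Peak` type, the `link_variants` function, the link scores, and all identifiers in the toy data are hypothetical.

```python
from typing import NamedTuple

class Peak(NamedTuple):
    chrom: str
    start: int      # 0-based, inclusive (BED-style)
    end: int        # exclusive
    cell_type: str
    peak_id: str

def link_variants(variants, peaks, peak_to_gene):
    """Link variants to genes through overlapping cell-type-specific peaks.

    variants: {variant_id: (chrom, pos)}
    peak_to_gene: {(peak_id, cell_type): [(gene, link_score), ...]}
    Returns {variant_id: [(gene, cell_type, link_score), ...]}, best link first.
    """
    links = {}
    for vid, (chrom, pos) in variants.items():
        hits = []
        for p in peaks:
            if p.chrom == chrom and p.start <= pos < p.end:
                for gene, score in peak_to_gene.get((p.peak_id, p.cell_type), []):
                    hits.append((gene, p.cell_type, score))
        links[vid] = sorted(hits, key=lambda h: -h[2])
    return links

# Toy data with made-up identifiers
variants = {"rs_demo1": ("chr1", 1_050), "rs_demo2": ("chr1", 9_000)}
peaks = [
    Peak("chr1", 1_000, 1_500, "T cell", "peak1"),
    Peak("chr1", 1_020, 1_200, "B cell", "peak2"),
]
peak_to_gene = {
    ("peak1", "T cell"): [("GENE_A", 0.8)],
    ("peak2", "B cell"): [("GENE_B", 0.3)],
}
links = link_variants(variants, peaks, peak_to_gene)
print(links)
```

A variant with no overlapping peak, such as `rs_demo2` here, simply receives an empty list, which mirrors the limitation that linking is only as complete as the regulatory map.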
4. Spatial mapping of genetic risk
(Tissue architecture and microenvironment context)
Representative methods:
- gsMap (GitHub)
Core question: "Where in the tissue does genetic risk manifest?"
Core idea:
- Use a graph neural network (GNN) to identify homogeneous spots (microdomains) based on gene expression patterns and spatial coordinates
- Compute gene specificity scores (GSS) for each spot by comparing gene expression ranks within microdomains versus the entire section
- Map GSS to SNPs via distance-based windows (±50 kb from transcription start sites) and SNP-to-gene linking maps
- Apply stratified LD Score Regression (S-LDSC) to test whether SNPs with higher GSS are enriched for trait heritability
- Aggregate spot-level associations to spatial regions using the Cauchy combination test
Typical inputs:
- Spatial transcriptomics data (with spatial coordinates and gene expression profiles)
- GWAS summary statistics
- LD reference panels
- SNP-to-gene linking maps (e.g., from epigenomic data)
- Optional: cell type annotation priors
Typical outputs:
- Enrichment statistics and P-values per spatial spot
- Spatial maps of trait-associated cells or regions
- Region-level aggregated associations
Strengths:
- Addresses sparsity and technical noise in ST data through microdomain aggregation
- Provides spatially resolved mapping at spot-level resolution, which approaches cellular resolution on high-resolution platforms
- Adds anatomical and microenvironmental context
- Essential for brain, cancer, and developmental studies
Limitations:
- Resolution depends on spatial technology (spot-level in high-resolution platforms, cluster-level in conventional platforms)
- Requires high-quality SNP-to-gene linking maps
- Computational intensity increases with spatial resolution
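The distance-based part of the GSS-to-SNP mapping step can be sketched as follows. This is an illustration only, not the gsMap implementation: the function name, the input layout, and the choice to take the maximum GSS when several genes fall inside the window are assumptions, and the additional SNP-to-gene linking maps are left out.

```python
from bisect import bisect_left, bisect_right

def snp_scores_from_gss(snps, tss, gss, window=50_000):
    """Assign each SNP a score from genes whose TSS lies within +/- window bp.

    snps: {snp_id: (chrom, pos)}; tss: {gene: (chrom, tss_pos)}
    gss: {gene: gene specificity score for one spot}
    Assumption: when several genes qualify, the SNP takes the maximum GSS.
    """
    # Index TSS positions per chromosome for binary search
    index = {}
    for gene, (chrom, pos) in tss.items():
        index.setdefault(chrom, []).append((pos, gene))
    for chrom in index:
        index[chrom].sort()
    positions = {c: [p for p, _ in entries] for c, entries in index.items()}

    scores = {}
    for snp_id, (chrom, pos) in snps.items():
        entries = index.get(chrom, [])
        pos_list = positions.get(chrom, [])
        lo = bisect_left(pos_list, pos - window)
        hi = bisect_right(pos_list, pos + window)
        nearby = [gss.get(gene, 0.0) for _, gene in entries[lo:hi]]
        scores[snp_id] = max(nearby, default=0.0)  # 0.0 if no gene in range
    return scores

# Toy data with made-up identifiers
tss = {"GENE_A": ("chr1", 100_000), "GENE_B": ("chr1", 130_000),
       "GENE_C": ("chr2", 500_000)}
gss = {"GENE_A": 0.2, "GENE_B": 1.5, "GENE_C": 0.9}
snps = {"rs_demo1": ("chr1", 120_000), "rs_demo2": ("chr1", 60_000),
        "rs_demo3": ("chr2", 900_000)}
snp_scores = snp_scores_from_gss(snps, tss, gss)
print(snp_scores)
```

Repeating this for every spot yields one SNP-level annotation per spot, which is the form of input that the heritability enrichment step operates on.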
Workflow:
Spatial Transcriptomics Data (Expression + Coordinates)
↓
[GNN: Identify Homogeneous Spots / Microdomains]
↓
[Compute Gene Specificity Scores (GSS) per Spot]
↓
[Map GSS to SNPs via Distance & SNP-to-Gene Links]
↓
GWAS Summary Statistics + LD Reference Panels
↓
[S-LDSC: Test Heritability Enrichment per Spot]
↓
[Cauchy Combination Test: Aggregate to Regions]
↓
Spatial Maps of Trait-associated Spots/Regions
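The final aggregation step relies on the Cauchy combination test, which has a simple closed form: each p-value is transformed to a Cauchy variate, the variates are averaged with weights that sum to one, and the result is converted back to a p-value. The sketch below shows the standard formula with equal weights by default; the function name and example p-values are illustrative and are not taken from gsMap.

```python
import math

def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination test.

    pvalues: p-values strictly between 0 and 1 (e.g. one per spot in a region).
    weights: non-negative weights summing to 1; equal weights if omitted.
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    # Weighted sum of Cauchy-transformed p-values
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Upper tail of the standard Cauchy distribution at t
    return 0.5 - math.atan(t) / math.pi

spot_pvalues = [0.01, 0.02, 0.5]  # made-up spot-level p-values
region_p = cauchy_combination(spot_pvalues)
print(f"region-level p = {region_p:.4f}")
```

One reason this test suits spatial data is that it remains approximately valid when the combined p-values are correlated, as neighboring spots usually are.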
Conceptual Summary Table
| Method class | Resolution | Primary question |
|---|---|---|
| LDSC-seg / MAGMA | Cell type | Which cell types are enriched? |
| scDRS | Individual cells | Which cells or states matter? |
| sc-linker | Gene + cell type | Which genes mediate risk? |
| gsMap | Spatial spots and regions | Where does risk manifest? |
References
Review papers
- Cuomo, A. S. E., Nathan, A., Raychaudhuri, S., MacArthur, D. G., & Powell, J. E. (2023). Single-cell genomics meets human genetics. Nature Reviews Genetics, 24(8), 535–549. https://doi.org/10.1038/s41576-023-00598-6
Method papers
- Finucane, H. K., Reshef, Y. A., Anttila, V., Slowikowski, K., Gusev, A., Byrnes, A., et al. (2018). Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nature Genetics, 50(4), 621–629. https://doi.org/10.1038/s41588-018-0081-4 (LDSC-seg)
- de Leeuw, C. A., Mooij, J. M., Heskes, T., & Posthuma, D. (2015). MAGMA: generalized gene-set analysis of GWAS data. PLoS Computational Biology, 11(4), e1004219. https://doi.org/10.1371/journal.pcbi.1004219 (MAGMA)
- Zhang, M. J., Hou, K., Dey, K. K., et al. (2022). Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nature Genetics, 54(10), 1572–1580. https://doi.org/10.1038/s41588-022-01167-z (scDRS)
- Jagadeesh, K. A., Dey, K. K., Montoro, D. T., Mohan, R., Gazal, S., Engreitz, J. M., Xavier, R. J., Price, A. L., Regev, A., et al. (2022). Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nature Genetics, 54(10), 1479–1492. https://doi.org/10.1038/s41588-022-01187-9 (sc-linker)
- Song, L., Chen, W., Hou, J., Guo, M., & Yang, J. (2025). Spatially resolved mapping of cells associated with human complex traits. Nature, 641, 932–941. https://doi.org/10.1038/s41586-025-08757-x (gsMap)