Beyond Genomics: Single-Cell Genomics
Single-cell genomics (scRNA-seq, scATAC-seq, spatial transcriptomics, and multiome assays) resolves molecular variation at the level of individual cells. When integrated with GWAS, these data enable a conceptual shift from "locus discovery" to "cellular mechanism inference", linking genetic risk to specific cell types, states, regulatory programs, and spatial contexts.
This section presents a refined framework for GWAS–single-cell integration, organized by the biological question being asked and the resolution of inference.
Required data and tools
Requirements depend on the method (see Approaches). Typically you need GWAS summary statistics, often with LD scores or an LD reference panel, a single-cell or spatial dataset (or a published atlas), and method-specific software such as LDSC, MAGMA, or the integration tools cited in each subsection.
Why Integrate GWAS with Single-Cell Data?
GWAS robustly identifies trait-associated loci, but typically leaves some key questions unanswered:
- Which cell types mediate genetic risk?
- Which cellular states or programs are involved?
- Where in tissue architecture does risk manifest?
Single-cell datasets address these gaps by enabling:
- Cell-type and cell-state resolution of GWAS signals
- Gene and regulatory program prioritization in relevant cells
- Dissection of heterogeneous tissues (immune system, brain, tumor microenvironment)
- Cell-type-specific genetic architectures (e.g. sc-eQTL, scATAC-QTL, state-dependent effects)
Framework for GWAS–Single-Cell Methods
Methods can be organized along two orthogonal axes:
Genetic abstraction level
- Variant / heritability-based
- Gene-based
- Cell-based
- Spatial / tissue-context-based
Biological resolution
- Cell type
- Cell state / program
- Regulatory mechanism
- Spatial niche
Approaches
1. Cell-type heritability and gene-set enrichment
(Variant-level or gene-level; population-wide signal)
Representative methods:
- LDSC-seg (stratified LD Score Regression) - GitHub
- MAGMA (gene-property and gene-set analysis) - Software
Core question: "Are GWAS signals enriched in genes or annotations specific to certain cell types?"
Core idea:
- Derive cell-type-specific annotations from expression or chromatin data
- Partition GWAS heritability (LDSC-seg) or gene-level association (MAGMA)
- Test enrichment while accounting for LD and baseline genomic features
Typical inputs:
- GWAS summary statistics
- Cell-type aggregated expression or accessibility profiles
- LD reference panels
Typical outputs:
- Enrichment statistics per cell type or tissue
Strengths:
- Robust, LD-aware, interpretable
- Ideal as a first-pass prioritization step
Limitations:
- Limited resolution (cell type rather than individual cells)
- Sensitive to gene-to-SNP mapping choices
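The first step of the core idea, deriving cell-type-specific annotations, can be sketched in a few lines. This is a minimal illustration, not the LDSC-seg implementation: it assumes a hypothetical gene-by-cell-type matrix of mean expression and scores specificity as each cell type's share of a gene's total expression, whereas LDSC-seg ranks genes by a t-statistic. The function name `top_specific_genes`, the 10% cutoff, and the toy values are all assumptions of this example.

```python
import numpy as np

def top_specific_genes(expr, genes, cell_types, top_frac=0.10):
    """Return, for each cell type, the most specifically expressed genes.

    expr: array of shape (n_genes, n_cell_types) with mean expression per cell type.
    Specificity is the cell type's share of the gene's total expression
    (a simple stand-in for the t-statistic used by LDSC-seg).
    """
    expr = np.asarray(expr, dtype=float)
    totals = expr.sum(axis=1, keepdims=True)
    # Genes with zero total expression get specificity 0 in every cell type.
    spec = np.divide(expr, totals, out=np.zeros_like(expr), where=totals > 0)
    n_top = max(1, int(round(top_frac * expr.shape[0])))
    gene_sets = {}
    for j, cell_type in enumerate(cell_types):
        order = np.argsort(-spec[:, j], kind="stable")[:n_top]
        gene_sets[cell_type] = [genes[i] for i in order]
    return gene_sets

# Toy data (made up): 10 genes, 3 cell types, one marker gene per cell type.
genes = [f"G{i}" for i in range(10)]
cell_types = ["neuron", "microglia", "astrocyte"]
expr = np.ones((10, 3))
expr[0, 0] = 50.0  # G0 is neuron-specific
expr[1, 1] = 50.0  # G1 is microglia-specific
expr[2, 2] = 50.0  # G2 is astrocyte-specific

gene_sets = top_specific_genes(expr, genes, cell_types, top_frac=0.10)
print(gene_sets)
```

Each resulting gene set would then be turned into a SNP annotation (for LDSC-seg) or a gene-level covariate (for MAGMA); the choice of specificity metric and cutoff affects which cell types appear enriched.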
Workflow:
GWAS Summary Statistics
↓
LD Reference Panels
↓
Cell-type Aggregated Expression/Accessibility Profiles
↓
[LDSC-seg: Stratified LD Score Regression]
[MAGMA: Gene-property Analysis]
↓
Enrichment Statistics per Cell Type
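The gene-level branch of this workflow can be illustrated with a simplified gene-property test: regress gene-level association Z-scores on a cell-type specificity covariate and ask whether the slope is positive. The sketch below uses ordinary least squares and a one-sided normal approximation; it ignores the gene-gene correlation that MAGMA's own model accounts for, and the function name `gene_property_test` and the simulated data are assumptions of this example.

```python
import math
import numpy as np

def gene_property_test(gene_z, specificity, covariates=None):
    """OLS of gene-level Z-scores on a cell-type specificity covariate.

    covariates: optional array of shape (n_genes, k), e.g. gene length.
    Returns (slope, t_statistic, one_sided_p) for the specificity term.
    """
    z = np.asarray(gene_z, dtype=float)
    cols = [np.ones_like(z), np.asarray(specificity, dtype=float)]
    if covariates is not None:
        cols.extend(np.asarray(covariates, dtype=float).T)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    dof = len(z) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stat = beta[1] / math.sqrt(cov[1, 1])
    # One-sided p-value (slope > 0) using a normal approximation to the t distribution.
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2))
    return beta[1], t_stat, p_value

# Simulated example: genes more specific to the cell type carry stronger GWAS signal.
rng = np.random.default_rng(0)
n_genes = 2000
specificity = rng.uniform(0, 1, n_genes)
gene_z = 1.0 * specificity + rng.normal(0, 1, n_genes)

slope, t_stat, p_value = gene_property_test(gene_z, specificity)
print(f"slope={slope:.2f}, t={t_stat:.1f}, p={p_value:.2e}")
```

In practice the test is repeated per cell type and the p-values are corrected for the number of cell types tested.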
2. Per-cell and per-state disease relevance scoring
(Cell-level resolution; heterogeneity-aware)
Representative methods:
- scDRS - GitHub
Core question: "Which individual cells or cellular states are most relevant to a given disease?"
Core idea:
- Convert GWAS summary statistics into gene-level disease scores
- Score each cell by coordinated expression of disease-associated genes
- Use matched control gene sets for calibration and statistical testing
Typical inputs:
- scRNA-seq data (cell-by-gene expression matrix)
- GWAS summary statistics or derived gene scores
Typical outputs:
- Disease relevance score per cell
- Cluster- or state-level summaries
Strengths:
- Captures within-cell-type heterogeneity
- Highlights rare, transient, or activated states
Limitations:
- Expression-based (does not directly model regulatory variants)
- Interpretation is correlational rather than causal
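The conversion from gene-level GWAS results to a weighted disease gene set can be sketched as follows. This is an illustrative helper, not part of the scDRS package: it assumes gene-level Z-scores are already available (for example from MAGMA), keeps only genes present in the expression matrix, and uses the Z-score itself as the weight. The function name, the default of 1,000 genes, and the gene labels are assumptions of this example.

```python
def build_disease_gene_set(gene_z, expressed_genes, n_top=1000):
    """Select the n_top genes with the largest gene-level Z-scores.

    gene_z: dict mapping gene symbol -> gene-level Z-score.
    expressed_genes: genes present in the scRNA-seq matrix; others are dropped.
    Returns a dict gene -> weight, where the weight is the Z-score.
    """
    expressed = set(expressed_genes)
    usable = [(gene, z) for gene, z in gene_z.items() if gene in expressed]
    usable.sort(key=lambda gz: gz[1], reverse=True)
    return dict(usable[:n_top])

# Toy input with placeholder gene names; GENE_D is absent from the expression data.
gene_z = {"GENE_A": 6.1, "GENE_B": 0.4, "GENE_C": 4.7, "GENE_D": 5.5, "GENE_E": -1.2}
expressed = ["GENE_A", "GENE_B", "GENE_C", "GENE_E"]

disease_set = build_disease_gene_set(gene_z, expressed, n_top=2)
print(disease_set)
```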
Workflow:
GWAS Summary Statistics
↓
Gene-level Disease Scores
↓
scRNA-seq Data (Cell × Gene Matrix)
↓
[scDRS: Score Each Cell]
↓
Disease Relevance Score per Cell
↓
Cluster/State-level Summaries
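The scoring and calibration steps in this workflow can be sketched with NumPy. The code below is a simplified illustration of the idea rather than the scDRS algorithm: it scores each cell by the weighted mean expression of the disease genes, then compares that score with scores from random control gene sets matched on mean expression only. scDRS itself applies further matching and normalization steps. The function `score_cells`, its parameters, and the simulated data are assumptions of this example.

```python
import numpy as np

def score_cells(expr, disease_idx, weights=None, n_ctrl=200, n_bins=10, seed=0):
    """Per-cell disease score with an empirical p-value from matched control sets.

    expr: array (n_cells, n_genes) of normalized expression.
    disease_idx: column indices of the disease genes.
    """
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    disease_idx = np.asarray(disease_idx)
    w = np.ones(len(disease_idx)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    raw = expr[:, disease_idx] @ w  # weighted mean expression per cell

    # Bin genes by mean expression so each control gene resembles its disease gene.
    gene_mean = expr.mean(axis=0)
    edges = np.quantile(gene_mean, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(gene_mean, edges)
    non_disease = np.ones(expr.shape[1], dtype=bool)
    non_disease[disease_idx] = False
    pools = []
    for g in disease_idx:
        pool = np.flatnonzero((bins == bins[g]) & non_disease)
        # Fall back to all non-disease genes if the bin has no candidates.
        pools.append(pool if pool.size else np.flatnonzero(non_disease))

    ctrl = np.empty((n_ctrl, expr.shape[0]))
    for k in range(n_ctrl):
        idx = np.array([rng.choice(pool) for pool in pools])
        ctrl[k] = expr[:, idx] @ w

    # Empirical p-value: share of control scores at least as large as the raw score.
    p = (1 + (ctrl >= raw).sum(axis=0)) / (n_ctrl + 1)
    return raw, p

# Simulated data: 300 cells, 200 genes; the first 50 cells up-regulate 10 disease genes.
rng = np.random.default_rng(1)
expr = rng.gamma(2.0, 1.0, size=(300, 200))
disease_idx = np.arange(10)
expr[:50, :10] += 3.0

raw, p = score_cells(expr, disease_idx)
print("mean p, affected cells:  ", p[:50].mean())
print("mean p, unaffected cells:", p[50:].mean())
```

Averaging or testing these per-cell scores within clusters gives the cluster- or state-level summaries shown at the end of the workflow.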
3. Variant-to-gene-to-cell-type linking
(Mechanistic and regulatory interpretation)
Representative methods:
- sc-linker - GitHub
Core question: "Which genes mediate GWAS loci, and in which cell types do they act?"
Core idea:
- Integrate GWAS loci with single-cell chromatin accessibility and expression
- Link non-coding variants to regulatory elements
- Connect regulatory elements to target genes in a cell-type-specific manner
- In sc-linker specifically, gene programs learned from scRNA-seq are converted to SNP annotations through tissue-specific enhancer-gene links and then tested for heritability enrichment with S-LDSC
Typical inputs:
- GWAS summary statistics or fine-mapped loci
- scATAC-seq / multiome data, or tissue-specific enhancer-gene linking maps
- Single-cell gene expression
Typical outputs:
- Prioritized candidate causal genes per locus
- Cell-type-specific regulatory links
Strengths:
- Moves toward causal interpretation
- Explicitly models regulatory context
Limitations:
- Requires high-quality regulatory maps
- Peak-to-gene linking remains noisy
Workflow:
GWAS Summary Statistics / Fine-mapped Loci
↓
scATAC-seq / Multiome Data
↓
Single-cell Gene Expression
↓
[sc-linker: Link Variants → Regulatory Elements → Genes]
↓
Prioritized Causal Genes per Locus
↓
Cell-type-specific Regulatory Links
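The variant-to-element-to-gene chain in this workflow comes down to interval overlap followed by a lookup of each element's linked gene. The sketch below is a generic illustration, not code from sc-linker: it assumes peaks have already been called per cell type and linked to a target gene by some upstream method. The `Peak` class, the variant identifiers, the coordinates, and the gene names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int        # 0-based, inclusive
    end: int          # exclusive
    cell_type: str
    target_gene: str  # gene linked to this peak by an upstream peak-to-gene method

def link_variants(variants, peaks):
    """Map each variant to the genes of overlapping peaks, grouped by cell type.

    variants: dict mapping variant id -> (chrom, position).
    Returns a dict variant id -> {cell_type: set of genes}.
    """
    links = {vid: defaultdict(set) for vid in variants}
    for vid, (chrom, pos) in variants.items():
        for peak in peaks:
            if peak.chrom == chrom and peak.start <= pos < peak.end:
                links[vid][peak.cell_type].add(peak.target_gene)
    return {vid: dict(by_type) for vid, by_type in links.items()}

# Hypothetical inputs for illustration only.
variants = {"var1": ("chr1", 1_050), "var2": ("chr1", 9_000), "var3": ("chr2", 500)}
peaks = [
    Peak("chr1", 1_000, 1_200, "T cell", "GENE_X"),
    Peak("chr1", 1_040, 1_100, "B cell", "GENE_Y"),
    Peak("chr2", 400, 600, "T cell", "GENE_Z"),
]

links = link_variants(variants, peaks)
print(links)
```

The nested loop is fine for a handful of fine-mapped variants; genome-wide use would call for an interval index. A variant such as `var2` that overlaps no peak simply has no links, which is where noisy or incomplete regulatory maps limit this class of methods.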
4. Spatial mapping of genetic risk
(Tissue architecture and microenvironment context)
Representative methods:
- gsMap - GitHub
Core question: "Where in the tissue does genetic risk manifest?"
Core idea:
- Use a graph neural network (GNN) to identify homogeneous spots (microdomains) based on gene expression patterns and spatial coordinates
- Compute gene specificity scores (GSS) for each spot by comparing gene expression ranks within microdomains versus the entire section
- Map GSS to SNPs via distance-based windows (±50 kb from transcription start sites) and SNP-to-gene linking maps
- Apply stratified LD Score Regression (S-LDSC) to test whether SNPs with higher GSS are enriched for trait heritability
- Aggregate spot-level associations to spatial regions using the Cauchy combination test
Typical inputs:
- Spatial transcriptomics data (with spatial coordinates and gene expression profiles)
- GWAS summary statistics
- LD reference panels
- SNP-to-gene linking maps (e.g., from epigenomic data)
- Optional: cell type annotation priors
Typical outputs:
- Enrichment statistics and P-values per spatial spot
- Spatial maps of trait-associated cells or regions
- Region-level aggregated associations
Strengths:
- Addresses sparsity and technical noise in ST data through microdomain aggregation
- Provides spatially resolved mapping at cellular resolution
- Adds anatomical and microenvironmental context
- Essential for brain, cancer, and developmental studies
Limitations:
- Resolution depends on spatial technology (spot-level in high-resolution platforms, cluster-level in conventional platforms)
- Requires high-quality SNP-to-gene linking maps
- Computational cost increases with spatial resolution
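The distance-based part of the GSS-to-SNP mapping can be sketched for one spot and one chromosome. This is an illustration under stated assumptions, not gsMap's code: each SNP receives the largest GSS among genes whose transcription start site lies within ±50 kb, and SNPs with no gene in range receive 0. Taking the maximum when several genes are in range, the function name `snp_gss`, and the toy coordinates are choices made for this example.

```python
import numpy as np

def snp_gss(snp_pos, tss_pos, gene_gss, window=50_000):
    """Assign each SNP the maximum GSS of genes with a TSS within +/- window bp.

    All positions are on the same chromosome. SNPs with no TSS in range get 0.
    """
    snp_pos = np.asarray(snp_pos)
    tss_pos = np.asarray(tss_pos)
    gene_gss = np.asarray(gene_gss, dtype=float)
    order = np.argsort(tss_pos)
    tss_sorted, gss_sorted = tss_pos[order], gene_gss[order]
    # For each SNP, find the slice of sorted TSS positions inside its window.
    lo = np.searchsorted(tss_sorted, snp_pos - window, side="left")
    hi = np.searchsorted(tss_sorted, snp_pos + window, side="right")
    return np.array([gss_sorted[a:b].max() if b > a else 0.0 for a, b in zip(lo, hi)])

# Toy example: three genes and four SNPs on one chromosome.
tss_pos = [100_000, 130_000, 400_000]
gene_gss = [2.0, 3.5, 5.0]
snp_pos = [60_000, 150_000, 250_000, 420_000]

annotation = snp_gss(snp_pos, tss_pos, gene_gss)
print(annotation)
```

The resulting per-SNP values form the spot-specific annotation that is passed to S-LDSC; SNP-to-gene linking maps would add further gene assignments beyond the distance window.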
Workflow:
Spatial Transcriptomics Data (Expression + Coordinates)
↓
[GNN: Identify Homogeneous Spots / Microdomains]
↓
[Compute Gene Specificity Scores (GSS) per Spot]
↓
[Map GSS to SNPs via Distance & SNP-to-Gene Links]
↓
GWAS Summary Statistics + LD Reference Panels
↓
[S-LDSC: Test Heritability Enrichment per Spot]
↓
[Cauchy Combination Test: Aggregate to Regions]
↓
Spatial Maps of Trait-associated Spots/Regions
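The final aggregation step uses the Cauchy combination test, which turns each p-value into a Cauchy-distributed statistic, averages them, and maps the average back to a p-value. The sketch below implements the standard formula with equal weights by default; the function name and the clipping bounds, used to keep the tangent finite, are assumptions of this example.

```python
import math

def cauchy_combination(pvalues, weights=None, eps=1e-15):
    """Combine p-values with the Cauchy combination test.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with weights normalized to sum to 1;
    the combined p-value is 0.5 - arctan(T) / pi.
    """
    pvals = [min(max(p, eps), 1 - eps) for p in pvalues]  # avoid tan(+/- pi/2)
    if weights is None:
        weights = [1.0] * len(pvals)
    total = sum(weights)
    t_stat = sum(w / total * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvals))
    return 0.5 - math.atan(t_stat) / math.pi

# Spot-level p-values within one hypothetical region.
region_p = cauchy_combination([1e-6, 0.4, 0.9])
print(f"region-level p = {region_p:.2e}")
```

A single p-value is returned unchanged, and the combined result is dominated by the smallest p-values in the region.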
Conceptual Summary Table
| Method class | Resolution | Primary question |
|---|---|---|
| LDSC-seg / MAGMA | Cell type | Which cell types are enriched? |
| scDRS | Individual cells | Which cells or states matter? |
| sc-linker | Gene + cell type | Which genes mediate risk? |
| gsMap | Spatial spots / regions | Where does risk manifest? |
References
Review papers
- Cuomo, A. S. E., Nathan, A., Raychaudhuri, S., MacArthur, D. G., & Powell, J. E. (2023). Single-cell genomics meets human genetics. Nature Reviews Genetics, 24(8), 535–549. https://doi.org/10.1038/s41576-023-00598-6
Method papers
- Finucane, H. K., Reshef, Y. A., Anttila, V., Slowikowski, K., Gusev, A., Byrnes, A., et al. (2018). Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nature Genetics, 50(4), 621–629. https://doi.org/10.1038/s41588-018-0081-4 (LDSC-seg)
- de Leeuw, C. A., Mooij, J. M., Heskes, T., & Posthuma, D. (2015). MAGMA: generalized gene-set analysis of GWAS data. PLoS Computational Biology, 11(4), e1004219. https://doi.org/10.1371/journal.pcbi.1004219 (MAGMA)
- Zhang, M. J., Hou, K., Dey, K. K., et al. (2022). Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nature Genetics, 54(10), 1572–1580. https://doi.org/10.1038/s41588-022-01167-z (scDRS)
- Jagadeesh, K. A., Dey, K. K., Montoro, D. T., Mohan, R., Gazal, S., Engreitz, J. M., Xavier, R. J., Price, A. L., Regev, A., et al. (2022). Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nature Genetics, 54(10), 1479–1492. https://doi.org/10.1038/s41588-022-01187-9 (sc-linker)
- Song, L., Chen, W., Hou, J., Guo, M., & Yang, J. (2025). Spatially resolved mapping of cells associated with human complex traits. Nature, 641, 932–941. https://doi.org/10.1038/s41586-025-08757-x (gsMap)