Co-localization
Coloc assuming a single causal variant
Coloc assumes 0 or 1 causal variant in each trait and tests whether the two traits share the same causal variant.
Note
This assumption reflects a different aim from fine-mapping. In fine-mapping, the aim is to find the putative causal variants, which are determined at birth. In colocalization, the aim is to find overlapping signals that support causal inference, such as eQTL --> trait. It is possible that the causal variants differ between the two traits.
Datasets used:
- For binary traits, coloc requires "beta", "varbeta", and "snp". For quantitative traits, the trait standard deviation "sdY" is also required to estimate the scale of the estimated beta.
- The LD matrix is a square numeric matrix of dimension equal to the number of SNPs, with dimnames corresponding to the SNP ids.
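As a rough illustration of the inputs listed above, the sketch below builds two toy datasets and a matching LD matrix in R. All SNP ids and numbers are invented for the example. The `type` element ("cc" or "quant") is not mentioned above but coloc expects it, and the guarded `coloc::check_dataset()` call only runs if the package happens to be installed.

```r
# Toy summary statistics for three SNPs (all values are made up).
snps <- c("rs1", "rs2", "rs3")

# Binary (case-control) trait: beta is a log odds ratio.
d_cc <- list(
  snp     = snps,
  beta    = c(0.12, 0.30, -0.05),
  varbeta = c(0.02, 0.03, 0.02)^2,  # variance of beta = SE^2
  type    = "cc"
)

# Quantitative trait: sdY sets the scale of beta.
d_quant <- list(
  snp     = snps,
  beta    = c(0.08, 0.25, 0.01),
  varbeta = c(0.03, 0.03, 0.03)^2,
  type    = "quant",
  sdY     = 1
)

# LD matrix: square, numeric, with dimnames equal to the SNP ids.
ld <- matrix(c(1.0, 0.8, 0.1,
               0.8, 1.0, 0.2,
               0.1, 0.2, 1.0),
             nrow = 3, dimnames = list(snps, snps))

# Basic consistency checks that do not depend on coloc.
stopifnot(
  length(d_cc$beta) == length(d_cc$snp),
  length(d_quant$varbeta) == length(d_quant$snp),
  isSymmetric(ld),
  identical(rownames(ld), d_cc$snp)
)

# Optional validation with coloc itself, only if it is installed.
if (requireNamespace("coloc", quietly = TRUE)) {
  coloc::check_dataset(d_cc)
  coloc::check_dataset(d_quant)
}
```

Keeping the SNP order identical across `snp`, `beta`, `varbeta`, and the LD dimnames avoids silent misalignment between the datasets.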
Result interpretation:
Posterior probabilities are calculated for five hypotheses:
## PP.H0.abf PP.H1.abf PP.H2.abf PP.H3.abf PP.H4.abf
## 1.73e-08 7.16e-07 2.61e-05 8.20e-05 1.00e+00
## [1] "PP abf for shared variant: 100%"
\(H_0\): neither trait has a genetic association in the region
\(H_1\): only trait 1 has a genetic association in the region
\(H_2\): only trait 2 has a genetic association in the region
\(H_3\): both traits are associated, but with different causal variants
\(H_4\): both traits are associated and share a single causal variant
PP.H4.abf is the posterior probability that the two traits share the same causal variant.
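To make the five posterior probabilities more concrete, here is a minimal base-R sketch of how per-SNP approximate Bayes factors can be combined into PP.H0 to PP.H4 under the single-causal-variant assumption. It is an illustration, not coloc's own code: the function names (`log_abf`, `combine_abf`), the toy effect sizes, the prior effect-size SD of 0.2, and the priors `p1`, `p2`, `p12` are all assumptions chosen for the example.

```r
# Numerically stable log(sum(exp(x))).
logsum <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Numerically stable log(exp(a) - exp(b)); returns -Inf when b >= a.
logdiff <- function(a, b) {
  if (b >= a) return(-Inf)
  a + log1p(-exp(b - a))
}

# Wakefield-style log approximate Bayes factor per SNP.
# prior_sd is an assumed prior SD of the true effect size.
log_abf <- function(beta, varbeta, prior_sd = 0.2) {
  z <- beta / sqrt(varbeta)
  r <- prior_sd^2 / (prior_sd^2 + varbeta)
  0.5 * (log(1 - r) + r * z^2)
}

# Combine per-SNP log ABFs of two traits into posteriors for H0..H4.
# p1, p2: prior that a SNP is causal for trait 1 / trait 2 only;
# p12: prior that a SNP is causal for both (illustrative values).
combine_abf <- function(l1, l2, p1 = 1e-4, p2 = 1e-4, p12 = 1e-5) {
  lsum <- l1 + l2
  lH <- c(
    H0 = 0,
    H1 = log(p1) + logsum(l1),
    H2 = log(p2) + logsum(l2),
    # all pairs of SNPs minus the pairs where the same SNP is causal for both
    H3 = log(p1) + log(p2) +
      logdiff(logsum(l1) + logsum(l2), logsum(lsum)),
    H4 = log(p12) + logsum(lsum)
  )
  exp(lH - logsum(lH))
}

# Toy data: both traits have their strongest signal at the second SNP.
se <- 0.05
l1 <- log_abf(c(0.10, 0.30, 0.05), se^2)
l2 <- log_abf(c(0.05, 0.25, 0.12), se^2)

pp <- combine_abf(l1, l2)

# Per-SNP posterior of being the shared causal variant, given H4.
snp_pp_h4 <- exp((l1 + l2) - logsum(l1 + l2))

print(signif(pp, 3))
print(round(snp_pp_h4, 3))
```

Working on the log scale keeps the sums stable when a strong signal makes individual Bayes factors very large.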
Then, assuming H4 is true, a 95% credible set can be constructed (evidence for a shared causal variant does not by itself pinpoint which variant it is).
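# Rank SNPs by their posterior probability of being the shared causal variant
o <- order(my.res$results$SNP.PP.H4, decreasing = TRUE)
cs <- cumsum(my.res$results$SNP.PP.H4[o])
# Smallest number of top-ranked SNPs whose cumulative probability exceeds 0.95
w <- which(cs > 0.95)[1]
my.res$results[o, ][1:w, ]$snp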
Coloc assuming multiple causal variants or multiple signals
When the single-causal-variant assumption is violated, several approaches can be used to relax it.
- Assuming multiple causal variants with the SuSiE-Coloc pipeline. In this pipeline, putative causal variants are fine-mapped, then each signal is passed to the coloc engine.
- Conditional analysis with the GCTA-COJO-Coloc pipeline. In this pipeline, signals are separated by conditioning, then passed to the coloc engine.
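The SuSiE-Coloc route can be sketched in R roughly as follows. This assumes coloc version 5 or later together with susieR, and that each dataset list also carries an LD matrix (element `LD`) and a sample size (`N`). The wrapper name `run_susie_coloc` is hypothetical, and the demo relies on the `coloc_test_data` example object that ships with coloc, so it only runs when both packages are installed.

```r
# Hypothetical helper: fine-map each trait with SuSiE, then run coloc
# on every pair of fine-mapped signals.
# d1, d2: coloc-style dataset lists that include `LD` and `N`.
run_susie_coloc <- function(d1, d2) {
  if (!requireNamespace("coloc", quietly = TRUE) ||
      !requireNamespace("susieR", quietly = TRUE)) {
    stop("this sketch needs the 'coloc' (>= 5) and 'susieR' packages")
  }
  s1 <- coloc::runsusie(d1)          # fine-map trait 1
  s2 <- coloc::runsusie(d2)          # fine-map trait 2
  res <- coloc::coloc.susie(s1, s2)  # colocalize each pair of signals
  res$summary                        # one row per pair of signals
}

# Demo on the example data bundled with coloc, if available.
if (requireNamespace("coloc", quietly = TRUE) &&
    requireNamespace("susieR", quietly = TRUE)) {
  data("coloc_test_data", package = "coloc")
  print(run_susie_coloc(coloc_test_data$D3, coloc_test_data$D4))
}
```

Each row of the returned summary corresponds to one pair of signals, so a region with two signals per trait can yield up to four comparisons.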
Other pipelines
Many other strategies and pipelines are available for colocalization and for prioritizing variants, genes, and traits. For example:
- HyPrColoc
- OpenTargets