Academic Paper References

This document lists all academic papers referenced in GWASLab, including methods implemented, sample data sources, and theoretical foundations.

GWASLab

He, Y., Koido, M., Shimmori, Y., & Kamatani, Y. (2023). GWASLab: a Python package for processing and visualizing GWAS summary statistics. Preprint at Jxiv, 2023-5. https://doi.org/10.51094/jxiv.370

Methods Implemented

LD Score Regression (LDSC)

Bulik-Sullivan, B. K., Loh, P. R., Finucane, H. K., Ripke, S., Yang, J., Patterson, N., ... & Neale, B. M. (2015). LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics, 47(3), 291-295.

Bulik-Sullivan, B., et al. (2015). An Atlas of Genetic Correlations across Human Diseases and Traits. Nature Genetics, 47(11), 1236-1241.

Finucane, H. K., Reshef, Y. A., Anttila, V., Slowikowski, K., Gusev, A., Byrnes, A., ... & Price, A. L. (2018). Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nature Genetics, 50(4), 621-629.

PRS-CS (Polygenic Risk Score - Continuous Shrinkage)

Ge, T., Chen, C. Y., Ni, Y., Feng, Y. C. A., & Smoller, J. W. (2019). Polygenic Prediction via Bayesian Regression and Continuous Shrinkage Priors. Nature Communications, 10(1), 1776.

Approximate Bayes Factor (ABF) Fine-mapping

Wakefield, J. (2007). A Bayesian measure of the probability of false discovery in genetic epidemiology studies. American Journal of Human Genetics, 81(2), 208-227.

Effective Sample Size (METAL)

Willer, C. J., Li, Y., & Abecasis, G. R. (2010). METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics, 26(17), 2190-2191.

Heritability Conversion

Lee, S. H., Wray, N. R., Goddard, M. E., & Visscher, P. M. (2011). Estimating missing heritability for disease from genome-wide association studies. The American Journal of Human Genetics, 88(3), 294-305. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3059431/

Lee, S. H., Goddard, M. E., Wray, N. R., & Visscher, P. M. (2012). A better coefficient of determination for genetic profile analysis. Genetic Epidemiology, 36(3), 214-224.

Per-SNP Heritability

Shim, H., Chasman, D. I., Smith, J. D., Mora, S., Ridker, P. M., Nickerson, D. A., ... & Stephens, M. (2015). A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 Caucasians. PLoS ONE, 10(4), e0120758.

Power Calculation

Sham, P. C., & Purcell, S. M. (2014). Statistical power and significance testing in large-scale genetic studies. Nature Reviews Genetics, 15(5), 335-346.

Skol, A. D., Scott, L. J., Abecasis, G. R., & Boehnke, M. (2006). Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nature Genetics, 38(2), 209-213.

Lead Variant Extraction (GBMI Definition)

Zhou, W., Kanai, M., Wu, K. H. H., Rasheed, H., Tsuo, K., Hirbo, J. B., ... & Study, C. O. H. (2022). Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease. Cell Genomics, 2(10), 100192.

MAGMA (Multi-marker Analysis of GenoMic Annotation)

de Leeuw, C. A., Mooij, J. M., Heskes, T., & Posthuma, D. (2015). MAGMA: Generalized gene-set analysis of GWAS data. PLoS Computational Biology, 11(4), e1004219.

SuSiE (Sum of Single Effects)

Wang, G., Sarkar, A., Carbonetto, P., & Stephens, M. (2020). A simple new approach to variable selection in regression, with application to genetic fine mapping. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82(5), 1273-1300.

MESuSiE (Multi-ancestry Sum of Single Effects)

Gao, B., & Zhou, X. (2024). MESuSiE enables scalable and powerful multi-ancestry fine-mapping of causal variants in genome-wide association studies. Nature Genetics, 56(1), 170-179.

coloc (Colocalization Analysis)

Giambartolomei, C., Vukcevic, D., Schadt, E. E., Franke, L., Hingorani, A. D., Wallace, C., & Plagnol, V. (2014). Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genetics, 10(5), e1004383.

Wallace, C. (2021). A more accurate method for colocalisation analysis allowing for multiple causal variants. PLoS Genetics, 17(9), e1009440.

scDRS (Single-cell Disease Relevance Score)

Zhang, M. J., Hou, K., Dey, K. K., Sakaue, S., Jagadeesh, K. A., Weinand, K., ... & Price, A. L. (2022). Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nature Genetics, 54(10), 1572-1580.

PLINK

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., ... & Sham, P. C. (2007). PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81(3), 559-575.

BCFtools

Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2), giab008. https://doi.org/10.1093/gigascience/giab008

Visualization Methods

Brisbane Plot

Yengo, L., Vedantam, S., Marouli, E., Sidorenko, J., Bartell, E., Sakaue, S., ... & Lee, J. Y. (2022). A saturated map of common genetic variants associated with human height. Nature, 610(7933), 704-712.

Trumpet Plot

Corte, L., Liou, L., O'Reilly, P. F., & García-González, J. (2023). Trumpet plots: visualizing the relationship between allele frequency and effect size in genetic association studies. medRxiv, 2023-04.

Reference Data and Regions

HLA/MHC Region

Horton, R., Wilming, L., Rand, V., Lovering, R. C., Bruford, E. A., Khodiyar, V. K., ... & Beck, S. (2004). Gene map of the extended human MHC. Nature Reviews Genetics, 5(12), 889-899.

Shiina, T., Hosomichi, K., Inoko, H., & Kulski, J. K. (2009). The HLA genomic loci map: expression, interaction, diversity and disease. Journal of Human Genetics, 54(1), 15-39.

GWAS Catalog

Buniello, A., MacArthur, J. A. L., Cerezo, M., Harris, L. W., Hayhurst, J., Malangone, C., ... & Parkinson, H. (2019). The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Research, 47(D1), D1005-D1012.

Sample Data Used in Tutorials

Suzuki, K., Akiyama, M., Ishigaki, K., Kanai, M., Hosoe, J., Shojima, N., ... & Kamatani, Y. (2019). Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population. Nature Genetics, 51(3), 379-386.

Akiyama, M., Okada, Y., Kanai, M., Takahashi, A., Momozawa, Y., Ikeda, M., ... & Kamatani, Y. (2017). Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. Nature Genetics, 49(10), 1458-1467.

Akiyama, M., Ishigaki, K., Sakaue, S., Momozawa, Y., Horikoshi, M., Hirata, M., ... & Kamatani, Y. (2019). Characterizing rare and low-frequency height-associated variants in the Japanese population. Nature Communications, 10(1), 4393.