Scatter & Distribution plot : allele frequency comparison

!!! info "Available from v3.4.15"

.check_af()

#check the difference between the EAF in the sumstats and the allele frequency in VCF files
sumstats.check_af()
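To make the quantity being checked concrete, here is a minimal plain-Python sketch of a per-variant allele frequency difference. It is not GWASLab's implementation: the helper name `allele_frequency_difference`, the variant IDs, the frequencies, and the choice to return `None` when a variant has no reference frequency are all assumptions made for illustration.

```python
# Illustrative sketch only; this is NOT how GWASLab implements check_af().

def allele_frequency_difference(eaf, ref_af):
    """Return EAF minus the reference allele frequency.

    Returns None when either value is missing (an assumption of this
    sketch, e.g. for a variant that is absent from the reference VCF).
    """
    if eaf is None or ref_af is None:
        return None
    return eaf - ref_af


# Hypothetical toy data: variant ID -> (EAF in sumstats, frequency in reference VCF)
variants = {
    "1:1000_A_G": (0.25, 0.25),  # frequencies agree
    "1:2000_C_T": (0.75, 0.25),  # large discrepancy, e.g. a possible allele flip
    "1:3000_G_A": (0.5, None),   # not found in the reference
}

daf = {
    variant_id: allele_frequency_difference(eaf, ref_af)
    for variant_id, (eaf, ref_af) in variants.items()
}

for variant_id, difference in daf.items():
    print(variant_id, difference)
```

A difference close to 0 means the sumstats and the reference agree; a large positive or negative value marks a variant worth inspecting.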

.plot_daf()

# allele frequency correlation plot
sumstats.plot_daf()

You need to run check_af() before plotting. For details, see the check_af() documentation.

Options for plot_daf():

threshold: float. The threshold on the allele frequency difference used to flag variants as outliers.
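The effect of a threshold can be sketched in a few lines of plain Python. This sketch assumes the threshold is compared against the absolute difference between the two frequencies, which may not match what plot_daf() does internally. The function `find_outliers`, the record keys, and the toy values are hypothetical.

```python
# Illustrative sketch only; assumes outliers are variants whose absolute
# frequency difference exceeds the threshold. Not GWASLab's implementation.

def find_outliers(records, threshold):
    """Return the records whose |eaf - ref_af| is greater than threshold."""
    return [
        record
        for record in records
        if abs(record["eaf"] - record["ref_af"]) > threshold
    ]


# Hypothetical toy records
records = [
    {"snpid": "rs_a", "eaf": 0.5, "ref_af": 0.5},      # difference 0.0
    {"snpid": "rs_b", "eaf": 0.75, "ref_af": 0.5},     # difference 0.25
    {"snpid": "rs_c", "eaf": 0.5, "ref_af": 0.4375},   # difference 0.0625
]

outlier_ids = [r["snpid"] for r in find_outliers(records, threshold=0.12)]
print(outlier_ids)
```

Under this assumption, a lower threshold flags more variants, so the value is a trade-off between catching real frequency mismatches and tolerating normal differences between the study sample and the reference panel.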

Examples

import gwaslab as gl

# load the first 10,000 rows of the sumstats
mysumstats = gl.Sumstats("t2d_bbj.txt.gz",
             snpid="SNP",
             chrom="CHR",
             pos="POS",
             ea="ALT",
             nea="REF",
             neaf="Frq",
             beta="BETA",
             se="SE",
             p="P",
             direction="Dir",
             n="N",nrows=10000)

# harmonize
mysumstats.harmonize(basic_check = True, 
                     ref_seq=gl.get_path("ucsc_genome_hg19"))

# check the difference in allele frequency with reference vcf
mysumstats.check_af(ref_infer=gl.get_path("1kg_eas_hg19"), 
                    ref_alt_freq="AF",
                    n_cores=2)

# plot and get the outliers
outliers = mysumstats.plot_daf(threshold=0.12, 
                                save="af_correlation.png",
                                save_args={"dpi":300})

outliers[1]