Manhattan plot and QQ plot

GWASLab provides a customizable plotting function for Manhattan and Q-Q plots.

.plot_mqq()

A simple example

Quick Manhattan and Q-Q plot without any options

mysumstats.plot_mqq()

image

See other examples here.

Options

By setting the options, you can create highly customized Manhattan plots and Q-Q plots.

A customized Manhattan and QQ plot

mysumstats.plot_mqq(
                  mode="qqm",
                  cut=14,
                  skip=3, 
                  anno_set=["rs12509595","rs7989823"] ,
                  pinpoint=["rs7989823"], 
                  highlight=["rs12509595","19:15040733:T:C"],
                  highlight_windowkb =1000,
                  stratified=True,
                  marker_size=(5,10),
                  figargs={"figsize":(15,5),"dpi":300})

image

Plot layout

Option DataType Description Default
mode string Layout of the Manhattan plot and Q-Q plot.
"mqq": Manhattan plot on the left, Q-Q plot on the right
"qqm": Q-Q plot on the left, Manhattan plot on the right
"m": Manhattan plot only
"qq": Q-Q plot only
"mqq"
mqqratio float Width ratio of the Manhattan plot to the Q-Q plot 3

Layout

image


Use MLOG10P for extreme P values

Option DataType Description Default
scaled boolean By default, GWASLab uses P values for the mqq plot. Set scaled=True to plot MLOG10P instead. False

Variants with extreme P values

To plot variants with extreme P values (P < 1e-300), set scaled=True so that the plot is created from MLOG10P instead of raw P values. To calculate MLOG10P for extreme P values from BETA/SE or Z scores, use mysumstats.fill_data(to_fill=["MLOG10P"], extreme=True). For details, see the "Extreme P values" section in https://cloufield.github.io/gwaslab/Conversion/.
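
As a rough illustration of why MLOG10P is needed, the sketch below computes a two-sided -log10(P) directly from a Z score using only the Python standard library. The helper name mlog10p_from_z and the switch-over point of 30 are assumptions made for this example. It is not GWASLab's implementation.

```python
import math

def mlog10p_from_z(z, switch=30.0):
    """Two-sided -log10(P) for a Z score (hypothetical helper, not a GWASLab function)."""
    x = abs(z) / math.sqrt(2.0)
    if abs(z) < switch:
        # erfc(x) is the two-sided normal tail probability and is still representable here
        return -math.log10(math.erfc(x))
    # For large |z| the raw P value underflows, so stay in log space and use the
    # asymptotic expansion erfc(x) ~ exp(-x^2) / (x*sqrt(pi)) * (1 - 1/(2x^2) + 3/(4x^4))
    series = 1.0 - 1.0 / (2.0 * x**2) + 3.0 / (4.0 * x**4)
    ln_p = -x**2 - math.log(x * math.sqrt(math.pi)) + math.log(series)
    return -ln_p / math.log(10.0)

print(mlog10p_from_z(5))    # moderate Z score: computed from the P value itself
print(mlog10p_from_z(40))   # extreme Z score: P is below 1e-300, -log10(P) is still finite
```

Values obtained this way can only be drawn on the -log10 scale, which is why the plot must be built from MLOG10P rather than from P in such cases.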

X axis: Physical position or rank

Option DataType Description Default
use_rank boolean If True, GWASLab will use position rank instead of physical base-pair positions for the x axis. False

Note

If rank is used, there will be no gaps in the plot. If base-pair positions are used, regions of the chromosome without variants, such as heterochromatin, may appear as gaps in the plot.
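
The difference between the two x-axis modes can be sketched with a toy example. The variant IDs and positions below are made up, and the code only illustrates the idea of ranking by position; it is not taken from GWASLab.

```python
# Toy variants on one chromosome: (SNPID, base-pair position); values are made up
variants = [("v1", 1_000), ("v2", 2_000), ("v3", 50_000_000), ("v4", 50_001_000)]

# Physical positions: the long stretch without variants stays visible as a gap
x_physical = [pos for _, pos in variants]

# Rank: variants are spaced evenly in position order, so the gap disappears
order = sorted(range(len(variants)), key=lambda i: variants[i][1])
x_rank = [0] * len(variants)
for rank, i in enumerate(order):
    x_rank[i] = rank

print(x_physical)
print(x_rank)
```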

Y axis: Skip "low" and shrink "high"

Option DataType Description Default
skip float Minimum -log10(P) for a variant to be plotted. It is often unnecessary to plot all variants; for example, skip=3 omits variants with -log10(P) lower than 3 from the plot. The calculation of lambda GC is not affected. None
cut float Loci with extremely large -log10(P) values are likely to dwarf other significant loci, so -log10(P) values above this threshold are scaled down. None
cutfactor float Shrinkage factor applied to -log10(P) values above cut 10
cut_line_color string Color of the line above which the y axis is rescaled. "#ebebeb"
sig_level float genome-wide significance threshold 5e-8
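
The combined effect of skip, cut, and cutfactor on the plotted y values can be sketched as follows. The rescaling formula used here (the excess above cut is divided by cutfactor) is an assumption chosen for illustration and may differ from GWASLab's exact transformation; plotted_y is a hypothetical helper.

```python
def plotted_y(mlog10p, skip=None, cut=None, cutfactor=10):
    """Return the y value to draw, or None if the variant is skipped (illustrative only)."""
    if skip is not None and mlog10p < skip:
        return None                      # below skip: not drawn at all
    if cut is not None and mlog10p > cut:
        # above cut: compress the excess so that one huge peak does not dwarf the rest
        return cut + (mlog10p - cut) / cutfactor
    return mlog10p

values = [1.2, 5.0, 14.0, 34.0, 214.0]
print([plotted_y(v, skip=3, cut=14) for v in values])
# [None, 5.0, 14.0, 16.0, 34.0]
```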

Auxiliary lines

image

Note

The lambda GC calculation for the Q-Q plot is not affected by skip or cut. It is conducted using all variants in the original dataset.
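
To see why lambda GC has to be computed on all variants, the sketch below uses the standard definition (median observed 1-df chi-square statistic divided by its expected median under the null, about 0.455) on simulated null P values. The function name lambda_gc and the simulated data are assumptions for this example, not GWASLab code.

```python
import random
from statistics import NormalDist, median

def lambda_gc(p_values):
    """Genomic inflation factor from two-sided P values (hypothetical helper)."""
    nd = NormalDist()
    # convert each P value to a 1-df chi-square statistic: chi2 = z^2
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # expected median of a 1-df chi-square distribution under the null
    expected_median = nd.inv_cdf(0.25) ** 2
    return median(chi2) / expected_median

random.seed(1)
p_all = [1.0 - random.random() for _ in range(10_000)]   # null P values in (0, 1]
p_plotted = [p for p in p_all if p < 1e-1]               # what would remain with skip=1

print(round(lambda_gc(p_all), 2))       # close to 1 for null data
print(round(lambda_gc(p_plotted), 2))   # badly inflated if only plotted variants were used
```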


Annotation

Option DataType Description Default
anno boolean or string or "GENENAME" If anno=True, variants will be annotated with chr:pos. If a string, it is the name of the column used for annotation. If "GENENAME", the nearest gene names are annotated automatically using pyensembl (remember to specify build; the default is build="19"). False
anno_set list If you want to annotate only a few specific variants, you can simply provide a list of SNPIDs or rsIDs for annotation. If None, the variants to annotate will be selected automatically using a sliding window with windowsize=500kb. None
repel_force float If annotations overlap with each other, try increasing repel_force to increase the padding between them. 0.01
anno_alias dict snpid:text dictionary for customized annotation None
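
The automatic selection of variants to annotate can be pictured with a simplified sliding-window pass over the significant variants. This is a sketch under assumptions: the function pick_leads, the tuple layout, and the toy data are hypothetical, and GWASLab's own lead-variant extraction may differ in its details.

```python
def pick_leads(variants, sig_level=5e-8, windowsizekb=500):
    """variants: list of (snpid, chrom, pos, p). Returns one lead SNPID per locus (sketch)."""
    window = windowsizekb * 1000
    # keep significant variants only, ordered by chromosome and position
    sig = sorted((v for v in variants if v[3] < sig_level), key=lambda v: (v[1], v[2]))
    leads, current = [], None
    for v in sig:
        if current is not None and v[1] == current[1] and v[2] - current[2] <= window:
            # still inside the current locus: keep whichever variant is more significant
            if v[3] < current[3]:
                current = v
        else:
            # a new locus starts: store the previous lead, if any
            if current is not None:
                leads.append(current[0])
            current = v
    if current is not None:
        leads.append(current[0])
    return leads

toy = [
    ("rsA", 1, 1_000_000, 1e-9),
    ("rsB", 1, 1_200_000, 1e-12),
    ("rsC", 1, 5_000_000, 1e-10),
    ("rsD", 2, 1_000_000, 1e-8),
    ("rsE", 2, 3_000_000, 0.2),
]
print(pick_leads(toy))   # ['rsB', 'rsC', 'rsD']
```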

Repel force

image

Skip variants with -log10P<3 and annotate the lead variants with chr:pos

mysumstats.plot_mqq(skip=3,anno=True)

image

Skip variants with -log10P<3 and annotate the lead variants with GENENAME

mysumstats.plot_mqq(skip=3,anno="GENENAME",build="19")

image

Skip variants with -log10P<3 and annotate the variants in anno_set

mysumstats.plot_mqq(skip=3, anno_set=["rs12509595","19:15040733:T:C"])

image

Skip variants with -log10P<3 and annotate the variants in anno_set with alias in anno_alias

mysumstats.plot_mqq(skip=3, anno_set=["rs12509595","19:15040733:T:C"], anno_alias={"rs12509595":"anything you want here"})

image

Annotation style

GWASLab now supports three annotation styles:

  • expand
  • right
  • tight

anno_style="expand"

image

anno_style="right"

image

anno_style="tight"

image

Adjust arm positions

Option DataType Description Default
anno_d dict Key is the index of the arm, starting from 0; value is the direction in which the arm should shift. For example, anno_d = {4:"r"} shifts the arm with index 4 to the right None
arm_offset float Distance in points 500
arm_scale float Factor to adjust the height of all arms 1.0
arm_scale_d dict Factors to adjust the height of specific arms. Key is the index of the arm, starting from 0; value is the factor by which the arm height is multiplied. None

Shift the arm with index 1 to the left and the arm with index 3 to the right

mysumstats.plot_mqq(skip=2,anno=True)

image

mysumstats.plot_mqq(skip=2,anno=True,          
                    anno_d={1:"l",3:"r"},
                    arm_offset=50)

image

Adjust the length of arm

mysumstats.plot_mqq(skip=2,anno=True,arm_scale=1.5)

image

Adjust the length of arm for each variant

mysumstats.plot_mqq(skip=2,anno=True,arm_scale_d={1:1.5,2:1.2,3:1.1})

image


Highlight loci

Highlight specified loci (color all variants in a region by specifying variants and the length of flanking regions).

Highlighting Option DataType Description Default
highlight list A list of SNPIDs or rsIDs; these loci (all variants within +/- highlight_windowkb of the specified variants) will be highlighted in highlight_color True
highlight_windowkb int Span of the highlighted region on each side of the specified variants, in kb 500
highlight_color string Color for highlighting loci "#CB132D"
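
Which variants end up highlighted can be sketched as a simple distance check around each specified variant. The helper in_highlight_window, the tuple layout, and the toy variants are assumptions for illustration and do not come from GWASLab.

```python
def in_highlight_window(variants, highlight, highlight_windowkb=500):
    """Return SNPIDs within +/- highlight_windowkb of any variant in `highlight` (sketch)."""
    window = highlight_windowkb * 1000
    # positions of the variants named in `highlight`
    focal = [(chrom, pos) for snpid, chrom, pos in variants if snpid in highlight]
    return [
        snpid
        for snpid, chrom, pos in variants
        if any(chrom == c and abs(pos - p) <= window for c, p in focal)
    ]

# (SNPID, chromosome, position); made-up values
toy = [
    ("lead", 4, 10_000_000),
    ("near", 4, 10_400_000),
    ("far", 4, 12_000_000),
    ("other_chr", 5, 10_000_000),
]
print(in_highlight_window(toy, ["lead"]))                            # ['lead', 'near']
print(in_highlight_window(toy, ["lead"], highlight_windowkb=2000))   # ['lead', 'near', 'far']
```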

Pinpoint variants

Pinpoint certain variants in the Manhattan plot.

Pinpointing Option DataType Description Default
pinpoint list a list of SNPID or rsID; these variants will be highlighted in pinpoint_color True
pinpoint_color string Color for pinpointing variants "red"

Highlight loci and pinpoint variants

mysumstats.plot_mqq(skip=3,anno="GENENAME",build="19",
                   highlight=["rs12509595","rs7989823"],
                   pinpoint=["rs671","19:15040733:T:C"])

image


Lines

Line Option DataType Description Default
sig_line boolean If True, plot the significance threshold line True
sig_level float The significance threshold 5e-8
sig_level_lead float The significance threshold for extracting lead variants to annotate 5e-8
sig_line_color string Color of the significance threshold line "grey"
suggestive_sig_line boolean If True, plot the suggestive threshold line True
suggestive_sig_level float The suggestive threshold 5e-6
suggestive_sig_line_color string Suggestive level line color "grey"
additional_line list list of P values used to plot additional lines None
additional_line_color list list of colors for the additional lines None
cut_line_color string Color of the cut line "#ebebeb"
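
Because the thresholds are given as P values while the y axis shows -log10(P), each line sits at the -log10 of its threshold. The short sketch below only does this conversion; the dictionary is a made-up container for the example, and 1e-40 is an arbitrary value for an additional line.

```python
import math

# thresholds as P values: two defaults from the table above plus one arbitrary extra line
lines = {"sig_level": 5e-8, "suggestive_sig_level": 5e-6, "additional_line": 1e-40}

# each line is drawn where -log10(P) equals the threshold on the y axis
y_positions = {name: -math.log10(p) for name, p in lines.items()}
for name, y in y_positions.items():
    print(f"{name}: y = {y:.2f}")
```

With the default sig_level of 5e-8, the significance line therefore appears at about 7.3 on the y axis.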

Plot lines

mysumstats.plot_mqq(skip=3,
                build="19",
                anno="GENENAME",
                windowsizekb=1000000,
                cut=20,
                cut_line_color="purple",
                sig_level=5e-8,  
                sig_level_lead=1e-6, 
                sig_line_color="grey",
                suggestive_sig_line = True,
                suggestive_sig_level = 1e-6,
                suggestive_sig_line_color="blue",
                additional_line=[1e-40,1e-60],
                additional_line_color=["yellow","green"])
image

MAF-stratified QQ plot

QQ plot Option DataType Description Default
stratified boolean If True, plot a MAF-stratified Q-Q plot. Requires EAF in sumstats. False
maf_bins list MAF bins for stratification. [(0, 0.01), (0.01, 0.05), (0.05, 0.25),(0.25,0.5)]
maf_bin_colors list colors used for each MAF bin. ["#f0ad4e","#5cb85c", "#5bc0de","#000042"]
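
How EAF maps onto the default MAF bins can be sketched as below. The helper maf_bin is hypothetical, and the treatment of bin edges (open on the left, closed on the right) is an assumption; GWASLab may handle boundaries differently.

```python
maf_bins = [(0, 0.01), (0.01, 0.05), (0.05, 0.25), (0.25, 0.5)]

def maf_bin(eaf, bins=maf_bins):
    """Return the index of the MAF bin for an effect allele frequency (sketch)."""
    maf = min(eaf, 1.0 - eaf)            # MAF is the frequency of the less common allele
    for i, (low, high) in enumerate(bins):
        if low < maf <= high:            # assumption: bins are open on the left, closed on the right
            return i
    return None                          # e.g. monomorphic variants with MAF == 0

for eaf in (0.005, 0.97, 0.2, 0.5):
    print(eaf, maf_bins[maf_bin(eaf)])
```

Each bin would then be drawn as its own Q-Q curve in the corresponding color from maf_bin_colors.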

MAF-stratified Q-Q plot

image


Colors and Fontsizes

mysumstats.plot_mqq(
          colors=["#597FBD","#74BAD3"],
          cut_line_color="#ebebeb",
          sig_line_color="grey",
          highlight_color="#CB132D",
          pinpoint_color ="red",
          maf_bin_colors = ["#f0ad4e","#5cb85c", "#5bc0de","#000042"],
          fontsize = 10,
          anno_fontsize = 10,
          title_fontsize = 13,
          marker_size=(5,25)
)

Color-related options

Color Option DataType Description Default
colors list A list of colors for chromosomes in the Manhattan plot; the list is cycled through repeatedly. ["#597FBD","#74BAD3"]
cut_line_color string color for the cut line. "#EBEBEB"
sig_line_color string color for significance threshold line. "grey"
highlight_color string color for highlighting loci "#CB132D"
pinpoint_color string color for pinpointing variants "red"
maf_bin_colors list a list of colors for maf-stratified Q-Q plot. ["#f0ad4e","#5cb85c", "#5bc0de","#000042"]
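
The repeated use of the colors list can be pictured with itertools.cycle. This is only an illustration of the cycling behavior described in the table, with a toy set of five chromosomes; it is not GWASLab's plotting code.

```python
from itertools import cycle

colors = ["#597FBD", "#74BAD3"]          # default palette from the table above
chromosomes = list(range(1, 6))          # chromosomes 1 to 5 as a toy example

# pair each chromosome with the next color, starting over when the list is exhausted
chrom_color = dict(zip(chromosomes, cycle(colors)))
print(chrom_color)
```

With two colors, odd and even chromosomes alternate; a longer list simply repeats after its last entry.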

Font-related options

Font Option DataType Description Default
fontsize int Font size for tick labels. 9
title_fontsize int Font size for the title. 13
anno_fontsize int Font size for annotations. 9
font_family string font family "Arial"

Example

import seaborn as sns  # needed for sns.color_palette

mysumstats.plot_mqq(skip=2,
                    cut=20,
                    colors=sns.color_palette("Set3"),
                    sig_line_color="red",
                    fontsize = 8)
image

Titles

mysumstats.plot_mqq(
          title =None,
          mtitle=None,
          qtitle=None,
          title_pad=1.08
        )
Title Option DataType Description Default
title string Title for the figure. None
mtitle string Title for the Manhattan plot None
qtitle string Title for the Q-Q plot None
title_pad float padding for title 1.08

image


Figure settings

figargs= {"figsize":(15,5),"dpi":100}
Figure Option DataType Description Default
figargs dict key-values pairs that are passed to matplotlib plt.subplots() {"figsize":(15,5),"dpi":200}

Commonly used ones:

  • figsize : figure size
  • dpi : dots per inch. For publications, dpi>=300 is one of the common criteria.
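
The relationship between figsize, dpi, and the size of a saved raster image is plain arithmetic, sketched below. The helper pixel_size is hypothetical, and the result is nominal: options such as a tight bounding box when saving can change the final dimensions.

```python
def pixel_size(figsize, dpi):
    """Nominal pixel dimensions of a raster figure: inches multiplied by dots per inch."""
    width_in, height_in = figsize
    return round(width_in * dpi), round(height_in * dpi)

print(pixel_size((15, 5), 200))   # (3000, 1000)
print(pixel_size((15, 5), 300))   # (4500, 1500)
```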

Saving plots

mysumstats.plot_mqq(save="mymqqplots.png",save_args={"dpi":400,"facecolor":"white"})

Two options for saving plots in .plot_mqq

Saving Option DataType Description Default
save string or boolean If a string, the plot will be saved to the specified path; if True, it will be saved to the default path True
save_args dict other parameters passed to matplotlib savefig function. {"dpi":300,"facecolor":"white"}

Example

  • Save as PNG: mysumstats.plot_mqq(save="mymqqplots.png",save_args={"dpi":300})
  • Save as PDF: mysumstats.plot_mqq(save="mymqqplots.pdf",save_args={"dpi":300})