Manhattan and Q-Q plot
Load gwaslab
Load data into Sumstats Object
Example
| SNPID | CHR | POS | EA | NEA | STATUS | EAF | P |
|---|---|---|---|---|---|---|---|
| 1:725932_G_A | 1 | 725932 | G | A | 1995999 | 0.9960 | 0.5970 |
| 1:725933_A_G | 1 | 725933 | G | A | 1995999 | 0.0040 | 0.5973 |
| 1:737801_T_C | 1 | 737801 | C | T | 1995999 | 0.0051 | 0.6908 |
| 1:749963_T_TAA | 1 | 749963 | TAA | T | 1995999 | 0.8374 | 0.2846 |
| 1:751343_T_A | 1 | 751343 | T | A | 1995999 | 0.8593 | 0.2705 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| X:154874837_A_G | 23 | 154874837 | G | A | 1995999 | 0.7478 | 0.5840 |
| X:154875192_GTACTC_G | 23 | 154875192 | GTACTC | G | 1995999 | 0.2525 | 0.5612 |
| X:154879115_A_G | 23 | 154879115 | G | A | 1995999 | 0.7463 | 0.5646 |
| X:154880669_T_A | 23 | 154880669 | T | A | 1995999 | 0.2558 | 0.5618 |
| X:154880917_C_T | 23 | 154880917 | C | T | 1995999 | 0.2558 | 0.5570 |
Create Manhattan plot and QQ plot
stdout:
2025/12/25 23:38:18 Configured plot style for plot_mqq:mqq
2025/12/25 23:38:18 Starting Manhattan-QQ plot creation (Version v4.0.0)
2025/12/25 23:38:18 -Genomic coordinates are based on GRCh37/hg19...
2025/12/25 23:38:18 - Genomic coordinates version: 19 ...
2025/12/25 23:38:18 - Genome-wide significance level to plot is set to 5e-08 ...
2025/12/25 23:38:18 - Input sumstats contains 12557761 variants...
2025/12/25 23:38:18 - Manhattan-QQ plot layout mode selected: mqq
2025/12/25 23:38:19 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:38:20 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:38:20 Finished loading specified columns from the statistics
2025/12/25 23:38:20 Start data conversion and sanity check:
2025/12/25 23:38:25 -Sumstats P values are being converted to -log10(P)...
2025/12/25 23:38:26 -Converting data above cut line...
2025/12/25 23:38:26 -Maximum -log10(P) value is 167.58838029403677 .
2025/12/25 23:38:26 Finished data conversion and sanity check.
2025/12/25 23:38:26 Start to create Manhattan-QQ plot with 332882 variants...
2025/12/25 23:38:26 -Creating background plot...
2025/12/25 23:38:27 Finished creating Manhattan-QQ plot successfully
2025/12/25 23:38:27 Start to extract variants for annotation...
2025/12/25 23:38:27 -Found 89 significant variants with a sliding window size of 500 kb...
2025/12/25 23:38:27 Finished extracting variants for annotation...
2025/12/25 23:38:27 Start to create QQ plot with 332882 variants:
2025/12/25 23:38:27 -Plotting all variants...
2025/12/25 23:38:27 -Expected range of P: (0,1.0)
2025/12/25 23:38:28 -Lambda GC (MLOG10P mode) at 0.5 is 1.21283
2025/12/25 23:38:28 Finished creating QQ plot successfully!
2025/12/25 23:38:28 Finished creating plot successfully
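
The log above reports a Lambda GC "at 0.5", that is, at the median. As a rough illustration of the conventional definition (median observed 1-df chi-square statistic divided by the expected median under the null), here is a minimal standard-library sketch. It is not GWASLab's code; the function name `lambda_gc` and the evenly spaced P values are hypothetical.

```python
from statistics import NormalDist, median

def lambda_gc(p_values):
    """Median observed chi-square (1 df) divided by the expected median."""
    norm = NormalDist()
    # A two-sided P value p corresponds to chi2 = z**2 with z = Phi^-1(p / 2).
    chisq = [norm.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a chi-square distribution with 1 df (about 0.4549).
    expected_median = norm.inv_cdf(0.25) ** 2
    return median(chisq) / expected_median

# Hypothetical P values evenly spaced on (0, 1), i.e. a null-like distribution.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = lambda_gc(null_like)  # close to 1 when there is no inflation
```

A value well above 1, such as the 1.21283 in the log, means the median test statistic is larger than expected under the null.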

gl.plot_mqq() Options
Layout mode
Plotting all variants may take several minutes. Use skip to omit variants with low -log10(P) (MLOG10P) from the plot.
Note: use verbose=False to stop printing the log, and check=False to skip the sanity check for mqq plots.
Four layout modes are available:
mode="mqq" (default), mode="qqm", mode="qq", and mode="m"
stdout:
2025/12/25 23:38:32 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:38:34 -CHR data type is already numeric. Skipping conversion...

stdout:
2025/12/25 23:38:40 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:38:42 -CHR data type is already numeric. Skipping conversion...

stdout:
2025/12/25 23:38:48 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:38:49 -CHR data type is already numeric. Skipping conversion...


Y axis
skip
skip: omit variants whose -log10(P) is below this value from the plot
stdout:
2025/12/25 23:39:30 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:39:32 -CHR data type is already numeric. Skipping conversion...
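
To make the effect of skip concrete, the sketch below filters a handful of P values taken from the tables in this tutorial by their -log10(P). It is a plain-Python illustration, not GWASLab's implementation. The helper name `keep_for_plot` is hypothetical, and whether the boundary value itself is kept is an assumption.

```python
import math

def keep_for_plot(p_values, skip):
    """Return (p, -log10(p)) pairs whose -log10(P) is at least `skip`."""
    kept = []
    for p in p_values:
        mlog10p = -math.log10(p)
        if mlog10p >= skip:  # assumption: the boundary value is kept
            kept.append((p, mlog10p))
    return kept

p_values = [0.5970, 0.2846, 2.616e-08, 2.58e-168]
kept = keep_for_plot(p_values, skip=3)  # only the two small P values remain
```

Most variants in a GWAS have unremarkable P values, so a modest threshold removes most points from the plot while leaving the visible peaks unchanged.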

cut
cut: scale down -log10(P) for variants above the given threshold, so that very strong signals do not compress the rest of the plot
stdout:
2025/12/25 23:39:36 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:39:38 -CHR data type is already numeric. Skipping conversion...
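
The logs in this tutorial describe the rescaling as "values above 20 will be shrunk with a shrinkage factor of 10". One plausible reading is sketched below: the part of -log10(P) that exceeds cut is divided by the factor. This is an assumption about the arithmetic, not a copy of GWASLab's code, and the names `shrink_above_cut` and `cutfactor` are used here purely for illustration.

```python
def shrink_above_cut(mlog10p, cut=20, cutfactor=10):
    """Compress the portion of -log10(P) that lies above `cut`."""
    if mlog10p <= cut:
        return mlog10p  # values at or below the cut line are untouched
    return cut + (mlog10p - cut) / cutfactor

# The maximum -log10(P) reported in the logs of this tutorial.
top = shrink_above_cut(167.58838029403677)  # about 34.76 on the rescaled axis
unchanged = shrink_above_cut(5.0)           # 5.0
```

Under this reading, the strongest signal is drawn at roughly 34.8 instead of 167.6, which leaves more vertical room for loci near the significance line.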

Make the Y axis jagged to indicate that it has been rescaled.
jagged
stdout:
2025/12/25 23:41:06 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:07 -CHR data type is already numeric. Skipping conversion...

X axis
use_rank
use_rank: if True, GWASLab uses the rank of each variant's position instead of its physical base-pair position for the x axis.
With use_rank=True, there are no gaps along the x axis.
stdout:
2025/12/25 23:41:12 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:13 -CHR data type is already numeric. Skipping conversion...
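
The idea behind rank-based coordinates can be sketched in a few lines: sort the variants by chromosome and position, then use each variant's index in that order as its x coordinate. This is only an illustration of the concept. The helper `rank_positions` is hypothetical and is not part of GWASLab.

```python
def rank_positions(variants):
    """Map (chrom, pos) pairs to consecutive integer x coordinates."""
    ordered = sorted(set(variants))  # sort by chromosome, then position
    return {variant: rank for rank, variant in enumerate(ordered)}

variants = [(1, 725932), (1, 751343), (23, 154874837), (1, 737801)]
x = rank_positions(variants)
# Neighbouring variants are always one unit apart, whatever their
# physical distance, so sparse regions leave no gaps.
```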

xtight
xtight=True can be used to remove the padding
stdout:
2025/12/25 23:41:18 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:19 -CHR data type is already numeric. Skipping conversion...

chrpad
chrpad: set the space between adjacent chromosomes to max(POS) * chrpad
Example
stdout:
2025/12/25 23:41:23 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:25 -CHR data type is already numeric. Skipping conversion...
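
As a rough picture of what max(POS) * chrpad means, the sketch below places chromosomes side by side on one axis and separates them by that amount. It assumes max(POS) is the largest position across all chromosomes. The helper name, the toy chromosome lengths, and the default value are hypothetical.

```python
def chromosome_offsets(max_pos_by_chr, chrpad=0.03):
    """Return the x offset at which each chromosome starts."""
    pad = max(max_pos_by_chr.values()) * chrpad  # gap between chromosomes
    offsets = {}
    running = 0.0
    for chrom in sorted(max_pos_by_chr):
        offsets[chrom] = running
        running += max_pos_by_chr[chrom] + pad
    return offsets

# Toy chromosome lengths, chosen only to keep the arithmetic easy to follow.
offsets = chromosome_offsets({1: 1000, 2: 800, 3: 500}, chrpad=0.1)
# pad = 1000 * 0.1 = 100, so chromosome 2 starts at 1100 and chromosome 3 at 2000
```

A variant's plotted x coordinate is then its chromosome offset plus its own position, so a larger chrpad widens every gap by the same amount.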

Annotation
anno=True
anno=True : annotate all lead variants with chr:pos
stdout:
2025/12/25 23:41:29 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:31 -CHR data type is already numeric. Skipping conversion...

Because there are so many loci, annotating all of them would clutter the plot. Let's annotate only the loci with P<1e-20 by specifying anno_sig_level=1e-20.
Example
stdout:
2025/12/25 23:41:36 Configured plot style for plot_mqq:m
2025/12/25 23:41:36 Starting Manhattan plot creation (Version v4.0.0)
2025/12/25 23:41:36 -Genomic coordinates are based on GRCh37/hg19...
2025/12/25 23:41:36 - Genomic coordinates version: 19 ...
2025/12/25 23:41:36 - Genome-wide significance level to plot is set to 5e-08 ...
2025/12/25 23:41:36 - Input sumstats contains 12557761 variants...
2025/12/25 23:41:36 - Manhattan plot layout mode selected: m
2025/12/25 23:41:36 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:38 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:41:38 Finished loading specified columns from the statistics
2025/12/25 23:41:38 Start data conversion and sanity check:
2025/12/25 23:41:38 -Sanity check will be skipped.
2025/12/25 23:41:39 -Sumstats P values are being converted to -log10(P)...
2025/12/25 23:41:41 -Converting data above cut line...
2025/12/25 23:41:41 -Maximum -log10(P) value is 167.58838029403677 .
2025/12/25 23:41:41 -Minus log10(P) values above 20 will be shrunk with a shrinkage factor of 10...
2025/12/25 23:41:41 Finished data conversion and sanity check.
2025/12/25 23:41:41 Start to create Manhattan plot with 91234 variants...
2025/12/25 23:41:41 -Creating background plot...
2025/12/25 23:41:41 Finished creating Manhattan plot successfully
2025/12/25 23:41:41 Start to extract variants for annotation...
2025/12/25 23:41:41 -Found 7 significant variants with a sliding window size of 500 kb...
2025/12/25 23:41:41 Finished extracting variants for annotation...
2025/12/25 23:41:41 -Annotating using column CHR:POS...
2025/12/25 23:41:41 -Adjusting text positions with repel_force=0.03...
2025/12/25 23:41:41 Finished creating plot successfully
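
The effect of anno_sig_level can be pictured as a simple filter over the lead variants, followed by building the CHR:POS labels mentioned in the log. The sketch below uses a few rows from the lead-variant table later in this tutorial. The helper `labels_to_annotate` is hypothetical, and the strict `<` comparison is an assumption.

```python
# Excerpt of lead variants (SNPID, CHR, POS, P) from the table in this tutorial.
leads = [
    ("11:2858546_C_T", 11, 2858546, 2.580e-168),
    ("9:22132729_A_G", 9, 22132729, 9.848e-88),
    ("X:152908887_G_A", 23, 152908887, 9.197e-58),
    ("6:7226959_C_T", 6, 7226959, 2.849e-08),
    ("17:40913366_C_T", 17, 40913366, 4.159e-08),
]

def labels_to_annotate(leads, anno_sig_level=5e-8):
    """Build CHR:POS labels for lead variants with P below the threshold."""
    return [f"{chrom}:{pos}" for _, chrom, pos, p in leads if p < anno_sig_level]

labels = labels_to_annotate(leads, anno_sig_level=1e-20)
# Only the three strongest signals in this excerpt are labelled.
```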

anno="GENENAME"
anno="GENENAME": automatically annotate the nearest gene name
Note: remember to set build=19 or build=38 when loading or plotting.
Example
stdout:
2025/12/25 23:41:42 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:44 -CHR data type is already numeric. Skipping conversion...

You can specify anno_gtf_path to use your own GTF file for GENENAME annotation
Example
anno_d
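We can use anno_d to slightly adjust the direction of the annotation arrows.
anno_d accepts a dictionary that maps the index of an annotation to a direction, "left" or "right".
For example, {1: "left"} shifts the arrow of the annotation at index 1 towards the left.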
Example
stdout:
2025/12/25 23:41:53 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:41:54 -CHR data type is already numeric. Skipping conversion...

arm_scale
We can also use arm_scale to adjust how far the annotation text is placed from the variant.
For example, arm_scale=1.2 multiplies the default arm length by a factor of 1.2.
Example
stdout:
2025/12/25 23:41:59 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:42:01 -CHR data type is already numeric. Skipping conversion...

arm_scale_d accepts a dictionary that maps the index of an annotation to its own arm_scale.
For example, {1: 1.2} scales the arm of the second annotation (index 1) by a factor of 1.2.
Example
stdout:
2025/12/25 23:42:07 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:42:08 -CHR data type is already numeric. Skipping conversion...

anno_style
GWASLab provides three annotation styles:
anno_style="right", anno_style="expand", and anno_style="tight"
Example
stdout:
2025/12/25 23:42:14 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:42:15 -CHR data type is already numeric. Skipping conversion...

Example
stdout:
2025/12/25 23:42:21 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:42:22 -CHR data type is already numeric. Skipping conversion...

anno_set
If we want to annotate only a subset of variants, we can pass a list of variant IDs to anno_set.
Let's check all lead variants and select only two to annotate.
| SNPID | CHR | POS | EA | NEA | STATUS | EAF | P |
|---|---|---|---|---|---|---|---|
| 11:2858546_C_T | 11 | 2858546 | C | T | 1995999 | 0.6209 | 2.580000e-168 |
| 9:22132729_A_G | 9 | 22132729 | G | A | 1995999 | 0.4367 | 9.848000e-88 |
| 6:20688121_T_A | 6 | 20688121 | T | A | 1995999 | 0.5758 | 2.062000e-85 |
| 7:127253550_C_T | 7 | 127253550 | C | T | 1995999 | 0.9081 | 4.101000e-74 |
| X:152908887_G_A | 23 | 152908887 | G | A | 1995999 | 0.6792 | 9.197000e-58 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| X:21569920_A_G | 23 | 21569920 | G | A | 1995999 | 0.3190 | 2.616000e-08 |
| 6:7226959_C_T | 6 | 7226959 | C | T | 1995999 | 0.6657 | 2.849000e-08 |
| 15:90393949_C_CT | 15 | 90393949 | C | CT | 1995999 | 0.3445 | 3.134000e-08 |
| 1:154309595_TA_T | 1 | 154309595 | TA | T | 1995999 | 0.0947 | 3.289000e-08 |
| 17:40913366_C_T | 17 | 40913366 | C | T | 1995999 | 0.4707 | 4.159000e-08 |
[89 rows x 8 columns]
This time, let's annotate 1:154309595_TA_T and 2:27734972_G_A with their nearest gene names!
anno_set : the set of variants you want to annotate
Example
stdout:
2025/12/25 23:42:55 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:42:56 -CHR data type is already numeric. Skipping conversion...

anno_alias
anno_alias accepts a dictionary of SNPID:string. You can use this to customize the text for annotation.
Example
stdout:
2025/12/25 23:43:03 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:04 -CHR data type is already numeric. Skipping conversion...
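
Conceptually, an alias dictionary works as a lookup with a fallback: a variant listed in the dictionary gets the custom text, and any other variant keeps its default label. The sketch below shows that lookup in plain Python. The helper name and the alias text are hypothetical.

```python
def annotation_text(snpid, default_label, anno_alias=None):
    """Return the custom text for `snpid` if one is given, else the default."""
    return (anno_alias or {}).get(snpid, default_label)

# Hypothetical alias text for one of the lead variants in this tutorial.
anno_alias = {"1:154309595_TA_T": "my custom label"}

custom = annotation_text("1:154309595_TA_T", "1:154309595", anno_alias)
default = annotation_text("11:2858546_C_T", "11:2858546", anno_alias)
```

The keys are SNPIDs, so they must match the values in the SNPID column exactly.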

Highlight loci & Pinpoint variants (single group)
highlight: a list of variants whose loci you want to highlight
pinpoint: a list of variants you want to pinpoint
Example
stdout:
2025/12/25 23:43:11 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:13 -CHR data type is already numeric. Skipping conversion...

Highlight loci & Pinpoint variants (multi-group)
Instead of a list, you can provide a list of lists. Each member list is then a group.
Example
stdout:
2025/12/25 23:43:20 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:22 -CHR data type is already numeric. Skipping conversion...
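
The difference between the two input shapes can be sketched as follows: a flat list is treated as a single group, while a list of lists yields one group per member list, and each group can then be given its own color. This is an illustration only. The helper names and the palette are hypothetical and do not describe how GWASLab assigns colors.

```python
from itertools import cycle

def as_groups(variants):
    """Normalise a flat list or a list of lists into a list of groups."""
    if variants and all(isinstance(item, (list, tuple)) for item in variants):
        return [list(group) for group in variants]
    return [list(variants)]

def color_by_group(variants, palette=("purple", "green", "orange")):
    """Assign one palette color per group, cycling if groups outnumber colors."""
    colors = {}
    for group, color in zip(as_groups(variants), cycle(palette)):
        for snpid in group:
            colors[snpid] = color
    return colors

single = color_by_group(["7:127253550_C_T", "2:27734972_G_A"])
multi = color_by_group([["7:127253550_C_T"], ["2:27734972_G_A", "11:2858546_C_T"]])
```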

MAF-stratified QQ plot
stdout:
2025/12/25 23:43:29 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:30 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:43:30 -EAF data type is already numeric. Skipping conversion...
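
A MAF-stratified QQ plot draws one QQ curve per minor allele frequency bin, where MAF is the smaller of EAF and 1 - EAF. The sketch below assigns variants to four bins. The bin edges are an assumption, chosen to match the four entries of maf_bin_colors used later in this tutorial, and the helper name is hypothetical.

```python
# Assumed bin edges; the bins actually used by the plot may differ.
MAF_BINS = [(0.0, 0.01), (0.01, 0.05), (0.05, 0.25), (0.25, 0.5)]

def maf_bin_index(eaf, bins=MAF_BINS):
    """Return the index of the MAF bin that a variant with this EAF falls into."""
    maf = min(eaf, 1.0 - eaf)
    for index, (low, high) in enumerate(bins):
        is_last = index == len(bins) - 1
        # Half-open bins, except that the last bin includes its upper edge.
        if low <= maf < high or (is_last and maf == high):
            return index
    return None

# EAF values taken from the example table at the top of this tutorial.
bin_indices = [maf_bin_index(eaf) for eaf in (0.9960, 0.0051, 0.8374, 0.7478)]
```

Inflation confined to the rarest bin usually points to a different problem than inflation shared by all bins, which is the reason for stratifying the QQ plot.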

Auxiliary lines
Example
mysumstats.plot_mqq(
    skip=3,
    build="19",
    anno="GENENAME",
    windowsizekb=1000000,
    cut=20,
    cut_line_color="purple",
    sig_level=5e-8,
    anno_sig_level=1e-6,
    sig_line_color="grey",
    suggestive_sig_line=True,
    suggestive_sig_level=1e-6,
    suggestive_sig_line_color="blue",
    additional_line=[1e-40, 1e-60],
    additional_line_color=["yellow", "green"],
    mode="m",
    check=False,
    verbose=False,
)
stdout:
2025/12/25 23:43:38 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:40 -CHR data type is already numeric. Skipping conversion...

Font and marker size
fontsize, anno_fontsize, title_fontsize, and marker_size
Example
stdout:
2025/12/25 23:43:47 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:48 -CHR data type is already numeric. Skipping conversion...

Colors
colors, cut_line_color, sig_line_color, highlight_color, pinpoint_color, and maf_bin_colors
Example
mysumstats.plot_mqq(
    skip=3,
    cut=20,
    stratified=True,
    highlight=["7:127253550_C_T"],
    pinpoint=["2:27734972_G_A"],
    colors=["orange", "blue"],
    cut_line_color="yellow",
    sig_line_color="red",
    highlight_color="purple",
    pinpoint_color="green",
    maf_bin_colors=["#FFE2D1", "#E1F0C4", "#6BAB90", "#55917F"],
    check=False,
    verbose=False,
)
stdout:
2025/12/25 23:43:55 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:43:56 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:43:56 -EAF data type is already numeric. Skipping conversion...

Save plots
Example
stdout:
2025/12/25 23:44:06 Configured plot style for plot_mqq:mqq
2025/12/25 23:44:06 Starting Manhattan-QQ plot creation (Version v4.0.0)
2025/12/25 23:44:06 -Genomic coordinates are based on GRCh37/hg19...
2025/12/25 23:44:06 - Genomic coordinates version: 19 ...
2025/12/25 23:44:06 - Genome-wide significance level to plot is set to 5e-08 ...
2025/12/25 23:44:06 - Input sumstats contains 12557761 variants...
2025/12/25 23:44:06 - Manhattan-QQ plot layout mode selected: mqq
2025/12/25 23:44:06 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:44:08 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:44:08 -EAF data type is already numeric. Skipping conversion...
2025/12/25 23:44:08 Finished loading specified columns from the statistics
2025/12/25 23:44:08 Start data conversion and sanity check:
2025/12/25 23:44:13 -Removed 0 variants with nan in EAF column ...
2025/12/25 23:44:14 -Sumstats P values are being converted to -log10(P)...
2025/12/25 23:44:16 -Converting data above cut line...
2025/12/25 23:44:16 -Maximum -log10(P) value is 167.58838029403677 .
2025/12/25 23:44:16 -Minus log10(P) values above 20 will be shrunk with a shrinkage factor of 10...
2025/12/25 23:44:16 Finished data conversion and sanity check.
2025/12/25 23:44:16 Start to create Manhattan-QQ plot with 91234 variants...
2025/12/25 23:44:16 -Creating background plot...
2025/12/25 23:44:17 Finished creating Manhattan-QQ plot successfully
2025/12/25 23:44:17 Start to extract variants for annotation...
2025/12/25 23:44:17 -Found 89 significant variants with a sliding window size of 500 kb...
2025/12/25 23:44:17 Finished extracting variants for annotation...
2025/12/25 23:44:17 Start to create QQ plot with 91234 variants:
2025/12/25 23:44:17 -Plotting variants stratified by MAF...
2025/12/25 23:44:18 -Lambda GC (MLOG10P mode) at 0.5 is 1.21283
2025/12/25 23:44:18 Finished creating QQ plot successfully!
2025/12/25 23:44:18 Start to save figure...
2025/12/25 23:44:20 -Saved to my_maf_stratified_mqq_plot.png successfully! (png) (overwrite)
2025/12/25 23:44:20 Finished saving figure...
2025/12/25 23:44:20 Finished creating plot successfully

Fig object
plot_mqq returns a matplotlib figure object.
stdout:
2025/12/25 23:44:41 -POS data type is already numeric. Skipping conversion...
2025/12/25 23:44:42 -CHR data type is already numeric. Skipping conversion...
2025/12/25 23:44:42 -EAF data type is already numeric. Skipping conversion...
