Brisbane plot

Brisbane plot: GWAS signal density plot

GWASLab can create the Brisbane plot (GWAS signal density plot). The Brisbane plot is a scatter plot that shows, for each variant, the signal density (the number of variants within a specified window around that variant). It is useful for presenting the independent signals obtained from large-scale GWAS of complex traits. The signals are usually determined by other statistical methods, such as conditional analysis.

Key features:

X-axis: Genomic position (same as Manhattan plot)
Y-axis: Signal density (number of neighboring variants within the window)
Purpose: Visualize the density of independent signals across the genome

Independent Signals

To create a meaningful Brisbane plot, load sumstats containing only independent signals (e.g., from conditional analysis). If you load the entire dataset, the plot will simply reflect the marker density of your sumstats. To identify independent signals, use other tools such as GCTA-COJO; GWASLab only calculates the density of all variants in the gl.Sumstats object.

.plot_mqq(mode="b")

mysumstats.plot_mqq(mode="b")

Creates a Brisbane plot showing signal density across the genome.

Options

| Option | DataType | Description | Default |
|---|---|---|---|
| `mode` | `str` | Plotting mode. Use `"b"` for Brisbane plot only | `"b"` |
| `bwindowsizekb` | `int` | Window size in kb (flanking region length on one side). Total window = 2 × bwindowsizekb | `100` |
| `density_color` | `bool` | If True, color variants by density value. When True, `density_palette` overrides `colors` | `False` |
| `density_palette` | `str` | Color palette for density coloring (e.g., 'Reds', 'Blues', 'viridis') | `'Reds'` |
| `density_range` | `tuple` | Color range for density plot. If None, auto-selected as (min(DENSITY), max(DENSITY)) | `None` |
| `density_threshold` | `int` | Threshold for density coloring. Variants above the threshold use `density_palette`; variants below use `density_tpalette` | `5` |
| `density_trange` | `tuple` | Color range for variants below the threshold | `(0,10)` |
| `density_tpalette` | `str` | Color palette for variants below the threshold | `'Blues'` |
| `anno` | `bool` or `str` | Annotation style. Options: `None`, `True` (chr:pos), `'GENENAME'`, or column name | `None` |
| `anno_set` | `list` | List of variant IDs to annotate | `None` |
| `windowsizekb` | `int` | Window size in kb for determining lead variants for annotation | `500` |

Example

Basic Brisbane plot

mysumstats.plot_mqq(mode="b")

Brisbane plot with custom window size

mysumstats.plot_mqq(mode="b", bwindowsizekb=250)

Brisbane plot with density coloring

mysumstats.plot_mqq(
    mode="b",
    density_color=True,
    density_palette="Reds"
)

Manhattan-Brisbane layout

mysumstats.plot_mqq(mode="mb")

Brisbane plot with annotations

mysumstats.plot_mqq(
    mode="b",
    anno="GENENAME",
    windowsizekb=500
)


Calculate signal density

You can use .get_density() to calculate the density without creating a plot. This adds a DENSITY column to your sumstats.

mysumstats.get_density(windowsizekb=100)

Parameters

| Option | DataType | Description | Default |
|---|---|---|---|
| `windowsizekb` | `int` | Window size in kb for calculation of signal density. Total window = 2 × windowsizekb (flanking on each side) | `100` |
| `sig_list` | `DataFrame`, optional | If provided, density is calculated based on significant variants in this list (for conditional analysis) | `None` |

How it works

The function calculates density by counting the number of variants within a window around each variant:

- For each variant at position P, it counts variants in the range [P - windowsizekb, P + windowsizekb]
- The density value represents how many neighboring variants are within this window
- Higher density values indicate regions with more clustered signals
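The counting rule above can be sketched in plain pandas and NumPy. This is a minimal illustration under stated assumptions, not GWASLab's internal implementation: the helper name `signal_density` is hypothetical, a variant is assumed not to count itself (consistent with the DENSITY values of 0 in the example output in this section), and the window bounds are treated as inclusive.

```python
import numpy as np
import pandas as pd


def signal_density(df, windowsizekb=100):
    """Hypothetical helper: count the other variants within +/- windowsizekb of each variant."""
    window_bp = windowsizekb * 1000  # convert kb to base pairs
    density = pd.Series(0, index=df.index, dtype="int64")
    for _, group in df.groupby("CHR"):
        pos = group["POS"].to_numpy()
        sorted_pos = np.sort(pos)
        # Indices bounding the variants inside [P - window, P + window]
        left = np.searchsorted(sorted_pos, pos - window_bp, side="left")
        right = np.searchsorted(sorted_pos, pos + window_bp, side="right")
        # Subtract 1 so that a variant does not count itself (assumption)
        density.loc[group.index] = right - left - 1
    return density


# Toy data: the first five chromosome 1 variants from the example output
toy = pd.DataFrame({
    "SNPID": ["rs2710888", "rs3934834", "rs182532", "rs17160669", "rs9660106"],
    "CHR": [1, 1, 1, 1, 1],
    "POS": [959842, 1005806, 1287040, 1305561, 1797947],
})
toy["DENSITY"] = signal_density(toy, windowsizekb=100)
print(toy["DENSITY"].tolist())  # [1, 1, 1, 1, 0]
```

Sorting positions once per chromosome and using a binary search keeps the cost near O(n log n), which matters little for a few thousand independent signals but avoids a quadratic pairwise comparison.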

Output

The function adds a DENSITY column to mysumstats.data and prints summary statistics:

- Mean density
- Median density
- Standard deviation
- Maximum density and the variant with maximum density

Calculate signal density

mysumstats.get_density(windowsizekb=100)

# Output:
# -Calculating DENSITY with windowsize of 100 kb
# -Mean : 2.345 signals per 100 kb
# -SD : 1.234
# -Median : 2.0 signals per 100 kb
# -Max : 15 signals per 100 kb at variant rs11555194

mysumstats.data
    SNPID   CHR POS P   STATUS  DENSITY
0   rs2710888   1   959842  2.190000e-57    9999999 1
1   rs3934834   1   1005806 2.440000e-29    9999999 1
2   rs182532    1   1287040 1.250000e-18    9999999 1
3   rs17160669  1   1305561 1.480000e-28    9999999 1
4   rs9660106   1   1797947 1.860000e-12    9999999 0
... ... ... ... ... ... ...
12106   rs9628283   22  50540766    5.130000e-15    9999999 1
12107   rs28642259  22  50785718    1.140000e-13    9999999 1
12108   rs11555194  22  50876662    2.000000e-15    9999999 2
12109   rs762669    22  50943423    3.000000e-30    9999999 1
12110   rs9628185   22  51109992    5.430000e-12    9999999 0

Calculate density from significant variants list

# For conditional analysis: calculate density based on pre-identified significant variants
significant_variants = mysumstats.data[mysumstats.data["P"] < 5e-8]
mysumstats.get_density(windowsizekb=100, sig_list=significant_variants)

Extract top variants by density

You can use .get_top() to extract the variants with the highest density values within sliding windows. This is useful for identifying the densest signal regions in your Brisbane plot.

mysumstats.get_top(by="DENSITY", windowsizekb=500)

Parameters

| Option | DataType | Description | Default |
|---|---|---|---|
| `by` | `str` | Column name whose values are maximized to choose top variants | `"DENSITY"` |
| `threshold` | `float`, optional | If provided, only variants with `by` >= `threshold` are considered. If None, the median of the per-chromosome maximum values is used | `None` |
| `windowsizekb` | `int` | Sliding window size in kb used to determine locus boundaries | `500` |
| `bwindowsizekb` | `int` | Window size in kb for calculating density (only used if the DENSITY column doesn't exist and `by="DENSITY"`) | `100` |
| `anno` | `bool` | If True, annotate output with nearest gene names | `False` |
| `gls` | `bool` | If True, return a new Sumstats object instead of a DataFrame | `False` |
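The default for `threshold` (the median of the per-chromosome maxima) can be illustrated with a short pandas sketch on made-up values. The helper name `default_threshold` and the toy data are hypothetical, and GWASLab's own implementation may differ in detail.

```python
import pandas as pd


def default_threshold(df, by="DENSITY"):
    """Hypothetical helper: median of the maximum `by` value on each chromosome."""
    per_chromosome_max = df.groupby("CHR")[by].max()
    return float(per_chromosome_max.median())


# Made-up values: the per-chromosome maxima are 4 (chr1), 6 (chr2), and 3 (chr3)
toy_density = pd.DataFrame({
    "CHR": [1, 1, 2, 2, 3],
    "DENSITY": [1, 4, 2, 6, 3],
})
auto_threshold = default_threshold(toy_density)
print(auto_threshold)  # 4.0
```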

How it works

The function identifies top variants by:

1. Calculating density (if by="DENSITY" and DENSITY column doesn't exist)
2. Filtering by threshold (if provided, or using median of chromosome maxima)
3. Sliding window approach: Within each window on a chromosome, selects the variant with the highest value of the specified metric
4. Non-overlapping windows: Ensures selected variants are separated by at least windowsizekb

This is similar to .get_lead() but does not rely on P-values; it can use any metric column (DENSITY, MLOG10P, BETA, etc.).
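Steps 3 and 4 can be sketched as a single greedy pass over position-sorted variants. This is only an illustration of the idea under stated assumptions, not GWASLab's actual code: the function name `top_per_window` is hypothetical, the threshold is passed explicitly, and a new locus is opened whenever a variant lies more than `windowsizekb` beyond the current top variant.

```python
import pandas as pd


def top_per_window(df, by, threshold, windowsizekb=500):
    """Hypothetical helper: keep one top variant per locus, scanning variants in position order."""
    window_bp = windowsizekb * 1000  # convert kb to base pairs
    candidates = df[df[by] >= threshold].sort_values(["CHR", "POS"])
    selected = []
    best = None  # (index label, chromosome, position, value) of the current locus top
    for idx, chrom, pos, value in zip(
        candidates.index, candidates["CHR"], candidates["POS"], candidates[by]
    ):
        if best is None:
            best = (idx, chrom, pos, value)
        elif chrom != best[1] or pos > best[2] + window_bp:
            # New chromosome or outside the window: close the locus, start a new one
            selected.append(best[0])
            best = (idx, chrom, pos, value)
        elif value > best[3]:
            # Same locus and a higher value: this variant becomes the new top
            best = (idx, chrom, pos, value)
    if best is not None:
        selected.append(best[0])
    return df.loc[selected]


# Made-up variants on two chromosomes
toy_variants = pd.DataFrame({
    "SNPID": ["v1", "v2", "v3", "v4", "v5", "v6"],
    "CHR": [1, 1, 1, 1, 2, 2],
    "POS": [100_000, 300_000, 1_200_000, 1_400_000, 500_000, 2_000_000],
    "DENSITY": [3, 7, 5, 4, 6, 2],
})
top = top_per_window(toy_variants, by="DENSITY", threshold=3, windowsizekb=500)
print(top["SNPID"].tolist())  # ['v2', 'v3', 'v5']
```

Because a new locus only starts beyond `windowsizekb` from the current top variant, any two selected variants on the same chromosome end up more than one window apart.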

Return value

If gls=False: Returns a pandas DataFrame containing the selected top variants
If gls=True: Returns a new Sumstats object containing the selected top variants

Extract top density variants

# Extract variants with highest density in 500kb windows
top_density_variants = mysumstats.get_top(by="DENSITY", windowsizekb=500)

# Output:
# -Sliding window size for extracting top: 500 kb
# -Using DENSITY threshold: 3.5
# -Identified 25 top variants!

Extract top density variants with custom threshold

# Only consider variants with density >= 5
top_variants = mysumstats.get_top(
    by="DENSITY",
    threshold=5,
    windowsizekb=500
)

Extract top variants by other metrics

# Extract top variants by MLOG10P instead of DENSITY
top_p_variants = mysumstats.get_top(by="MLOG10P", windowsizekb=500)

# Extract top variants by BETA
top_beta_variants = mysumstats.get_top(by="BETA", windowsizekb=500)

Return as Sumstats object

# Get top density variants as a new Sumstats object
top_sumstats = mysumstats.get_top(by="DENSITY", gls=True)
top_sumstats.plot_mqq(mode="b")

Extract top variants with gene annotation

# Extract top density variants and annotate with gene names
top_variants = mysumstats.get_top(
    by="DENSITY",
    windowsizekb=500,
    anno=True
)

Reference

Citation for Brisbane plot

Yengo, L., Vedantam, S., Marouli, E., Sidorenko, J., Bartell, E., Sakaue, S., ... & Hirschhorn, J. N. (2022). A saturated map of common genetic variants associated with human height. Nature, 610(7933), 704-712.