Output sumstats¶

In [1]:

import gwaslab as gl

Loading data¶

In [2]:

mysumstats = gl.Sumstats("t2d_bbj.txt.gz",
             snpid="SNP",
             chrom="CHR",
             pos="POS",
             ea="ALT",
             nea="REF",
             neaf="Frq",
             beta="BETA",
             se="SE",
             p="P",
             direction="Dir",
             build="19",
             n="N", verbose=False)

# select just 1000 variants for example
mysumstats.random_variants(n=1000, inplace=True, random_state=123,verbose=False)

# basic_check
mysumstats.basic_check(verbose=False)

Check available formats¶

List the formats that GWASLab supports

In [3]:

gl.list_formats()

Sat Feb  3 13:29:25 2024 Available formats: auto,bolt_lmm,cojo,fastgwa,gwascatalog,gwascatalog_hm,gwaslab,ldsc,metal,mrmega,mtag,pgscatalog,pgscatalog_hm,pheweb,plink,plink2,plink2_firth,plink2_linear,plink2_logistic,plink_assoc,plink_bim,plink_dosage,plink_fam,plink_fisher,plink_linear,plink_logistic,plink_psam,plink_pvar,popcorn,regenie,regenie_gene,saige,ssf,template,vcf

Inspect the metadata and the column-name mapping of a given format

In [4]:

gl.check_format("ssf")

Sat Feb  3 13:29:25 2024 Available formats:
Sat Feb  3 13:29:25 2024 meta_data
Sat Feb  3 13:29:25 2024 format_dict
Sat Feb  3 13:29:25 2024 {'format_name': 'ssf', 'format_source': 'https://www.biorxiv.org/content/10.1101/2022.07.15.500230v1.full', 'format_cite_name': 'GWAS-SSF v0.1', 'format_separator': '\t', 'format_na': '#NA', 'format_comment': None, 'format_col_order': ['chromosome', 'base_pair_location', 'effect_allele', 'other_allele', 'beta', 'odds_ratio', 'hazard_ratio', 'standard_error', 'effect_allele_frequency', 'p_value', 'neg_log_10_p_value', 'ci_upper', 'ci_lower', 'rsid', 'variant_id', 'info', 'ref_allele', 'n'], 'format_version': 20230328}
Sat Feb  3 13:29:25 2024 {'variant_id': 'SNPID', 'rsid': 'rsID', 'chromosome': 'CHR', 'base_pair_location': 'POS', 'other_allele': 'NEA', 'effect_allele': 'EA', 'effect_allele_frequency': 'EAF', 'n': 'N', 'beta': 'BETA', 'standard_error': 'SE', 'p_value': 'P', 'neg_log_10_p_value': 'MLOG10P', 'info': 'INFO', 'odds_ratio': 'OR', 'hazard_ratio': 'HR', 'ci_lower': 'OR_95L', 'ci_upper': 'OR_95U'}
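
The two dictionaries printed by check_format are enough to predict which columns a conversion can emit. The following is a minimal sketch in plain Python, not GWASLab's internal code: it assumes the output header is obtained by mapping ssf names to GWASLab names and keeping the ssf order. The `available` list and the `expected_header` helper are hypothetical and chosen to match the columns loaded earlier.

```python
# Values copied from the gl.check_format("ssf") output.
format_col_order = [
    "chromosome", "base_pair_location", "effect_allele", "other_allele",
    "beta", "odds_ratio", "hazard_ratio", "standard_error",
    "effect_allele_frequency", "p_value", "neg_log_10_p_value",
    "ci_upper", "ci_lower", "rsid", "variant_id", "info", "ref_allele", "n",
]
ssf_to_gwaslab = {
    "variant_id": "SNPID", "rsid": "rsID", "chromosome": "CHR",
    "base_pair_location": "POS", "other_allele": "NEA", "effect_allele": "EA",
    "effect_allele_frequency": "EAF", "n": "N", "beta": "BETA",
    "standard_error": "SE", "p_value": "P", "neg_log_10_p_value": "MLOG10P",
    "info": "INFO", "odds_ratio": "OR", "hazard_ratio": "HR",
    "ci_lower": "OR_95L", "ci_upper": "OR_95U",
}

# Assumption: GWASLab columns present in the example sumstats.
available = ["SNPID", "CHR", "POS", "EA", "NEA", "EAF", "BETA", "SE", "P", "N"]


def expected_header(available_columns):
    """Return the ssf column names, in ssf order, that can be filled."""
    present = set(available_columns)
    # ssf columns without a GWASLab counterpart (e.g. ref_allele) map to None
    # and are skipped.
    return [col for col in format_col_order
            if ssf_to_gwaslab.get(col) in present]


print(",".join(expected_header(available)))
```

Columns such as odds_ratio or rsid drop out simply because the example data has no OR or rsID column to fill them.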

Formatting and saving¶

Get ready for submission to the GWAS Catalog (GWAS-SSF format)¶

fmt: specify the output format
ssfmeta: if True, also write the SSF-style metadata file
md5sum: if True, write the MD5 checksum of the output sumstats to a separate file

In [5]:

mysumstats.to_format("./mysumstats", fmt="ssf", ssfmeta=True, md5sum=True)

Sat Feb  3 13:29:25 2024 Start to convert the output sumstats in:  ssf  format
Sat Feb  3 13:29:25 2024  -Formatting statistics ...
Sat Feb  3 13:29:25 2024  -Float statistics formats:
Sat Feb  3 13:29:25 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:25 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:25 2024  -Replacing SNPID separator from ":" to "_"...
Sat Feb  3 13:29:25 2024  -Start outputting sumstats in ssf format...
Sat Feb  3 13:29:25 2024  -ssf format will be loaded...
Sat Feb  3 13:29:25 2024  -ssf format meta info:
Sat Feb  3 13:29:25 2024   - format_name  : ssf
Sat Feb  3 13:29:25 2024   - format_source  : https://www.biorxiv.org/content/10.1101/2022.07.15.500230v1.full
Sat Feb  3 13:29:25 2024   - format_cite_name  : GWAS-SSF v0.1
Sat Feb  3 13:29:25 2024   - format_separator  : \t
Sat Feb  3 13:29:25 2024   - format_na  : #NA
Sat Feb  3 13:29:25 2024   - format_col_order  : chromosome,base_pair_location,effect_allele,other_allele,beta,odds_ratio,hazard_ratio,standard_error,effect_allele_frequency,p_value,neg_log_10_p_value,ci_upper,ci_lower,rsid,variant_id,info,ref_allele,n
Sat Feb  3 13:29:25 2024   - format_version  :  20230328
Sat Feb  3 13:29:25 2024  -gwaslab to ssf format dictionary:
Sat Feb  3 13:29:25 2024   - gwaslab keys: SNPID,rsID,CHR,POS,NEA,EA,EAF,N,BETA,SE,P,MLOG10P,INFO,OR,HR,OR_95L,OR_95U
Sat Feb  3 13:29:25 2024   - ssf values: variant_id,rsid,chromosome,base_pair_location,other_allele,effect_allele,effect_allele_frequency,n,beta,standard_error,p_value,neg_log_10_p_value,info,odds_ratio,hazard_ratio,ci_lower,ci_upper
Sat Feb  3 13:29:25 2024  -Output path: ./mysumstats.ssf.tsv.gz
Sat Feb  3 13:29:25 2024  -Output columns: chromosome,base_pair_location,effect_allele,other_allele,beta,standard_error,effect_allele_frequency,p_value,variant_id,n
Sat Feb  3 13:29:25 2024  -Writing sumstats to: ./mysumstats.ssf.tsv.gz...
Sat Feb  3 13:29:25 2024  -md5sum hashing for the file: ./mysumstats.ssf.tsv.gz
Sat Feb  3 13:29:25 2024  -md5sum path: ./mysumstats.ssf.tsv.gz.md5sum
Sat Feb  3 13:29:25 2024  -md5sum: 2f24217183ce33e1c906a9a7c9d8e2dc
Sat Feb  3 13:29:25 2024  -Exporting SSF-style meta data to ./mysumstats.ssf.tsv-meta.ymal
Sat Feb  3 13:29:25 2024  -Saving log file to: ./mysumstats.ssf.log
Sat Feb  3 13:29:25 2024 Finished outputting successfully!

In [6]:

!zcat mysumstats.ssf.tsv.gz | head

chromosome	base_pair_location	effect_allele	other_allele	beta	standard_error	effect_allele_frequency	p_value	variant_id	n
1	2005486	C	T	-0.0969	0.0471	0.9863	3.9820e-02	1_2005486_C_T	191764
1	2247939	AAGG	A	0.0330	0.1249	0.9966	7.9190e-01	1_2247939_AAGG_A	191764
1	3741853	G	A	-0.0375	0.0142	0.8849	8.2820e-03	1_3741853_G_A	191764
1	5017526	G	A	0.0126	0.0373	0.9822	7.3620e-01	1_5017526_G_A	191764
1	5843475	C	T	-0.0011	0.0433	0.9857	9.8010e-01	1_5843475_C_T	191764
1	9405103	C	T	-0.0729	0.1516	0.0021	6.3050e-01	1_9405103_T_C	191764
1	9443411	G	A	0.0362	0.0532	0.9916	4.9690e-01	1_9443411_G_A	191764
1	12866348	G	C	-0.0352	0.0431	0.9728	4.1450e-01	1_12866348_G_C	191764
1	14466316	G	A	-0.0042	0.0096	0.6942	6.6360e-01	1_14466316_A_G	191764

gzip: stdout: Broken pipe
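
The same peek can be done from Python, which avoids the shell pipe altogether. This is a small sketch using only the standard library; `peek_tsv_gz` is a hypothetical helper name, and the path is the one reported in the conversion log.

```python
import csv
import gzip
import os
from itertools import islice


def peek_tsv_gz(path, n=5):
    """Return the header and the first n data rows of a gzipped TSV file."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        rows = list(islice(reader, n))
    return header, rows


path = "mysumstats.ssf.tsv.gz"  # written by to_format() in the cell above
if os.path.exists(path):
    header, rows = peek_tsv_gz(path)
    print(header)
    for row in rows:
        print(row)
```

All values come back as strings; convert them explicitly if numeric checks are needed.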

In [7]:

!head mysumstats.ssf.tsv.gz.md5sum

2f24217183ce33e1c906a9a7c9d8e2dc
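
Before uploading, the recorded checksum can be compared against a freshly computed one. The sketch below uses only the standard library. The helper names `md5_of_file` and `checksum_matches` are hypothetical, and it assumes the .md5sum file begins with the hex digest, as in the output above.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(data_path, md5_path):
    """Compare a file's MD5 digest with the first token of its .md5sum file."""
    with open(md5_path) as handle:
        recorded = handle.read().split()[0]
    return md5_of_file(data_path) == recorded


data_path = "mysumstats.ssf.tsv.gz"
md5_path = data_path + ".md5sum"
if os.path.exists(data_path) and os.path.exists(md5_path):
    print("checksum ok:", checksum_matches(data_path, md5_path))
```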

In [8]:

!head ./mysumstats.ssf.tsv-meta.ymal

coordinate_system: 1-based
data_file_md5sum: 2f24217183ce33e1c906a9a7c9d8e2dc
data_file_name: ./mysumstats.ssf.tsv.gz
date_last_modified: 2024-02-03-13:29:25
file_type: GWAS-SSF v0.1
genome_assembly: Unknown
genotyping_technology: Unknown
gwas_id: Unknown
gwaslab:
  genome_build: '19'

LDSC default format¶

hapmap3: if True, only output HapMap3 SNPs
exclude_hla: if True, exclude variants in the MHC (HLA) region from the output

In [9]:

mysumstats.to_format("./mysumstats",fmt="ldsc",hapmap3=True,exclude_hla=True)

Sat Feb  3 13:29:26 2024 Start to convert the output sumstats in:  ldsc  format
Sat Feb  3 13:29:26 2024  -Excluding variants in MHC (HLA) region ...
Sat Feb  3 13:29:26 2024  -Exclude 3 variants in MHC (HLA) region : 25Mb - 34Mb.
Sat Feb  3 13:29:26 2024  -Processing 997 raw variants...
Sat Feb  3 13:29:26 2024  -Loading Hapmap3 variants data...
Sat Feb  3 13:29:27 2024  -Since rsID not in sumstats, chr:pos( build 19) will be used for matching...
Sat Feb  3 13:29:28 2024  -Raw input contains 81 hapmaps variants based on chr:pos...
Sat Feb  3 13:29:28 2024  -Extract 81 variants in Hapmap3 datasets for build 19.
Sat Feb  3 13:29:28 2024  -Formatting statistics ...
Sat Feb  3 13:29:28 2024  -Float statistics formats:
Sat Feb  3 13:29:28 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:28 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:28 2024  -Start outputting sumstats in ldsc format...
Sat Feb  3 13:29:28 2024  -ldsc format will be loaded...
Sat Feb  3 13:29:28 2024  -ldsc format meta info:
Sat Feb  3 13:29:28 2024   - format_name  : ldsc
Sat Feb  3 13:29:28 2024   - format_source  : https://github.com/bulik/ldsc/wiki/Summary-Statistics-File-Format
Sat Feb  3 13:29:28 2024   - format_source2  : https://github.com/bulik/ldsc/blob/master/munge_sumstats.py
Sat Feb  3 13:29:28 2024   - format_version  :  20150306
Sat Feb  3 13:29:28 2024  -gwaslab to ldsc format dictionary:
Sat Feb  3 13:29:28 2024   - gwaslab keys: rsID,NEA,EA,EAF,N,BETA,P,Z,INFO,OR,CHR,POS
Sat Feb  3 13:29:28 2024   - ldsc values: SNP,A2,A1,Frq,N,Beta,P,Z,INFO,OR,CHR,POS
Sat Feb  3 13:29:28 2024  -Output path: ./mysumstats.hapmap3.noMHC.ldsc.tsv.gz
Sat Feb  3 13:29:28 2024  -Output columns: N,POS,CHR,A1,A2,SNP,P,Beta,Frq
Sat Feb  3 13:29:28 2024  -Writing sumstats to: ./mysumstats.hapmap3.noMHC.ldsc.tsv.gz...
Sat Feb  3 13:29:28 2024  -Saving log file to: ./mysumstats.hapmap3.noMHC.ldsc.log
Sat Feb  3 13:29:28 2024 Finished outputting successfully!

In [10]:

!zcat ./mysumstats.hapmap3.noMHC.ldsc.tsv.gz | head

N	POS	CHR	A1	A2	SNP	P	Beta	Frq
191764	14900419	1	G	A	rs6703840	1.3750e-01	0.0144	0.3952
191764	19593199	1	C	T	rs7527253	3.2570e-01	-0.0127	0.1323
191764	35282297	1	G	A	rs1407135	6.4190e-01	0.0041	0.5434
191764	66001402	1	C	T	rs1171261	1.7720e-01	-0.0148	0.2103
191764	83510491	1	G	A	rs2022427	6.9800e-01	0.0378	0.0025
191764	166110693	1	C	T	rs4656480	2.5250e-02	0.0286	0.8627
191764	175886511	1	G	A	rs6656281	2.2480e-01	-0.0141	0.1828
191764	181612041	1	C	T	rs199955	5.5050e-01	0.0135	0.9603
191764	196329362	1	C	T	rs11801881	2.5060e-01	0.0300	0.0301

VCF¶

bgzip: if True, compress the output VCF/BED file with bgzip
tabix: if True, index the bgzipped file with tabix

In [11]:

mysumstats.to_format("./mysumstats",fmt="vcf",bgzip=True,tabix=True)

Sat Feb  3 13:29:28 2024 Start to convert the output sumstats in:  vcf  format
Sat Feb  3 13:29:28 2024  -Formatting statistics ...
Sat Feb  3 13:29:28 2024  -Float statistics formats:
Sat Feb  3 13:29:28 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:28 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:28 2024  -Start outputting sumstats in vcf format...
Sat Feb  3 13:29:28 2024  -vcf format will be loaded...
Sat Feb  3 13:29:28 2024  -vcf format meta info:
Sat Feb  3 13:29:28 2024   - format_name  : vcf
Sat Feb  3 13:29:28 2024   - format_source  : https://github.com/MRCIEU/gwas-vcf-specification/tree/1.0.0
Sat Feb  3 13:29:28 2024   - format_version  :  20220923
Sat Feb  3 13:29:28 2024   - format_citation  : Lyon, M.S., Andrews, S.J., Elsworth, B. et al. The variant call format provides efficient and robust storage of GWAS summary statistics. Genome Biol 22, 32 (2021). https://doi.org/10.1186/s13059-020-02248-0
Sat Feb  3 13:29:28 2024   - format_fixed  : #CHROM,POS,ID,REF,ALT,QUAL,FILTER,INFO,FORMAT
Sat Feb  3 13:29:28 2024   - format_format  : ID,SS,ES,SE,LP,SI,EZ
Sat Feb  3 13:29:28 2024  -gwaslab to vcf format dictionary:
Sat Feb  3 13:29:28 2024   - gwaslab keys: rsID,CHR,POS,NEA,EA,N,EAF,BETA,SE,MLOG10P,INFO,Z
Sat Feb  3 13:29:28 2024   - vcf values: ID,#CHROM,POS,REF,ALT,SS,AF,ES,SE,LP,SI,EZ
Sat Feb  3 13:29:28 2024  -Creating VCF file header...
Sat Feb  3 13:29:28 2024   -VCF header contig build:19
Sat Feb  3 13:29:28 2024   -ID:Study_1
Sat Feb  3 13:29:28 2024   -StudyType:Unknown
Sat Feb  3 13:29:28 2024   -TotalVariants:1000
Sat Feb  3 13:29:28 2024   -HarmonisedVariants:0
Sat Feb  3 13:29:28 2024   -VariantsNotHarmonised:1000
Sat Feb  3 13:29:28 2024   -SwitchedAlleles:0
Sat Feb  3 13:29:28 2024  -Writing sumstats to: ./mysumstats.vcf...
Sat Feb  3 13:29:28 2024  -Output columns: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Study_1
Sat Feb  3 13:29:28 2024  -Outputing data...
Sat Feb  3 13:29:29 2024  -bgzip compressing : ./mysumstats.vcf.gz...
Sat Feb  3 13:29:29 2024  -tabix indexing : : ./mysumstats.vcf.gz.tbi...
Sat Feb  3 13:29:29 2024  -Saving log file to: ./mysumstats.vcf.log
Sat Feb  3 13:29:29 2024 Finished outputting successfully!
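
According to the dictionary in the log, effect sizes are stored as ES, sample size as SS, allele frequency as AF, and the P value as LP (mapped from MLOG10P, i.e. -log10(P)). The following rough sketch applies that renaming to a single record. The function names are hypothetical, the record values are taken from the first ssf row shown earlier, and this does not reproduce GWASLab's VCF writer or the exact layout of the sample column.

```python
import math

# gwaslab -> GWAS-VCF keys, copied from the conversion log.
GWASLAB_TO_VCF = {"N": "SS", "EAF": "AF", "BETA": "ES", "SE": "SE",
                  "MLOG10P": "LP", "INFO": "SI", "Z": "EZ"}


def p_to_mlog10p(p):
    """Convert a P value in (0, 1] to -log10(P)."""
    if not 0 < p <= 1:
        raise ValueError(f"p-value out of range: {p}")
    return -math.log10(p)


def to_gwas_vcf_fields(record):
    """Rename GWASLab statistics to GWAS-VCF keys, deriving LP from P."""
    stats = dict(record)
    if "MLOG10P" not in stats and "P" in stats:
        stats["MLOG10P"] = p_to_mlog10p(stats.pop("P"))
    return {GWASLAB_TO_VCF[key]: value
            for key, value in stats.items() if key in GWASLAB_TO_VCF}


record = {"BETA": -0.0969, "SE": 0.0471, "EAF": 0.9863,
          "P": 3.9820e-02, "N": 191764}
print(to_gwas_vcf_fields(record))
```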

For annotation¶

Convert to BED format¶

In [12]:

mysumstats.to_format("./mysumstats",fmt="bed")

Sat Feb  3 13:29:29 2024 Start to convert the output sumstats in:  bed  format
Sat Feb  3 13:29:29 2024  -Formatting statistics ...
Sat Feb  3 13:29:29 2024  -Float statistics formats:
Sat Feb  3 13:29:29 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:29 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:29 2024  -Start outputting sumstats in bed format...
Sat Feb  3 13:29:29 2024  -Number of SNPs : 920
Sat Feb  3 13:29:29 2024  -Number of Insertions : 52
Sat Feb  3 13:29:29 2024  -Number of Deletions : 28
Sat Feb  3 13:29:29 2024  -formatting to 0-based bed-like file...
Sat Feb  3 13:29:29 2024  -format description: https://genome.ucsc.edu/FAQ/FAQformat.html#format1
Sat Feb  3 13:29:29 2024  -Adjusting positions in format-specific manner..
Sat Feb  3 13:29:29 2024  -Output columns: CHR,START,END,NEA/EA,STRAND,SNPID
Sat Feb  3 13:29:29 2024  -Writing sumstats to: ./mysumstats.bed...
Sat Feb  3 13:29:29 2024  -Saving log file to: ./mysumstats.bed.log
Sat Feb  3 13:29:29 2024 Finished outputting successfully!

In [13]:

!cat mysumstats.bed | head

1	2005485	2005486	T/C	+	1:2005486_C_T
1	2247939	2247939	-/AGG	+	1:2247939_AAGG_A
1	3741852	3741853	A/G	+	1:3741853_G_A
1	5017525	5017526	A/G	+	1:5017526_G_A
1	5843474	5843475	T/C	+	1:5843475_C_T
1	9405102	9405103	T/C	+	1:9405103_T_C
1	9443410	9443411	A/G	+	1:9443411_G_A
1	12866347	12866348	C/G	+	1:12866348_G_C
1	14466315	14466316	A/G	+	1:14466316_A_G
1	14900418	14900419	A/G	+	1:14900419_A_G
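
The position shift in the rows above follows the usual BED convention of a 0-based start and an exclusive end. Here is a minimal sketch that reproduces the SNP rows and the single insertion row shown; it is inferred from this output rather than taken from GWASLab's code, `to_bed_interval` is a hypothetical name, and deletions and complex indels are deliberately left out because no such row appears here.

```python
def to_bed_interval(pos, nea, ea):
    """Map a 1-based position and alleles to (start, end, "NEA/EA")."""
    if len(nea) == 1 and len(ea) == 1:
        # SNP: 0-based start, end equal to the 1-based position.
        return pos - 1, pos, f"{nea}/{ea}"
    if len(ea) > len(nea) and ea.startswith(nea):
        # Insertion after the anchor base: zero-length interval at pos,
        # with the anchor base stripped from the inserted sequence.
        return pos, pos, f"-/{ea[len(nea):]}"
    raise NotImplementedError("only SNPs and simple insertions are sketched")


print(to_bed_interval(2005486, "T", "C"))
print(to_bed_interval(2247939, "A", "AAGG"))
```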

Convert to VEP default format¶

In [14]:

mysumstats.to_format("./mysumstats",fmt="vep")

Sat Feb  3 13:29:29 2024 Start to convert the output sumstats in:  vep  format
Sat Feb  3 13:29:29 2024  -Formatting statistics ...
Sat Feb  3 13:29:29 2024  -Float statistics formats:
Sat Feb  3 13:29:29 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:29 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:29 2024  -Start outputting sumstats in vep format...
Sat Feb  3 13:29:29 2024  -Number of SNPs : 920
Sat Feb  3 13:29:29 2024  -Number of Insertions : 52
Sat Feb  3 13:29:29 2024  -Number of Deletions : 28
Sat Feb  3 13:29:29 2024  -formatting to 1-based bed-like file (for vep)...
Sat Feb  3 13:29:29 2024  -format description: http://asia.ensembl.org/info/docs/tools/vep/vep_formats.html
Sat Feb  3 13:29:29 2024  -Adjusting positions in format-specific manner..
Sat Feb  3 13:29:29 2024  -Output columns: CHR,START,END,NEA/EA,STRAND,SNPID
Sat Feb  3 13:29:29 2024  -Writing sumstats to: ./mysumstats.vep...
Sat Feb  3 13:29:29 2024  -Saving log file to: ./mysumstats.vep.log
Sat Feb  3 13:29:29 2024 Finished outputting successfully!

In [15]:

!cat mysumstats.vep | head

1	2005486	2005486	T/C	+	1:2005486_C_T
1	2247940	2247939	-/AGG	+	1:2247939_AAGG_A
1	3741853	3741853	A/G	+	1:3741853_G_A
1	5017526	5017526	A/G	+	1:5017526_G_A
1	5843475	5843475	T/C	+	1:5843475_C_T
1	9405103	9405103	T/C	+	1:9405103_T_C
1	9443411	9443411	A/G	+	1:9443411_G_A
1	12866348	12866348	C/G	+	1:12866348_G_C
1	14466316	14466316	A/G	+	1:14466316_A_G
1	14900419	14900419	A/G	+	1:14900419_A_G
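
Note the second row: in VEP's default input format an insertion is written with start = end + 1, while SNPs keep start = end at the 1-based position. The sketch below formats one line in that style. It is inferred from the rows above, not from GWASLab's source, and `to_vep_line` is a hypothetical helper that covers only SNPs and simple insertions.

```python
def to_vep_line(chrom, pos, nea, ea, snpid, strand="+"):
    """Format one tab-separated line in VEP default input style."""
    if len(nea) == 1 and len(ea) == 1:
        # SNP: start and end are both the 1-based position.
        start, end, alleles = pos, pos, f"{nea}/{ea}"
    elif len(ea) > len(nea) and ea.startswith(nea):
        # Insertion: start is one greater than end.
        start, end, alleles = pos + 1, pos, f"-/{ea[len(nea):]}"
    else:
        raise NotImplementedError("only SNPs and simple insertions are sketched")
    return "\t".join(str(x) for x in (chrom, start, end, alleles, strand, snpid))


print(to_vep_line(1, 2005486, "T", "C", "1:2005486_C_T"))
print(to_vep_line(1, 2247939, "A", "AAGG", "1:2247939_AAGG_A"))
```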

Convert to ANNOVAR default input format¶

In [16]:

mysumstats.to_format("./mysumstats",fmt="annovar")

Sat Feb  3 13:29:29 2024 Start to convert the output sumstats in:  annovar  format
Sat Feb  3 13:29:29 2024  -Formatting statistics ...
Sat Feb  3 13:29:29 2024  -Float statistics formats:
Sat Feb  3 13:29:29 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:29 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:29 2024  -Start outputting sumstats in annovar format...
Sat Feb  3 13:29:29 2024  -Number of SNPs : 920
Sat Feb  3 13:29:29 2024  -Number of Insertions : 52
Sat Feb  3 13:29:29 2024  -Number of Deletions : 28
Sat Feb  3 13:29:29 2024  -formatting to 1-based bed-like file...
Sat Feb  3 13:29:29 2024  -format description: https://annovar.openbioinformatics.org/en/latest/user-guide/input/
Sat Feb  3 13:29:29 2024  -Adjusting positions in format-specific manner..
Sat Feb  3 13:29:29 2024  -Output columns: CHR,START,END,NEA_out,EA_out,SNPID
Sat Feb  3 13:29:29 2024  -Writing sumstats to: ./mysumstats.annovar...
Sat Feb  3 13:29:29 2024  -Saving log file to: ./mysumstats.annovar.log
Sat Feb  3 13:29:29 2024 Finished outputting successfully!

In [17]:

!cat mysumstats.annovar | head

1	2005486	2005486	T	C	1:2005486_C_T
1	2247940	2247940	-	AGG	1:2247939_AAGG_A
1	3741853	3741853	A	G	1:3741853_G_A
1	5017526	5017526	A	G	1:5017526_G_A
1	5843475	5843475	T	C	1:5843475_C_T
1	9405103	9405103	T	C	1:9405103_T_C
1	9443411	9443411	A	G	1:9443411_G_A
1	12866348	12866348	C	G	1:12866348_G_C
1	14466316	14466316	A	G	1:14466316_A_G
1	14900419	14900419	A	G	1:14900419_A_G

Filter and then output¶

In [18]:

mysumstats.filter_value("EAF >0.05 and EAF < 0.95").to_format("./mysumstats_maf005", fmt="ssf", ssfmeta=True, md5sum=True)

Sat Feb  3 13:29:29 2024 Start filtering values by condition: EAF >0.05 and EAF < 0.95
Sat Feb  3 13:29:29 2024  -Removing 483 variants not meeting the conditions: EAF >0.05 and EAF < 0.95
Sat Feb  3 13:29:29 2024 Finished filtering values.
Sat Feb  3 13:29:29 2024 Start to convert the output sumstats in:  ssf  format
Sat Feb  3 13:29:29 2024  -Formatting statistics ...
Sat Feb  3 13:29:29 2024  -Float statistics formats:
Sat Feb  3 13:29:29 2024   - Columns       : ['EAF', 'BETA', 'SE', 'P']
Sat Feb  3 13:29:29 2024   - Output formats: ['{:.4g}', '{:.4f}', '{:.4f}', '{:.4e}']
Sat Feb  3 13:29:29 2024  -Replacing SNPID separator from ":" to "_"...
Sat Feb  3 13:29:29 2024  -Start outputting sumstats in ssf format...
Sat Feb  3 13:29:29 2024  -ssf format will be loaded...
Sat Feb  3 13:29:29 2024  -ssf format meta info:
Sat Feb  3 13:29:29 2024   - format_name  : ssf
Sat Feb  3 13:29:29 2024   - format_source  : https://www.biorxiv.org/content/10.1101/2022.07.15.500230v1.full
Sat Feb  3 13:29:29 2024   - format_cite_name  : GWAS-SSF v0.1
Sat Feb  3 13:29:29 2024   - format_separator  : \t
Sat Feb  3 13:29:29 2024   - format_na  : #NA
Sat Feb  3 13:29:29 2024   - format_col_order  : chromosome,base_pair_location,effect_allele,other_allele,beta,odds_ratio,hazard_ratio,standard_error,effect_allele_frequency,p_value,neg_log_10_p_value,ci_upper,ci_lower,rsid,variant_id,info,ref_allele,n
Sat Feb  3 13:29:29 2024   - format_version  :  20230328
Sat Feb  3 13:29:29 2024  -gwaslab to ssf format dictionary:
Sat Feb  3 13:29:29 2024   - gwaslab keys: SNPID,rsID,CHR,POS,NEA,EA,EAF,N,BETA,SE,P,MLOG10P,INFO,OR,HR,OR_95L,OR_95U
Sat Feb  3 13:29:29 2024   - ssf values: variant_id,rsid,chromosome,base_pair_location,other_allele,effect_allele,effect_allele_frequency,n,beta,standard_error,p_value,neg_log_10_p_value,info,odds_ratio,hazard_ratio,ci_lower,ci_upper
Sat Feb  3 13:29:29 2024  -Output path: ./mysumstats_maf005.ssf.tsv.gz
Sat Feb  3 13:29:29 2024  -Output columns: chromosome,base_pair_location,effect_allele,other_allele,beta,standard_error,effect_allele_frequency,p_value,variant_id,n
Sat Feb  3 13:29:29 2024  -Writing sumstats to: ./mysumstats_maf005.ssf.tsv.gz...
Sat Feb  3 13:29:29 2024  -md5sum hashing for the file: ./mysumstats_maf005.ssf.tsv.gz
Sat Feb  3 13:29:29 2024  -md5sum path: ./mysumstats_maf005.ssf.tsv.gz.md5sum
Sat Feb  3 13:29:29 2024  -md5sum: 0e16770079f2372da46ebc1bb32c188d
Sat Feb  3 13:29:29 2024  -Exporting SSF-style meta data to ./mysumstats_maf005.ssf.tsv-meta.ymal
Sat Feb  3 13:29:29 2024  -Saving log file to: ./mysumstats_maf005.ssf.log
Sat Feb  3 13:29:29 2024 Finished outputting successfully!
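
The condition passed to filter_value keeps variants whose effect allele frequency lies strictly between 0.05 and 0.95, which corresponds to a minor allele frequency above 0.05. In the log, 483 of the 1,000 sampled variants are removed, leaving 517. A tiny plain-Python sketch of the same condition, using EAF values from the ssf rows shown earlier (`passes_eaf_filter` is a hypothetical name, not a GWASLab function):

```python
def passes_eaf_filter(eaf, lower=0.05, upper=0.95):
    """Return True if lower < EAF < upper (both bounds exclusive)."""
    return lower < eaf < upper


# EAF values taken from the first rows of the ssf output.
eafs = [0.9863, 0.9966, 0.8849, 0.0021, 0.6942]
kept = [eaf for eaf in eafs if passes_eaf_filter(eaf)]
print(kept)
```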