VCF format

The VCF format is used as the output from callvar subcommand. The version of format we use in MACS3 is v4.1. Please check the link for detail.

The output from callvar has the following customized fields as defined in the VCF header lines such as:

##fileformat=VCFv4.1
##fileDate=20240514
##source=MACS_V3.0.1
##Program_Args=callvar -b callvar_testing.narrowPeak -t CTCF_PE_ChIP_chr22_50k.bam -c CTCF_PE_CTRL_chr22_50k.bam -o ../temp/test513_run_callvar/PEsample.vcf -Q 20 -D 1 --max-ar 0.95 --top2alleles-mratio 0.8 --top2allele-count 2 -g 0 -G 0  --fermi auto --fermi-overlap 30
##INFO=<ID=M,Number=.,Type=String,Description="MACS Model with minimum BIC value">
##INFO=<ID=MT,Number=.,Type=String,Description="Mutation type: SNV/Insertion/Deletion">
##INFO=<ID=DPT,Number=1,Type=Integer,Description="Depth Treatment: Read depth in ChIP-seq data">
##INFO=<ID=DPC,Number=1,Type=Integer,Description="Depth Control: Read depth in control data">
##INFO=<ID=DP1T,Number=.,Type=String,Description="Read depth of top1 allele in ChIP-seq data">
##INFO=<ID=DP2T,Number=.,Type=String,Description="Read depth of top2 allele in ChIP-seq data">
##INFO=<ID=DP1C,Number=.,Type=String,Description="Read depth of top1 allele in control data">
##INFO=<ID=DP2C,Number=.,Type=String,Description="Read depth of top2 allele in control data">
##INFO=<ID=DBIC,Number=.,Type=Float,Description="Difference of BIC of selected model vs second best alternative model">
##INFO=<ID=BICHOMOMAJOR,Number=1,Type=Integer,Description="BIC of homozygous with major allele model">
##INFO=<ID=BICHOMOMINOR,Number=1,Type=Integer,Description="BIC of homozygous with minor allele model">
##INFO=<ID=BICHETERNOAS,Number=1,Type=Integer,Description="BIC of heterozygous with no allele-specific model">
##INFO=<ID=BICHETERAS,Number=1,Type=Integer,Description="BIC of heterozygous with allele-specific model">
##INFO=<ID=AR,Number=1,Type=Float,Description="Estimated allele ratio of heterozygous with allele-specific model">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth after filtering bad reads">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality score">
##FORMAT=<ID=PL,Number=3,Type=Integer,Description="Normalized, Phred-scaled genotype likelihoods for 00, 01, 11 genotype"

The header lines contain the command line options used to generate this output, date of the file, and definitions of the customized fields in ‘INFO’ and ‘FORMAT’/’SAMPLE’ columns of the VCF. Here is an example of the actual data:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chr22	17255752	.	A	G	58	.	M=heter_unsure;MT=SNV;DPT=7;DPC=0;DP1T=5A;DP2T=2G;DP1C=0A;DP2C=0G;SB=0,0,5,2;DBIC=23.21;BICHOMOMAJOR=37.77;BICHOMOMINOR=84.27;BICHETERNOAS=13.53;BICHETERAS=14.56;AR=0.71	GT:DP:GQ:PL	0/1:7:58:159,0,58
chr22	17392539	.	G	C	138	.	M=heter_noAS;MT=SNV;DPT=13;DPC=0;DP1T=7C;DP2T=6G;DP1C=0C;DP2C=0G;SB=0,1,7,5;DBIC=61.11;BICHOMOMAJOR=84.75;BICHOMOMINOR=101.33;BICHETERNOAS=23.63;BICHETERAS=26.12;AR=0.54	GT:DP:GQ:PL	0/1:13:138:174,0,13