predictd
Overview
The predictd command is part of the MACS3 suite of tools and is used
to predict the expected DNA fragment size from alignment files. It
uses cross-correlation to find the shift that best aligns the cutting
ends on the plus and minus strands.
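The idea of searching for the best shift can be illustrated with a minimal sketch. This is not MACS3's actual implementation (which, per the options below, first selects regions using a bandwidth and a fold-enrichment range); the function name `best_shift`, the `max_shift` parameter, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def best_shift(plus_ends, minus_ends, max_shift=500):
    """Return the shift (in bp) that maximizes the overlap between
    plus-strand cutting ends moved downstream by `shift` and
    minus-strand cutting ends. Illustrative only."""
    minus_counts = Counter(minus_ends)
    best, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Cross-correlation at this lag: for every plus-strand end,
        # count the minus-strand ends found exactly `shift` bp downstream.
        score = sum(minus_counts[pos + shift] for pos in plus_ends)
        if score > best_score:
            best, best_score = shift, score
    return best


# Toy data: every minus-strand end sits 200 bp downstream of a plus-strand end.
plus = [100, 105, 110, 400, 410]
minus = [p + 200 for p in plus]
print(best_shift(plus, minus))  # prints 200
```

On real data the two strands would be noisy, so the score curve is usually inspected as a whole rather than reduced to a single maximum.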
Detailed Description
The predictd command takes one or more alignment files and predicts d,
the fragment size, from the alignment results. For paired-end data,
it reports the average insertion/fragment size across all pairs. Note
that predictd does not filter duplicate reads or scale for sequencing
depth, so you may need to pre- or post-process the data, for example
with the filterdup or randsample command.
If the alignment file is a single-end file, a model file (named by
--rfile) will be saved, which can be used to visualize the model in
PDF. The command line output reports the predicted d size in the line
containing "predicted fragment length is" and alternative d sizes in
the line containing "alternative fragment length(s) may be".
If the alignment file is a paired-end file (-f BAMPE or -f BEDPE),
the model file won’t be generated. Instead, you can find the average
fragment size in the command line output, in the line containing
"Average insertion length of all pairs is".
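If the runtime messages have been captured as text, the lines quoted above can be picked out with a short script. The sketch below is only an illustration: the function name `parse_predictd_log` is hypothetical, the sample lines are made up, and everything after the quoted phrases (for example a trailing unit) is an assumption about the output format rather than something this page specifies.

```python
import re

# Phrases quoted in the documentation; the text following them is assumed.
NUMBER = r"(\d+(?:\.\d+)?)"
PREDICTED = re.compile(r"predicted fragment length is " + NUMBER)
ALTERNATIVE = re.compile(r"alternative fragment length\(s\) may be (.+)")
PAIRED_END = re.compile(r"Average insertion length of all pairs is " + NUMBER)


def parse_predictd_log(text):
    """Collect the fragment-size lines from captured predictd messages."""
    found = {}
    for line in text.splitlines():
        m = PREDICTED.search(line)
        if m:
            found["predicted"] = float(m.group(1))
        m = ALTERNATIVE.search(line)
        if m:
            # Kept as raw text because the exact list format is not assumed.
            found["alternative"] = m.group(1).strip()
        m = PAIRED_END.search(line)
        if m:
            found["paired_end"] = float(m.group(1))
    return found


# Made-up sample lines, for illustration only.
sample = (
    "# predicted fragment length is 254 bps\n"
    "# alternative fragment length(s) may be 254 bps\n"
)
print(parse_predictd_log(sample))
```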
Command Line Options
Here is a brief overview of the predictd options:
-i or --ifile: ChIP-seq alignment file. If multiple files are given as ‘-i A B C’, then they will all be read and combined. REQUIRED.
-f or --format: Format of the tag file, one of “AUTO”, “BED”, “ELAND”, “ELANDMULTI”, “ELANDEXPORT”, “SAM”, “BAM”, “BOWTIE”, “BAMPE”, and “BEDPE”. With AUTO, MACS3 will pick the format itself. However, if you want to decide the average insertion size/fragment size from PE data such as BEDPE or BAMPE, please specify the format as BAMPE or BEDPE, since MACS3 won’t automatically recognize these two formats with -f AUTO. Please be aware that in PE mode, -g, -s, --bw, --d-min, -m, and --rfile have NO effect. DEFAULT: “AUTO”
-g or --gsize: Please check callpeak for detail. DEFAULT: hs
-s or --tsize: Tag size. This will override the auto-detected tag size. DEFAULT: Not set
--bw: Bandwidth for picking regions to compute the fragment size. This value is only used while building the shifting model. DEFAULT: 300
--d-min: Minimum fragment size in base pairs. Any predicted fragment size less than this will be excluded. DEFAULT: 20
-m or --mfold: Select the regions within MFOLD range of high-confidence enrichment ratio against background to build the model. Fold-enrichment in regions must be lower than the upper limit and higher than the lower limit. Use as “-m 10 30”. DEFAULT: 5 50
--outdir: If specified, all output files will be written to that directory. DEFAULT: the current working directory
--rfile: PREFIX of the filename of the R script for drawing the X-correlation figure. DEFAULT: ‘predictd_model.R’ and the R file will be predicted_model.R
--buffer-size: Buffer size for incrementally increasing the internal array size to store read alignment information. In most cases, you don’t have to change this parameter. However, if there is a large number of chromosomes/contigs/scaffolds in your alignment, it’s recommended to specify a smaller buffer size in order to decrease memory usage (but it will take longer to read alignment files). Minimum memory requested for reading an alignment file is about # of CHROMOSOME * BUFFER_SIZE * 8 Bytes. DEFAULT: 100000
--verbose: Set the verbose level of runtime messages. 0: only show critical messages, 1: show additional warning messages, 2: show process information, 3: show debug messages. DEFAULT: 2
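The memory estimate given for --buffer-size is simple arithmetic, and a small worked example may help when choosing a value. The helper name `min_read_memory_bytes` and the chromosome counts below are hypothetical; only the formula (number of chromosomes times buffer size times 8 bytes) and the default of 100000 come from the option description.

```python
def min_read_memory_bytes(n_chromosomes, buffer_size=100000):
    """Approximate minimum memory (bytes) requested while reading an
    alignment file: # of chromosomes * buffer size * 8 bytes."""
    return n_chromosomes * buffer_size * 8


# A genome with 25 chromosomes at the default buffer size: 20 MB.
print(min_read_memory_bytes(25) / 1e6, "MB")

# A hypothetical assembly with 50,000 scaffolds at the default: 40 GB.
print(min_read_memory_bytes(50_000) / 1e9, "GB")

# The same assembly with a buffer size of 1000: 0.4 GB.
print(min_read_memory_bytes(50_000, buffer_size=1000) / 1e9, "GB")
```

This is why a smaller buffer size is suggested for assemblies with many contigs or scaffolds: the requested memory grows linearly with both factors.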
Example Usage
Here is an example of how to use the predictd command:
macs3 predictd -i alignment.bam --rfile model.R
Then you can use R to make a figure for the model:
Rscript model.R