MACS3.Signal.ScoreTrack module
Scoring utilities for MACS3 signal tracks and peak callers.
This code is free software; you can redistribute it and/or modify it under the terms of the BSD License (see the file LICENSE included with the distribution).
- class MACS3.Signal.ScoreTrack.ScoreTrackII(treat_depth, ctrl_depth, pseudocount=1.0)
Bases: object
Container for treatment/control pileups and derived score tracks.
- add(chromosome, endpos, chip, control)
Append treatment/control pileup ending at endpos for chromosome.
- add_chromosome(chrom, chrom_max_len)
Allocate arrays for chrom with capacity chrom_max_len.
- call_broadpeaks(lvl1_cutoff=5.0, lvl2_cutoff=1.0, min_length=200, lvl1_max_gap=50, lvl2_max_gap=400)
Return broad peaks constructed from high- and low-cutoff segments.
- Parameters:
lvl1_cutoff (typedef) – Threshold for core enriched segments.
lvl2_cutoff (typedef) – Threshold for linking segments.
min_length (typedef) – Minimum peak length to report.
lvl1_max_gap (typedef) – Maximum gap when merging level-1 segments.
lvl2_max_gap (typedef) – Maximum allowed length for linking segments.
- call_peaks(cutoff=5.0, min_length=200, max_gap=50, call_summits=False)
Return peaks where scores remain above cutoff.
- Parameters:
cutoff (typedef) – Minimum score threshold (e.g., -log10 p).
min_length (typedef) – Minimum peak length in bases.
max_gap (typedef) – Maximum distance between merged segments.
call_summits (_fake_callable) – Whether to report all local maxima within peaks.
- change_normalization_method(normalization_method)
Change or set the normalization method. Switching back and forth is not recommended, because only two digits are kept and some precision is lost with each change.
- normalization_method:
T: scale to depth of treatment
C: scale to depth of control
M: scale to depth of 1 million
N: not set / raw pileup
- change_score_method(scoring_method)
- scoring_method:
p: -log10 pvalue
q: -log10 qvalue
l: log10 likelihood ratio (minus for depletion)
s: symmetric log10 likelihood ratio (for comparing two ChIPs)
f: log10 fold enrichment
F: linear fold enrichment
d: subtraction
M: maximum
m: fragment pileup per million reads
- compute_SPMR()
Populate scores with treatment pileup per million reads.
- compute_foldenrichment()
Calculate linear scale fold enrichment (with 1 pseudocount).
- compute_likelihood()
Calculate log10 likelihood.
- compute_logFE()
Calculate log10 fold enrichment (with 1 pseudocount).
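As a rough illustration of the two fold-enrichment scores described above, the following minimal sketch adds the pseudocount to both pileups before taking the ratio. The function names are hypothetical and not part of MACS3, and the placement of the pseudocount (added to both numerator and denominator) is an assumption based on the "with 1 pseudocount" wording.

```python
import math


def fold_enrichment_sketch(treatment: float, control: float,
                           pseudocount: float = 1.0) -> float:
    """Linear fold enrichment (hypothetical helper, not the MACS3 API).

    Assumption: the pseudocount is added to both pileup values.
    """
    return (treatment + pseudocount) / (control + pseudocount)


def log_fold_enrichment_sketch(treatment: float, control: float,
                               pseudocount: float = 1.0) -> float:
    """log10 of the linear fold enrichment above."""
    return math.log10(fold_enrichment_sketch(treatment, control, pseudocount))
```

The pseudocount keeps the ratio defined when the control pileup is zero, at the cost of shrinking the enrichment of low-coverage intervals.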
- compute_max()
Populate scores with the element-wise maximum of treatment and control.
- compute_pvalue()
Compute -log_{10}(pvalue)
- compute_qvalue()
Compute -log_{10}(qvalue)
- compute_subtraction()
Populate scores with treatment minus control pileup.
- compute_sym_likelihood()
Calculate symmetric log10 likelihood.
-
ctrl_edm:
typedef
-
cutoff:
typedef
- cutoff_analysis(max_gap=50, min_length=200, steps=100, min_score=0, max_score=1000)
Summarise peak metrics across a range of score thresholds.
- Parameters:
max_gap (typedef) – Maximum distance between merged regions.
min_length (typedef) – Minimum peak length to keep.
steps (typedef) – Number of cutoff increments between the observed minimum and maximum scores.
min_score (typedef) – Lower bound for the cutoff sweep.
max_score (typedef) – Upper bound for the cutoff sweep.
- Returns:
Tab-delimited report of peak counts and lengths per cutoff.
- Return type:
str
-
data:
dict
-
datalength:
dict
- enable_trackline()
Enable UCSC track line output when exporting bedGraphs.
- finalize()
Trim per-chromosome arrays to their populated length.
- get_chr_names()
Return all the chromosome names stored.
- get_data_by_chr(chromosome)
Return (positions, treatment, control, score) arrays for chromosome.
- make_pq_table()
Make pvalue-qvalue table.
Step 1: get all pvalues and the length of the blocks with each pvalue.
Step 2: sort them.
Step 3: apply the AFDR method to adjust each pvalue and get its qvalue.
Return a dictionary of {-log10pvalue:(-log10qvalue,rank,basepairs)} relationships.
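The three steps above can be sketched as follows. This is an illustration only: the function name is hypothetical, the input is assumed to be a dictionary mapping each -log10 pvalue to the number of base pairs carrying it, and the adjustment is assumed to be a Benjamini-Hochberg-style correction (q = p * N / rank) with monotonicity enforced. The source names the method only as "AFDR", so the exact formula used by MACS3 may differ.

```python
import math


def pq_table_sketch(pvalue_stat: dict) -> dict:
    """Map -log10 pvalue -> (-log10 qvalue, rank, basepairs).

    Hypothetical helper, not the MACS3 implementation.
    pvalue_stat maps a -log10 pvalue to the base pairs with that score (Step 1).
    """
    total = sum(pvalue_stat.values())
    log10_total = math.log10(total)
    table = {}
    rank = 1                 # rank of the first base pair with this score
    prev_q = math.inf        # used to keep qvalues monotonic
    # Step 2: visit the most significant scores (largest -log10 p) first.
    for score in sorted(pvalue_stat, reverse=True):
        length = pvalue_stat[score]
        # Step 3 (assumed BH-style): q = p * N / rank, written in -log10 space.
        q = score + math.log10(rank) - log10_total
        q = max(0.0, min(prev_q, q))
        table[score] = (q, rank, length)
        prev_q = q
        rank += length
    return table
```

For example, `pq_table_sketch({5.0: 10, 2.0: 90})` assigns rank 1 to the ten base pairs scoring 5.0 and rank 11 to the remaining ninety.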
-
normalization_method:
typedef
- normalize(treat_scale, control_scale)
Scale treatment and control pileups in-place by the given factors.
-
pseudocount:
typedef
-
pvalue_stat:
dict
-
scoring_method:
typedef
- set_pseudocount(pseudocount)
Update the pseudocount used when computing score metrics.
- total()
Return the number of regions in this object.
- Return type:
typedef
-
trackline:
_fake_callable
-
treat_edm:
typedef
- write_bedGraph(fhd, name, description, column=3)
Write all data to fhd in bedGraph format.
fhd: a file handle to save the bedGraph to.
name/description: the name and description in the track line.
column: can be 1: chip, 2: control, 3: score
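To make the output layout concrete, here is a minimal sketch of writing interval data as bedGraph with an optional UCSC track line. The function name and the `chrom_data` layout (per-chromosome end positions and values) are assumptions for illustration and do not reproduce the MACS3 implementation; only the bedGraph line format itself (chrom, start, end, value, tab-separated) is standard.

```python
import io


def write_bedgraph_sketch(fhd, chrom_data, name, description, trackline=True):
    """Write {chrom: (end_positions, values)} as bedGraph lines.

    Hypothetical helper. Each interval starts where the previous one ended,
    beginning at position 0 on every chromosome.
    """
    if trackline:
        fhd.write(f'track type=bedGraph name="{name}" '
                  f'description="{description}"\n')
    for chrom, (ends, values) in chrom_data.items():
        start = 0
        for end, value in zip(ends, values):
            fhd.write(f"{chrom}\t{start}\t{end}\t{value:.5f}\n")
            start = end


# Usage: write two intervals on chr1 to an in-memory buffer.
buf = io.StringIO()
write_bedgraph_sketch(buf, {"chr1": ([100, 250], [1.5, 0.0])},
                      name="demo", description="demo track")
```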
- class MACS3.Signal.ScoreTrack.TwoConditionScores(t1bdg, c1bdg, t2bdg, c2bdg, cond1_factor=1.0, cond2_factor=1.0, pseudocount=0.01, proportion_background_empirical_distribution=0.99999)
Bases: object
Class for saving two condition comparison scores.
- add(chromosome, endpos, t1, c1, t2, c2)
Take chr-endpos-sample1-control1-sample2-control2 and compute logLR for t1 vs c1, t2 vs c2, and t1 vs t2, then save values.
chromosome: chromosome name (string)
endpos: end position of each interval (integer)
t1: Sample 1 ChIP pileup value of each interval (float)
c1: Sample 1 Control pileup value of each interval (float)
t2: Sample 2 ChIP pileup value of each interval (float)
c2: Sample 2 Control pileup value of each interval (float)
Warning: regions need to be added continuously.
- add_chromosome(chrom, chrom_max_len)
Allocate storage for chrom with capacity chrom_max_len.
- build()
Compute scores from 3 types of comparisons and store them in self.data.
- build_chromosome(chrname, cond1_treat_ps, cond1_control_ps, cond2_treat_ps, cond2_control_ps, cond1_treat_vs, cond1_control_vs, cond2_treat_vs, cond2_control_vs)
Internal function to calculate scores for three types of comparisons.
cond1_treat_ps, cond1_control_ps: positions of treat and control of condition 1
cond2_treat_ps, cond2_control_ps: positions of treat and control of condition 2
cond1_treat_vs, cond1_control_vs: values of treat and control of condition 1
cond2_treat_vs, cond2_control_vs: values of treat and control of condition 2
-
c1bdg:
object
-
c2bdg:
object
- call_peaks(cutoff=3, min_length=200, max_gap=100, call_summits=False)
Find regions within which scores are continuously higher than a given cutoff.
For bdgdiff.
This function does NOT use sliding windows. Instead, any regions in the bedGraph above the cutoff are detected, then merged if the gap between two nearby regions is below max_gap. After this, a peak is reported if its length is above min_length.
cutoff: cutoff of value, default 3. For log10 LR, it means 1000 or -1000.
min_length: minimum peak length, default 200.
max_gap: maximum gap to merge nearby peaks, default 100.
ptrack: an optional track for pileup heights. If it’s not None, use it to find summits. Otherwise, use self/scoreTrack.
- Return type:
tuple
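The threshold, merge, and filter procedure described for call_peaks can be sketched on a simplified input. The function name and the `(start, end, score)` interval layout are hypothetical, and the boundary handling (`>=` for the cutoff and minimum length, `<=` for the gap) is an assumption; the actual method also handles summits and returns a tuple of peak containers, which this sketch omits.

```python
def call_regions_sketch(intervals, cutoff=3.0, min_length=200, max_gap=100):
    """Return merged (start, end) regions scoring at or above cutoff.

    Hypothetical helper. intervals is a position-sorted list of
    (start, end, score) tuples on a single chromosome.
    """
    peaks = []
    current = None  # [start, end] of the region currently being extended
    for start, end, score in intervals:
        if score < cutoff:
            continue  # below threshold: contributes only to the gap
        if current is not None and start - current[1] <= max_gap:
            current[1] = end  # close enough to merge with the open region
        else:
            # Gap too large: close the open region if it is long enough.
            if current is not None and current[1] - current[0] >= min_length:
                peaks.append(tuple(current))
            current = [start, end]
    if current is not None and current[1] - current[0] >= min_length:
        peaks.append(tuple(current))
    return peaks
```

No sliding window is involved: each interval is visited once, so the cost is linear in the number of bedGraph intervals.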
-
cond1_factor:
typedef
-
cond2_factor:
typedef
-
cutoff:
typedef
-
data:
dict
-
datalength:
dict
- finalize()
Adjust array size of each chromosome.
- get_chr_names()
Return all the chromosome names stored.
- get_common_chrs()
Return chromosome names shared across all input bedGraphs.
- Return type:
set
- get_data_by_chr(chromosome)
Return array of counts by chromosome.
The return value is a tuple: ([end pos],[value])
- mean_from_peakcontent(peakcontent)
- Return type:
typedef
-
pseudocount:
typedef
-
pvalue_stat1:
dict
-
pvalue_stat2:
dict
-
pvalue_stat3:
dict
- set_pseudocount(pseudocount)
Update the pseudocount used for differential scoring.
-
t1bdg:
object
-
t2bdg:
object
- total()
Return the number of regions in this object.
- Return type:
typedef
- write_bedGraph(fhd, name, description, column=3)
Write all data to fhd in bedGraph format.
fhd: a file handle to save the bedGraph to.
name/description: the name and description in the track line.
column: can be 1: cond1 chip vs cond1 ctrl, 2: cond2 chip vs cond2 ctrl, 3: cond1 chip vs cond2 chip
- write_matrix(fhd, name, description)
Write all data to fhd into five columns. Format:
col1: chr_start_end
col2: t1 vs c1
col3: t2 vs c2
col4: t1 vs t2
fhd: a file handle to save the matrix to.
- MACS3.Signal.ScoreTrack.bool(*args, **kwargs)
- MACS3.Signal.ScoreTrack.get_logFE(x, y)
Return log10 fold enrichment (base-10) for x over y.
- Return type:
typedef
- MACS3.Signal.ScoreTrack.get_pscore(observed, expectation)
Return cached -log10 Poisson tail probability for observed.
- Return type:
typedef
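For orientation, a cached -log10 Poisson tail probability can be sketched with the standard library alone. The function name is hypothetical, and the tail convention used here, P(X >= observed), is an assumption; the source does not state whether the boundary value is included, so MACS3 may differ by one count.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=100_000)
def pscore_sketch(observed: int, expectation: float) -> float:
    """-log10 P(X >= observed) for X ~ Poisson(expectation).

    Hypothetical helper; the inclusive tail is an assumption.
    """
    if expectation <= 0:
        raise ValueError("expectation must be positive")
    if observed <= 0:
        return 0.0  # P(X >= 0) = 1
    # First tail term P(X = observed), computed in log space for stability.
    log_term = (observed * math.log(expectation) - expectation
                - math.lgamma(observed + 1))
    term = math.exp(log_term)
    total = 0.0
    k = observed
    while True:
        total += term
        k += 1
        term *= expectation / k  # P(X = k) from P(X = k - 1)
        if term <= total * 1e-17:
            break  # remaining terms no longer change the sum
    if total == 0.0:
        return math.inf  # tail underflowed double precision
    return -math.log10(min(total, 1.0))
```

Caching pays off because pileup data repeats the same (observed, expectation) pairs many times across the genome.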
- MACS3.Signal.ScoreTrack.get_subtraction(x, y)
Return the difference x - y.
- Return type:
typedef
- MACS3.Signal.ScoreTrack.int_max(a, b)
Return the larger of a and b.
- Return type:
typedef
- MACS3.Signal.ScoreTrack.int_min(a, b)
Return the smaller of a and b.
- Return type:
typedef
- MACS3.Signal.ScoreTrack.log(*args, **kwargs)
- MACS3.Signal.ScoreTrack.log10(*args, **kwargs)
- MACS3.Signal.ScoreTrack.logLR_asym(x, y)
Return asymmetric log10 likelihood ratio between x and y.
- Return type:
typedef
- MACS3.Signal.ScoreTrack.logLR_sym(x, y)
Return symmetric log10 likelihood ratio between x and y.
- Return type:
typedef
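The two likelihood-ratio scores can be illustrated under an assumed Poisson model, where the log likelihood ratio of observing x with rate x versus rate y is x(ln x - ln y) + y - x. The function names below are hypothetical, and the sign conventions (negative for depletion in the asymmetric case, sign flip under argument swap in the symmetric case) are one plausible reading of the descriptions above rather than a confirmed reproduction of the MACS3 code. Both inputs must be positive, which a pseudocount would guarantee.

```python
import math

LOG10_E = math.log10(math.e)  # converts natural-log values to log10


def _poisson_llr(x: float, y: float) -> float:
    """Natural-log likelihood ratio of rate x vs rate y given count x (>= 0)."""
    return x * (math.log(x) - math.log(y)) + y - x


def log_lr_asym_sketch(x: float, y: float) -> float:
    """Asymmetric log10 LR: always models x, negative when x < y (depletion)."""
    if x == y:
        return 0.0
    magnitude = _poisson_llr(x, y) * LOG10_E
    return magnitude if x > y else -magnitude


def log_lr_sym_sketch(x: float, y: float) -> float:
    """Symmetric log10 LR: models the larger value, so f(x, y) == -f(y, x)."""
    if x == y:
        return 0.0
    if x > y:
        return _poisson_llr(x, y) * LOG10_E
    return -_poisson_llr(y, x) * LOG10_E
```

The symmetric form suits comparisons of two ChIP samples, where neither side is a natural background, because swapping the arguments only flips the sign.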