MACS3.IO.PeakIO module
Module for PeakIO IO classes.
This code is free software; you can redistribute it and/or modify it under the terms of the BSD License (see the file LICENSE included with the distribution).
- class MACS3.IO.PeakIO.BroadPeakContent(start, end, score, thickStart, thickEnd, blockNum, blockSizes, blockStarts, pileup, pscore, fold_change, qscore, name=b'MACS3')
Bases:
object
Container for broad peak metadata used in broadPeak format.
-
blockNum:
typedef
-
blockSizes:
bytes
-
blockStarts:
bytes
-
end:
typedef
-
fc:
typedef
-
length:
typedef
-
name:
bytes
-
pileup:
typedef
-
pscore:
typedef
-
qscore:
typedef
-
score:
typedef
-
start:
typedef
-
thickEnd:
bytes
-
thickStart:
bytes
- class MACS3.IO.PeakIO.BroadPeakIO
Bases:
object
IO for broad peak information.
- add(chromosome, start, end, score=0.0, thickStart=b'.', thickEnd=b'.', blockNum=0, blockSizes=b'.', blockStarts=b'.', pileup=0, pscore=0, fold_change=0, qscore=0, name=b'NA')
Append a BroadPeakContent record.
- Parameters:
chromosome (bytes) – Chromosome name for the region.
start (typedef) – 0-based inclusive start coordinate.
end (typedef) – 0-based exclusive end coordinate.
score (typedef) – Average score across blocks.
thickStart (bytes) – Start of the high-enrichment segment, or b'.'.
thickEnd (bytes) – End of the high-enrichment segment, or b'.'.
blockNum (typedef) – Number of sub-blocks composing the region.
blockSizes (bytes) – Comma-separated block sizes as bytes.
blockStarts (bytes) – Comma-separated block starts as bytes.
pileup (typedef) – Median pileup within the region.
pscore (typedef) – Median -log10(pvalue).
fold_change (typedef) – Median fold-change value.
qscore (typedef) – Median -log10(qvalue).
name (bytes) – Optional region identifier.
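The blockNum, blockSizes, and blockStarts arguments above are passed as a count plus comma-separated bytes. The following is a minimal standalone sketch of how such values could be built from a list of sub-blocks. The helper name `encode_blocks` is hypothetical, and the choice to express block starts as offsets from the region start follows the BED12 convention; whether `add` expects exactly this is an assumption.

```python
def encode_blocks(region_start, blocks):
    """Build (blockNum, blockSizes, blockStarts) from (start, end) sub-blocks.

    Hypothetical helper, not part of MACS3. Block starts are written as
    offsets from region_start, as in BED12 (an assumption for this sketch).
    """
    sizes = [end - start for start, end in blocks]
    starts = [start - region_start for start, _ in blocks]
    block_sizes = ",".join(str(s) for s in sizes).encode("ascii")
    block_starts = ",".join(str(s) for s in starts).encode("ascii")
    return len(blocks), block_sizes, block_starts


# Two sub-blocks inside a region that starts at position 1000.
num, sizes, starts = encode_blocks(1000, [(1000, 1100), (1300, 1450)])
```

The three returned values line up with the blockNum, blockSizes, and blockStarts parameters listed above.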
- filter_fc(fc_low, fc_up=-1)
Filter broad peaks by fold-change range.
- Parameters:
fc_low (float) – Inclusive lower bound on fold change.
fc_up (float) – Exclusive upper bound; ignored when negative.
- filter_pscore(pscore_cut)
Retain broad peaks with -log10(pvalue) ≥ pscore_cut.
- filter_qscore(qscore_cut)
Retain broad peaks with -log10(qvalue) ≥ qscore_cut.
- peaks = None
- total()
Return the total number of broad peaks currently stored.
- write_to_Bed12(fhd, name_prefix=b'peak_', name=b'peak', description=b'%s', score_column='score', trackline=True)
Write broad peaks in BED12 format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to construct peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
description (bytes) – Track description for the optional header.
score_column (str) – Peak attribute mapped to the score column.
trackline (bool) – Whether to emit a UCSC track header.
- write_to_broadPeak(fhd, name_prefix=b'peak_', name=b'peak', description=b'%s', score_column='score', trackline=True)
Write broad peaks in the ENCODE broadPeak (BED6+3) format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to construct peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
description (bytes) – Track description for the optional header.
score_column (str) – Peak attribute mapped to the score column.
trackline (bool) – Whether to emit a UCSC track header.
- write_to_gappedPeak(fhd, name_prefix=b'peak_', name=b'peak', description=b'%s', score_column='score', trackline=True)
Write broad peaks in gappedPeak (BED12+3) format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to construct peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
description (bytes) – Track description for the optional header.
score_column (str) – Peak attribute mapped to the score column.
trackline (bool) – Whether to emit a UCSC track header.
- write_to_xls(ofhd, name_prefix=b'%s_peak_', name=b'MACS')
Export broad peaks to a tab-delimited .xls text file.
- Parameters:
ofhd – Writable file-like object.
name_prefix (bytes) – Template used to build peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
- class MACS3.IO.PeakIO.PeakContent(chrom, start, end, summit, peak_score, pileup, pscore, fold_change, qscore, name=b'')
Bases:
object
Represent a narrow peak and its derived statistics.
-
chrom:
bytes
-
end:
typedef
-
fc:
typedef
-
length:
typedef
-
name:
bytes
-
pileup:
typedef
-
pscore:
typedef
-
qscore:
typedef
-
score:
typedef
-
start:
typedef
-
summit:
typedef
- class MACS3.IO.PeakIO.PeakIO(name=b'MACS3')
Bases:
object
Manage in-memory collections of narrow peak intervals.
- CO_sorted = None
- add(chromosome, start, end, summit=0, peak_score=0, pileup=0, pscore=0, fold_change=0, qscore=0, name=b'')
Add a peak described by raw coordinates and scores.
- Parameters:
chromosome (bytes) – Chromosome name for the peak.
start (typedef) – 0-based inclusive start coordinate.
end (typedef) – 0-based exclusive end coordinate.
summit (typedef) – 0-based summit position.
peak_score (typedef) – Reported peak score.
pileup (typedef) – Tag pileup at the summit.
pscore (typedef) – -log10(pvalue) score.
fold_change (typedef) – Fold enrichment relative to control.
qscore (typedef) – -log10(qvalue) score.
name (bytes) – Optional peak identifier.
- add_PeakContent(chromosome, peakcontent)
Extend the collection with an existing PeakContent.
- Parameters:
chromosome (bytes) – Chromosome name under which to store the peak.
peakcontent (PeakContent) – Peak record to append.
- exclude(peaksio2)
Remove peaks that overlap any entry in peaksio2.
- Parameters:
peaksio2 (object) – Another PeakIO instance providing exclusion regions.
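As an illustration of the overlap test implied by exclude, here is a minimal standalone sketch over plain dictionaries of half-open (start, end) intervals. The function name `exclude_overlapping` is hypothetical, it returns a new dictionary instead of modifying a PeakIO in place, and it uses a simple quadratic scan; none of this is taken from the MACS3 implementation.

```python
def exclude_overlapping(peaks, blocked):
    """Drop intervals in `peaks` that overlap any interval in `blocked`.

    Both arguments map a chromosome name to a list of half-open
    (start, end) tuples. Hypothetical helper, not MACS3 code.
    """
    result = {}
    for chrom, intervals in peaks.items():
        mask = blocked.get(chrom, [])
        result[chrom] = [
            (s, e)
            for s, e in intervals
            # Two half-open intervals overlap when each starts before the other ends.
            if not any(s < be and bs < e for bs, be in mask)
        ]
    return result


peaks = {b"chr1": [(0, 100), (200, 300), (400, 500)]}
blocked = {b"chr1": [(90, 210)]}
remaining = exclude_overlapping(peaks, blocked)
```

With half-open coordinates, intervals that merely touch (one ends where the other starts) are not treated as overlapping in this sketch.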
- filter_fc(fc_low, fc_up=0)
Filter peaks by fold-change range.
- Parameters:
fc_low (typedef) – Inclusive lower bound on fold change.
fc_up (typedef) – Exclusive upper bound; ignored if <= 0.
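The bound semantics described for filter_fc (inclusive lower bound, exclusive upper bound, upper bound ignored if <= 0) can be sketched on plain data as follows. The function `filter_by_fc` and the dictionary layout are assumptions made for illustration only; the sketch returns a new dictionary, and whether the library method filters in place is not stated here.

```python
def filter_by_fc(peaks, fc_low, fc_up=0):
    """Keep peaks whose fold change lies in [fc_low, fc_up).

    `peaks` maps a chromosome name to a list of dicts with an "fc" key.
    The upper bound is ignored when fc_up <= 0. Hypothetical helper.
    """
    kept = {}
    for chrom, plist in peaks.items():
        if fc_up > 0:
            kept[chrom] = [p for p in plist if fc_low <= p["fc"] < fc_up]
        else:
            kept[chrom] = [p for p in plist if p["fc"] >= fc_low]
    return kept


peaks = {b"chr1": [{"fc": 1.5}, {"fc": 3.0}, {"fc": 8.0}]}
lower_only = filter_by_fc(peaks, 3.0)       # keeps fc 3.0 and 8.0
bounded = filter_by_fc(peaks, 3.0, 8.0)     # keeps fc 3.0 only
```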
- filter_pscore(pscore_cut)
Filter peaks by minimum -log10(pvalue).
- Parameters:
pscore_cut (typedef) – Lower bound (inclusive) for -log10(pvalue).
- filter_qscore(qscore_cut)
Filter peaks by minimum -log10(qvalue).
- Parameters:
qscore_cut (typedef) – Lower bound (inclusive) for -log10(qvalue).
- filter_score(lower_score, upper_score=0)
Filter peaks by their primary score range.
- Parameters:
lower_score (typedef) – Inclusive lower bound on score.
upper_score (typedef) – Exclusive upper bound; if <= 0 the bound is ignored.
- get_chr_names()
Return the chromosome names represented in the collection.
- Returns:
Unique chromosome names.
- Return type:
set
- get_data_from_chrom(chrom)
Return peaks for chrom, initialising storage if needed.
- Parameters:
chrom (bytes) – Chromosome name to query.
- Returns:
Peaks associated with chrom.
- Return type:
list
- name = None
- peaks = None
- randomly_pick(n, seed=12345)
Return a new PeakIO containing n randomly sampled peaks.
- Parameters:
n (typedef) – Number of peaks to sample.
seed (typedef) – RNG seed to ensure reproducibility.
- Returns:
Fresh instance populated with sampled peaks.
- Return type:
PeakIO
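To show why a fixed seed makes the sampling in randomly_pick reproducible, here is a minimal standalone sketch using Python's `random` module on a plain dictionary of peaks. The function `sample_peaks` is hypothetical, and the RNG and sampling strategy that MACS3 actually uses are not documented above, so this should be read only as an illustration of seeded sampling.

```python
import random


def sample_peaks(peaks, n, seed=12345):
    """Return a new dict holding n peaks drawn without replacement.

    `peaks` maps a chromosome name to a list of peak records.
    Hypothetical helper; the library's own sampling may differ.
    """
    # Flatten in a deterministic order so the seed fully fixes the result.
    flat = [(chrom, p) for chrom in sorted(peaks) for p in peaks[chrom]]
    rng = random.Random(seed)          # private RNG, global state untouched
    picked = rng.sample(flat, n)       # raises ValueError if n > len(flat)
    out = {}
    for chrom, p in picked:
        out.setdefault(chrom, []).append(p)
    return out


peaks = {b"chr1": [(0, 10), (20, 30), (40, 50)], b"chr2": [(5, 15), (25, 35)]}
first = sample_peaks(peaks, 3)
second = sample_peaks(peaks, 3)
```

Calling the function twice with the same seed gives the same selection, which is the property the seed parameter is meant to provide.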
- read_from_xls(ofhd)
Load peak records from a MACS3 .xls tab-delimited report.
- Parameters:
ofhd – Readable file-like object positioned at the beginning of the report.
- sort()
Sort peaks on each chromosome by ascending start position.
- to_summits_bed()
Write peak summits in BED5 format to stdout.
Each summit is emitted as a one-base interval with the selected score column.
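A one-base summit interval in 0-based half-open BED coordinates runs from summit to summit + 1. The sketch below formats such a BED5 line; the helper name `summit_bed_line` and the five-decimal score formatting are assumptions for illustration and are not taken from MACS3.

```python
def summit_bed_line(chrom, summit, name, score):
    """Format one BED5 line covering only the summit base.

    chrom and name are bytes, as in the signatures above.
    Hypothetical helper; the score precision is an assumption.
    """
    return (
        f"{chrom.decode()}\t{summit}\t{summit + 1}\t"
        f"{name.decode()}\t{score:.5f}\n"
    )


line = summit_bed_line(b"chr2", 1234, b"MACS_peak_7", 42.0)
```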
- tobed()
Write peaks in BED5 format to stdout.
The five columns correspond to chromosome, start, end, name, and the attribute selected by score_column.
- total = None
- write_to_bed(fhd, name_prefix=b'%s_peak_', name=b'MACS', description=b'%s', score_column='score', trackline=True)
Write peaks to a file handle in BED5 format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to build peak names.
name (bytes) – Dataset label interpolated into name_prefix.
description (bytes) – Track description for optional header line.
score_column (str) – Peak attribute to emit as the score field.
trackline (bool) – Whether to emit a UCSC track header line.
- write_to_narrowPeak(fhd, name_prefix=b'%s_peak_', name=b'MACS', score_column='score', trackline=False)
Write peaks in the ENCODE narrowPeak (BED6+4) format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to construct peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
score_column (str) – Peak attribute mapped to the narrowPeak score field.
trackline (bool) – Whether to emit a UCSC track header.
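For orientation, a narrowPeak (BED6+4) record has ten tab-separated columns: chrom, start, end, name, score, strand, signalValue, pValue, qValue, and the summit offset relative to start. The sketch below formats one such line from values like those stored on a peak. The helper `narrowpeak_line`, the use of "." for strand, and the numeric formatting are assumptions for illustration; how write_to_narrowPeak derives the score column is not shown here.

```python
def narrowpeak_line(chrom, start, end, name, score, signal, pscore, qscore, summit):
    """Format one narrowPeak record (ten tab-separated columns).

    `summit` is an absolute 0-based position; the last column stores it
    as an offset from `start`. Hypothetical helper, not MACS3 code.
    """
    fields = [
        chrom.decode(), str(start), str(end), name.decode(),
        str(score), ".",                       # strand is unknown for peaks
        f"{signal:.5g}", f"{pscore:.5g}", f"{qscore:.5g}",
        str(summit - start),
    ]
    return "\t".join(fields) + "\n"


line = narrowpeak_line(b"chr1", 100, 250, b"MACS_peak_1", 87, 5.2, 10.4, 8.7, 180)
```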
- write_to_summit_bed(fhd, name_prefix=b'%s_peak_', name=b'MACS', description=b'%s', score_column='score', trackline=False)
Write peak summits to a file handle in BED5 format.
- Parameters:
fhd – Writable file-like object.
name_prefix (bytes) – Template used to build summit names.
name (bytes) – Dataset label interpolated into name_prefix.
description (bytes) – Track description for optional header line.
score_column (str) – Peak attribute to emit as the score field.
trackline (bool) – Whether to emit a UCSC track header line.
- write_to_xls(ofhd, name_prefix=b'%s_peak_', name=b'MACS')
Export narrow peaks to a tab-delimited .xls text file.
- Parameters:
ofhd – Writable file-like object.
name_prefix (bytes) – Template used to build peak identifiers.
name (bytes) – Dataset label interpolated into name_prefix.
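The name_prefix and name parameters above are both bytes, with name interpolated into the template. A minimal sketch of that interpolation using bytes %-formatting follows. The helper `peak_names`, the fallback for templates without a %s placeholder, and the 1-based numbering are all assumptions made for illustration.

```python
def peak_names(name_prefix, name, count):
    """Build peak identifiers from a bytes template and a dataset label.

    Hypothetical helper. If the template has no %s placeholder, bytes
    formatting raises TypeError and the template is used unchanged
    (an assumption for this sketch).
    """
    try:
        prefix = name_prefix % name
    except TypeError:
        prefix = name_prefix
    # Number the peaks from 1 (also an assumption).
    return [prefix + str(i).encode("ascii") for i in range(1, count + 1)]


names = peak_names(b"%s_peak_", b"MACS", 2)
plain = peak_names(b"peak_", b"peak", 1)
```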
- class MACS3.IO.PeakIO.RegionIO
Bases:
object
Helper for storing and manipulating simple genomic regions.
- add_loc(chrom, start, end)
Append a new (start, end) interval for chrom.
- get_chr_names()
Return chromosome names present in the region set.
- Return type:
set
- merge_overlap()
Merge overlapping intervals within each chromosome.
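The usual way to merge overlapping intervals is to sort them by start and fold each one into the previous interval when they overlap. The sketch below shows this for a single chromosome's list of half-open (start, end) tuples. The function `merge_intervals` is hypothetical, and the choice to leave touching intervals unmerged is an assumption; the behaviour of merge_overlap in that edge case is not documented above.

```python
def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals.

    Hypothetical helper, not MACS3 code. Intervals that only touch
    (end == next start) are kept separate in this sketch.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the last merged interval: extend its end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


result = merge_intervals([(5, 10), (1, 4), (3, 6), (10, 12)])
```

Sorting first means each interval only has to be compared with the most recently merged one, so the pass after sorting is linear.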
-
regions:
dict
- sort()
Sort regions for each chromosome by their start coordinate.
- write_to_bed(fhd)
Emit regions in BED format to the provided file-like object.
- MACS3.IO.PeakIO.subpeak_letters(i)
Return the alphabetical label for a zero-based subpeak index.
- Parameters:
i (typedef) – Zero-based subpeak index.
- Returns:
Alphabetical label sequence (a, b, …, aa).
- Return type:
str
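The label sequence a, b, …, aa described above matches spreadsheet-style column lettering. The following standalone sketch produces that sequence. The function `subpeak_label` is hypothetical and written only from the description here; the library's subpeak_letters may yield different labels for indices of 26 and above.

```python
def subpeak_label(i):
    """Map a zero-based index to a, b, ..., z, aa, ab, ...

    Hypothetical re-implementation based on the documented sequence,
    not the MACS3 function itself.
    """
    label = ""
    n = i + 1                      # switch to 1-based for bijective base 26
    while n > 0:
        n, rem = divmod(n - 1, 26)
        label = chr(ord("a") + rem) + label
    return label


labels = [subpeak_label(i) for i in (0, 1, 25, 26, 27)]
```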