MACS3.Signal.FixWidthTrack module
Module for FWTrack classes.
This code is free software; you can redistribute it and/or modify it under the terms of the BSD License (see the file LICENSE included with the distribution).
- class MACS3.Signal.FixWidthTrack.FWTrack(fw=0, anno='', buffer_size=100000)
Bases: object
Fixed-width fragment track grouped by chromosome.
Stores plus- and minus-strand 5’ cut positions in numpy arrays and exposes utilities for sorting, filtering, sampling, and pileup generation.
- add_loc(chromosome, fiveendpos, strand)
Append a 5’ cut position to the track.
- Parameters:
chromosome (bytes) – Chromosome name (as bytes) that owns the cut.
fiveendpos (int) – Zero-based 5’ coordinate of the cut site.
strand (int) – Strand flag where 0 denotes plus and 1 denotes minus.
Notes
Positions are stored in strand-specific numpy arrays keyed by chromosome, and the strand pointer is advanced as new positions are appended.
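The storage layout described in the notes can be pictured with a small pure-Python sketch. The class name `TinyTrack` and the use of plain lists in place of numpy buffers are assumptions made for illustration; this is not the MACS3 implementation.

```python
# Hypothetical stand-in for the layout described above; not MACS3 code.
class TinyTrack:
    def __init__(self):
        # chromosome name (bytes) -> [plus-strand positions, minus-strand positions]
        self.locations = {}

    def add_loc(self, chromosome, fiveendpos, strand):
        if strand not in (0, 1):
            raise ValueError("strand must be 0 (plus) or 1 (minus)")
        per_strand = self.locations.setdefault(chromosome, [[], []])
        # Appending plays the role of writing at the strand pointer and advancing it.
        per_strand[strand].append(fiveendpos)


track = TinyTrack()
track.add_loc(b"chr1", 100, 0)
track.add_loc(b"chr1", 250, 1)
track.add_loc(b"chr1", 40, 0)
```

Positions stay in insertion order here, which is why a separate sorting step is needed before any windowed lookup.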
- annotation = None
- buf_size: dict
- buffer_size = None
- compute_region_tags_from_peaks(peaks, func, window_size=100, cutoff=5.0)
Apply a summary function to tags collected around peak regions.
- Parameters:
peaks (MACS3.IO.PeakIO.PeakIO) – Peak container providing genomic intervals and metadata.
func (callable) – Callback invoked as func(chrom, plus, minus, startpos, endpos, ...) for each peak. The callable must accept window_size and cutoff keyword arguments.
window_size (int, optional) – Half-window size added on each side of every peak when collecting tags.
cutoff (float, optional) – Additional threshold passed to func.
- Returns:
Results returned by func for each processed peak.
- Return type:
list
Notes
Both the track and the peaks object are sorted before iteration, and per-chromosome state is reused to avoid rescanning arrays.
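A callback compatible with the calling convention above might look like the following sketch. The name `count_tags`, the shape of its return value, and the way it interprets `cutoff` are all hypothetical; the only thing taken from the documentation is the argument list.

```python
def count_tags(chrom, plus, minus, startpos, endpos, window_size=100, cutoff=5.0):
    """Hypothetical callback: report whether a peak region holds at least `cutoff` tags."""
    # `window_size` is accepted because the caller passes it; this sketch does not use it.
    n_tags = len(plus) + len(minus)
    return (chrom, startpos, endpos, n_tags, n_tags >= cutoff)


# Direct call with made-up data, mimicking a single per-peak invocation.
result = count_tags(b"chr1", [105, 130, 180], [150, 210, 260], 100, 300)
```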
- destroy()
Release numpy buffers held by the track.
All per-chromosome arrays are resized to zero so their memory is returned to the allocator, and the track is marked as destroyed.
- extract_region_tags(chromosome, startpos, endpos)
Collect positions within a genomic window for both strands.
- Parameters:
chromosome (bytes) – Chromosome identifier to query.
startpos (int) – Inclusive start coordinate of the window.
endpos (int) – Inclusive end coordinate of the window.
- Returns:
Pair of numpy arrays (plus, minus) containing positions inside the requested window.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
Notes
The track is sorted on demand before performing the windowed lookup.
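Because the positions are sorted, a windowed lookup of this kind reduces to two binary searches. The sketch below uses Python lists and the standard `bisect` module; the helper name `tags_in_window` is hypothetical and the code only illustrates the idea, not how MACS3 performs the lookup.

```python
from bisect import bisect_left, bisect_right


def tags_in_window(sorted_positions, startpos, endpos):
    """Return positions p with startpos <= p <= endpos from an ascending list."""
    lo = bisect_left(sorted_positions, startpos)   # first index with p >= startpos
    hi = bisect_right(sorted_positions, endpos)    # first index with p > endpos
    return sorted_positions[lo:hi]


plus = [10, 50, 50, 120, 300]
minus = [5, 60, 200]
window = (tags_in_window(plus, 50, 200), tags_in_window(minus, 50, 200))
```

Both bounds are treated as inclusive, matching the parameter descriptions above.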
- filter_dup(maxnum=-1)
Limit duplicate 5’ positions to a maximum count per strand.
- Parameters:
maxnum (int, optional) – Maximum number of occurrences allowed per coordinate. A negative value disables duplicate filtering.
- Returns:
Total number of retained positions across both strands after filtering.
- Return type:
int
Notes
The track must be sorted before filtering. Occurrences of a coordinate beyond the first maxnum are discarded, pointers are updated, and total and length are recomputed.
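On a sorted array, duplicates sit next to each other, so capping them takes a single pass with a run counter. The following is a minimal sketch on a Python list; `limit_duplicates` is a hypothetical helper, not the MACS3 routine.

```python
def limit_duplicates(sorted_positions, maxnum=-1):
    """Keep at most `maxnum` copies of each coordinate; a negative value keeps everything."""
    if maxnum < 0:
        return list(sorted_positions)
    kept = []
    run = 0            # how many times the current coordinate has been seen
    previous = None
    for pos in sorted_positions:
        run = run + 1 if pos == previous else 1
        previous = pos
        if run <= maxnum:
            kept.append(pos)
    return kept


filtered = limit_duplicates([1, 1, 1, 2, 3, 3], maxnum=1)
```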
- finalize()
Shrink arrays and sort per-strand coordinates in place.
Each chromosome’s plus- and minus-strand arrays are resized to the observed counts, sorted ascending, and aggregate counters such as total and length are refreshed. Call this after loading data.
- fw = None
- get_chr_names()
Return a sorted set of chromosome names stored in the track.
- Returns:
Sorted chromosome names (bytes) that currently have positions.
- Return type:
set
- get_locations_by_chr(chromosome)
Return the strand-specific arrays for a chromosome.
- Parameters:
chromosome (bytes) – Chromosome name, provided as bytes.
- Returns:
Pair of numpy arrays (plus, minus) containing 5’ positions.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
- Raises:
Exception – If the chromosome is not present in the track.
- get_rlengths()
Return the reference chromosome lengths associated with the track.
- Returns:
Mapping from chromosome name (bytes) to reference length. Chromosomes without a recorded length default to INT_MAX.
- Return type:
dict
- is_destroyed: _fake_callable
- is_sorted: _fake_callable
- length = None
- locations: dict
- pileup_a_chromosome(chrom, d, scale_factor=1.0, baseline_value=0.0, directional=True, end_shift=0)
Compute a coverage pileup for a single chromosome.
- Parameters:
chrom (bytes) – Chromosome name to pile up.
d (int) – Extension length applied in the 3’ direction unless directional is False.
scale_factor (float, optional) – Value used to scale the resulting coverage.
baseline_value (float, optional) – Minimum value enforced on the coverage array.
directional (bool, optional) – If False, extend cuts symmetrically to both sides by d / 2.
end_shift (int, optional) – Shift applied to the 5’ cuts before extension; positive values move toward the 3’ direction.
- Returns:
Two-element list [positions, values] with numpy arrays describing the pileup breakpoints and scaled coverage.
- Return type:
list
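To make the parameters concrete, here is a pure-Python sketch of a pileup built with a sweep over fragment start and end events. It rests on several assumptions that are not confirmed by this page: `values[i]` is taken to describe the run ending at `positions[i]`, minus-strand cuts are extended toward lower coordinates, fragments are clipped at coordinate 0, and adjacent runs with equal values are not merged. The function name `pileup_sketch` is hypothetical.

```python
from collections import Counter


def pileup_sketch(plus, minus, d, scale_factor=1.0, baseline_value=0.0,
                  directional=True, end_shift=0):
    """Return [positions, values]; values[i] covers the run ending at positions[i]."""
    if directional:
        # Extend d toward 3': rightward on plus, leftward on minus.
        frags = [(p + end_shift, p + end_shift + d) for p in plus]
        frags += [(p - end_shift - d, p - end_shift) for p in minus]
    else:
        half = d // 2
        frags = [(p + end_shift - half, p + end_shift + half) for p in plus]
        frags += [(p - end_shift - half, p - end_shift + half) for p in minus]

    delta = Counter()                      # coordinate -> change in depth
    for start, end in frags:
        start, end = max(start, 0), max(end, 0)
        if end > start:
            delta[start] += 1
            delta[end] -= 1

    positions, values = [], []
    depth, prev = 0, 0
    for x in sorted(delta):
        if x > prev:                       # close the run [prev, x)
            positions.append(x)
            values.append(max(depth * scale_factor, baseline_value))
        depth += delta[x]
        prev = x
    return [positions, values]


coverage = pileup_sketch(plus=[10], minus=[40], d=20)
```

With one plus-strand cut at 10 and one minus-strand cut at 40, the two 20 bp fragments overlap on the interval from 20 to 30, which is where the depth reaches 2.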
- pileup_a_chromosome_c(chrom, ds, scale_factor_s, baseline_value=0.0, directional=True, end_shift=0)
Compute a control pileup using multiple extension lengths.
- Parameters:
chrom (bytes) – Chromosome name to pile up.
ds (list[int]) – Extension lengths used to build individual pileups.
scale_factor_s (list[float]) – Scale factors paired with each entry in ds.
baseline_value (float, optional) – Minimum value enforced on the coverage array.
directional (bool, optional) – If False, extend cuts symmetrically to both sides by d / 2 for each d in ds.
end_shift (int, optional) – Shift applied to the 5’ cuts before extension; positive values move toward the 3’ direction.
- Returns:
Two-element list [positions, values] representing the merged pileup where the maximum value is taken across the supplied extensions.
- Return type:
list
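Taking the maximum across several pileups amounts to merging step functions over the union of their breakpoints. The sketch below assumes each pileup is a `[positions, values]` pair in which `values[i]` describes the run ending at `positions[i]`, and that the inputs are already scaled; `max_over_pileups` is a hypothetical helper rather than the routine MACS3 uses.

```python
def max_over_pileups(pileups):
    """Merge [positions, values] step functions by taking the pointwise maximum."""
    breakpoints = sorted({p for positions, _ in pileups for p in positions})
    cursors = [0] * len(pileups)
    merged_values = []
    for bp in breakpoints:
        candidates = []
        for i, (positions, values) in enumerate(pileups):
            # Advance to the run of pileup i that covers the interval ending at bp.
            while cursors[i] < len(positions) and positions[cursors[i]] < bp:
                cursors[i] += 1
            if cursors[i] < len(positions):
                candidates.append(values[cursors[i]])
        # At least one pileup owns bp, so candidates is never empty.
        merged_values.append(max(candidates))
    return [breakpoints, merged_values]


short = [[10, 30], [0.0, 1.0]]
wide = [[20, 40], [0.0, 2.0]]
merged = max_over_pileups([short, wide])
```

A pileup that has already ended contributes nothing to later intervals, so the merged track simply follows whichever inputs are still defined there.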
- pointer: dict
- print_to_bed(fhd=None)
Stream the track as BED records.
- Parameters:
fhd (io.IOBase, optional) – Writable file-like object. Defaults to sys.stdout.
Notes
Emits one record per stored position with fixed-width intervals derived from fw and strand-specific orientation.
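The mapping from a 5’ position to a fixed-width BED interval can be sketched as follows. The six-column layout, the placeholder name and score fields, and the helper name `write_bed` are assumptions made for illustration; only the use of `fw` and the strand-dependent orientation come from the description above.

```python
import io


def write_bed(locations, fw, fhd):
    """Write one BED line per position; `locations` maps chrom -> (plus, minus)."""
    for chrom in sorted(locations):
        plus, minus = locations[chrom]
        name = chrom.decode()
        for p in plus:
            # Plus strand: the interval starts at the cut and runs fw bases right.
            fhd.write(f"{name}\t{p}\t{p + fw}\t.\t.\t+\n")
        for p in minus:
            # Minus strand: the interval ends at the cut and starts fw bases left.
            fhd.write(f"{name}\t{p - fw}\t{p}\t.\t.\t-\n")


buffer = io.StringIO()
write_bed({b"chr1": ([100], [400])}, fw=50, fhd=buffer)
bed_text = buffer.getvalue()
```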
- rlengths: dict
- sample_num(samplesize, seed=-1)
Down-sample positions in place so the total approximates samplesize.
- Parameters:
samplesize (int) – Target number of positions across both strands.
seed (int, optional) – Seed forwarded to sample_percent().
Notes
The method converts samplesize into a sampling fraction using the current total and reuses sample_percent().
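The conversion from a target count to a fraction is a single division. In the sketch below, the helper name `sampling_fraction`, the error on an empty track, and the clamp to 1.0 are all assumptions of the example, not documented behavior.

```python
def sampling_fraction(samplesize, total):
    """Fraction of positions to keep so that roughly `samplesize` remain."""
    if total <= 0:
        raise ValueError("track holds no positions")
    # Clamping to 1.0 is an assumption of this sketch.
    return min(1.0, samplesize / total)


fraction = sampling_fraction(2_500, 10_000)
```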
- sample_percent(percent, seed=-1)
Down-sample positions in place by a fixed percentage.
- Parameters:
percent (float) – Fraction of positions to keep on each strand, between 0 and 1 (inclusive).
seed (int, optional) – Seed forwarded to NumPy’s RNG; a negative value uses global state.
Notes
Sampling is performed independently for plus and minus strands by shuffling each array, resizing to the requested fraction, and restoring sort order. Aggregate counters total and length are refreshed.
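The shuffle, truncate, and re-sort sequence described in the notes can be sketched for a single strand with the standard `random` module. The helper name `downsample` is hypothetical, and Python's RNG stands in for NumPy's, so the selected positions will differ from what MACS3 would pick for the same seed.

```python
import random


def downsample(positions, percent, seed=-1):
    """Keep about `percent` of the positions, returned in ascending order."""
    if not 0.0 <= percent <= 1.0:
        raise ValueError("percent must lie between 0 and 1")
    # A negative seed falls back to the module-level (global) RNG state.
    rng = random.Random(seed) if seed >= 0 else random
    shuffled = list(positions)
    rng.shuffle(shuffled)
    keep = round(len(shuffled) * percent)
    return sorted(shuffled[:keep])


positions = list(range(0, 100, 10))
sample = downsample(positions, 0.5, seed=7)
```

Applying the same function separately to the plus and minus lists mirrors the per-strand independence noted above.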
- set_rlengths(rlengths)
Attach reference chromosome lengths to the track.
- Parameters:
rlengths (dict) – Mapping from chromosome name (bytes) to reference length.
- Returns:
True when the length mapping has been updated.
- Return type:
bool
Notes
Any chromosome stored in the track but missing from rlengths is assigned INT_MAX so downstream bounds checks can succeed.
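The fallback described in the notes can be sketched as a dictionary fill. The helper name `fill_rlengths` is hypothetical, and the value used for `INT_MAX` (the maximum of a signed 32-bit integer) is an assumption about the constant this module uses.

```python
INT_MAX = 2**31 - 1  # assumed: maximum of a signed 32-bit integer


def fill_rlengths(track_chroms, rlengths):
    """Return a copy of `rlengths` in which every track chromosome has a length."""
    filled = dict(rlengths)
    for chrom in track_chroms:
        filled.setdefault(chrom, INT_MAX)
    return filled


lengths = fill_rlengths([b"chr1", b"chr2"], {b"chr1": 1000})
```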
- sort()
Sort per-strand coordinate arrays for every chromosome.
Positions are ordered ascending on each strand and the is_sorted flag is set to True once sorting completes.
- total = None
- MACS3.Signal.FixWidthTrack.bool(*args, **kwargs)
- MACS3.Signal.FixWidthTrack.left_forward(data, pos, window_size)
- Return type:
typedef
- MACS3.Signal.FixWidthTrack.left_sum(data, pos, width)
- Return type:
typedef
- MACS3.Signal.FixWidthTrack.right_forward(data, pos, window_size)
- Return type:
typedef
- MACS3.Signal.FixWidthTrack.right_sum(data, pos, width)
- Return type:
typedef