MACS3.Signal.Pileup module

Module Description: Functions for computing pileup signal.

This code is free software; you can redistribute it and/or modify it under the terms of the BSD License (see the file LICENSE included with the distribution).

MACS3.Signal.Pileup.clean_up_ndarray(x)

Clean up a NumPy array in two steps.

MACS3.Signal.Pileup.fix_coordinates(poss, rlength)

Fix the coordinates.

Return type:

_fake_callable

MACS3.Signal.Pileup.free(*args, **kwargs)

MACS3.Signal.Pileup.mean(a, b)

Return type:

float

MACS3.Signal.Pileup.naive_call_peaks(pv_array, min_v, max_v=1e+30, max_gap=50, min_length=200)

Identify peak summits from a [p, v] array.

Parameters:
  • pv_array (list) – [p, v] array as produced by pileup functions.

  • min_v (typedef) – Minimum value to be considered part of a peak.

  • max_v (typedef) – Maximum allowed summit height. Default is 1e30.

  • max_gap (typedef) – Maximum gap (in bp) allowed between adjacent peak segments to be merged. Default is 50.

  • min_length (typedef) – Minimum peak length (in bp). Default is 200.

Returns:

List of (summit, height) tuples.

Return type:

list
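
The parameters above suggest a three-stage procedure: keep segments that reach min_v, merge segments separated by at most max_gap, then keep merged regions of at least min_length. The following is a simplified pure-Python sketch of that idea, not the MACS3 implementation. The function name call_peaks_sketch, the choice of the midpoint of the highest segment as the summit, and the inclusive boundary comparisons are all assumptions made for illustration.

```python
def call_peaks_sketch(pv_array, min_v, max_v=1e30, max_gap=50, min_length=200):
    """Illustrative peak caller over a [p, v] array (hypothetical helper)."""
    ps, vs = pv_array

    # 1. Collect (start, end, value) segments whose value reaches min_v.
    #    The value v[x] applies to the interval from p[x-1] to p[x].
    segments = []
    prev_end = 0
    for end, value in zip(ps, vs):
        if value >= min_v:
            segments.append((prev_end, end, value))
        prev_end = end

    # 2. Group segments whose gap to the previous segment is <= max_gap.
    groups = []
    for seg in segments:
        if groups and seg[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(seg)
        else:
            groups.append([seg])

    # 3. Keep long-enough groups; report the midpoint of the highest segment
    #    as the summit (assumption), skipping summits taller than max_v.
    peaks = []
    for group in groups:
        if group[-1][1] - group[0][0] < min_length:
            continue
        start, end, height = max(group, key=lambda s: s[2])
        if height <= max_v:
            peaks.append(((start + end) // 2, height))
    return peaks


pv = [[100, 250, 400, 1000], [0.0, 5.0, 8.0, 0.0]]
print(call_peaks_sketch(pv, min_v=2.0))
```

Here the two segments above min_v are adjacent, so they form one 300 bp region whose highest segment spans 250 to 400.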

MACS3.Signal.Pileup.naive_quick_pileup(sorted_poss, extension)

Simple pileup by extending each tag symmetrically.

Each input position is extended left and right by extension. The input must be sorted; no sorting or validation is performed.

Parameters:
  • sorted_poss (_fake_callable) – Sorted tag positions.

  • extension (int) – Extension size applied to both sides.

Returns:

[p, v] where p is an int32 numpy array of end positions and v is a float32 numpy array of values.

Return type:

list
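
To make the symmetric extension and the [p, v] output format concrete, here is a small pure-Python sketch that computes the same kind of coverage with a difference array and then run-length encodes it. It is an illustration only and is not how MACS3 computes the pileup. The name symmetric_pileup_sketch, the half-open interval convention, and the clipping of negative starts to 0 are assumptions.

```python
def symmetric_pileup_sketch(sorted_poss, extension):
    """Coverage from symmetric tag extension (hypothetical helper)."""
    if not sorted_poss:
        return [[], []]

    # Assumption: a tag at x covers [x - extension, x + extension), clipped at 0.
    starts = [max(x - extension, 0) for x in sorted_poss]
    ends = [x + extension for x in sorted_poss]

    # Difference array: +1 where a fragment starts, -1 where it ends.
    size = max(ends)
    diff = [0] * (size + 1)
    for s, e in zip(starts, ends):
        diff[s] += 1
        diff[e] -= 1

    # Prefix-sum into per-base depth, then run-length encode into [p, v]:
    # p holds the end position of each run, v holds the depth of that run.
    p, v = [], []
    depth = 0
    for pos in range(size):
        depth += diff[pos]
        if v and v[-1] == float(depth):
            p[-1] = pos + 1
        else:
            p.append(pos + 1)
            v.append(float(depth))
    return [p, v]


print(symmetric_pileup_sketch([10, 15], 5))
```

The difference array needs memory proportional to the largest coordinate, which is acceptable for a toy example but is one reason a sweep over sorted positions is preferable on real chromosomes.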

MACS3.Signal.Pileup.over_two_pv_array(pv_array1, pv_array2, func='max')

Merge two [p, v] arrays with a pointwise reducer.

Parameters:
  • pv_array1 (list) – First [p, v] array, in the same format as the output of quick_pileup.

  • pv_array2 (list) – Second [p, v] array, in the same format as the output of quick_pileup.

  • func (str) – Reducer for overlapping regions. One of "max", "min", or "mean". Default is "max".

Returns:

Merged [p, v] array with the reducer applied to overlap regions.

Return type:

list
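
A pointwise merge of two [p, v] arrays can be pictured as walking both arrays in parallel and emitting one output segment at every boundary from either input. The sketch below shows this in pure Python with plain lists. The name merge_pv_sketch is hypothetical, and stopping when either input is exhausted (so only the shared range is covered) is an assumption based on the phrase "overlap regions" above.

```python
def merge_pv_sketch(pv1, pv2, func="max"):
    """Merge two [p, v] arrays with a pointwise reducer (hypothetical helper)."""
    reducers = {
        "max": max,
        "min": min,
        "mean": lambda a, b: (a + b) / 2.0,
    }
    reduce_pair = reducers[func]

    p1, v1 = pv1
    p2, v2 = pv2
    i = j = 0
    out_p, out_v = [], []
    while i < len(p1) and j < len(p2):
        # The next output segment ends at the nearer of the two boundaries.
        end = min(p1[i], p2[j])
        out_p.append(end)
        out_v.append(reduce_pair(v1[i], v2[j]))
        # Advance whichever input(s) ended exactly at this boundary.
        advance_first = p1[i] == end
        advance_second = p2[j] == end
        if advance_first:
            i += 1
        if advance_second:
            j += 1
    return [out_p, out_v]


a = [[10, 20], [1.0, 3.0]]
b = [[15, 20], [2.0, 0.0]]
print(merge_pv_sketch(a, b, "max"))
```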

MACS3.Signal.Pileup.pileup_and_write_pe(petrackI, output_filename, scale_factor=1, baseline_value=0.0)

Pile up a paired-end track and write a bedGraph file.

This is a thin Cython wrapper that computes pileup using the C-accelerated routines in cPosValCalculation.

Parameters:
  • petrackI – Paired-end track object (PETrackI). Must provide get_rlengths() and get_locations_by_chr(chrom).

  • output_filename (bytes) – Output bedGraph path as bytes.

  • scale_factor (float) – Scalar applied to pileup values. Default is 1.

  • baseline_value (float) – Minimum output value per bin. Default is 0.0.

Return type:

None

Notes

This function is currently only used by the macs3 pileup command.
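
For orientation, a bedGraph row has the form chrom, start, end, value. The sketch below shows how a [p, v] result of the kind returned by the pileup functions in this module could be rendered as bedGraph lines. It is not the code path used by this function, which writes the file from C routines. The helper name pv_to_bedgraph_lines, the first interval starting at 0, and the five-decimal value format are assumptions.

```python
def pv_to_bedgraph_lines(chrom, pv_array):
    """Render a [p, v] array as bedGraph text lines (hypothetical helper)."""
    ps, vs = pv_array
    lines = []
    start = 0  # assumption: the first interval starts at coordinate 0
    for end, value in zip(ps, vs):
        # Each end position closes the interval opened by the previous one.
        lines.append(f"{chrom}\t{start}\t{end}\t{value:.5f}")
        start = end
    return lines


for line in pv_to_bedgraph_lines("chr1", [[10, 40], [0.0, 1.0]]):
    print(line)
```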

Examples

pileup_and_write_pe(
    petrackI,
    b"out.bedGraph",
    scale_factor=1.0,
    baseline_value=0.0,
)

MACS3.Signal.Pileup.pileup_and_write_se(trackI, output_filename, d, scale_factor, baseline_value=0.0, directional=True, halfextension=True)

Pile up a single-end track and write a bedGraph file.

This is a thin Cython wrapper that computes pileup using the C-accelerated routines in cPosValCalculation.

Parameters:
  • trackI – Single-end track object (FWTrackI).

  • output_filename (bytes) – Output bedGraph path.

  • d (typedef) – Fragment length estimate.

  • scale_factor (typedef) – Scalar applied to pileup values.

  • baseline_value (float) – Minimum output value per bin. Default is 0.0.

  • directional (_fake_callable) – If True, extend reads only in the 3’ direction. If False, extend in both directions. Default is True.

  • halfextension (_fake_callable) – If True, compute shift values from d using the half-extension scheme. Default is True.

Return type:

None

Notes

This function is currently only used by the macs3 pileup command.

MACS3.Signal.Pileup.quick_pileup(start_poss, end_poss, scale_factor, baseline_value)

Compute pileup from fragment start/end positions.

The algorithm, proposed by Jie Wang, is a fast sweep over sorted start and end positions. It returns a [p, v] array compatible with bedGraph semantics.

Parameters:
  • start_poss (_fake_callable) – Sorted fragment start positions.

  • end_poss (_fake_callable) – Sorted fragment end positions.

  • scale_factor (typedef) – Scalar applied to pileup values.

  • baseline_value (typedef) – Minimum output value per bin.

Returns:

[p, v] where p is an int32 numpy array of end positions and v is a float32 numpy array of values.

Return type:

list
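
The sweep can be illustrated in pure Python: walk the sorted starts and ends together, raise the depth at each start, lower it at each end, and emit a segment whenever the position advances. This is a sketch of the general technique with plain lists, not the Cython implementation. The name sweep_pileup_sketch, the treatment of baseline_value as a floor, and the merging of adjacent equal values are assumptions.

```python
def sweep_pileup_sketch(start_poss, end_poss, scale_factor=1.0, baseline_value=0.0):
    """Sweep-line pileup over sorted starts and ends (hypothetical helper)."""
    p, v = [], []
    i = j = 0
    depth = 0
    prev = 0
    n_starts, n_ends = len(start_poss), len(end_poss)
    while j < n_ends:
        # Pick the next boundary; on ties, ends are processed before starts.
        if i < n_starts and start_poss[i] < end_poss[j]:
            pos, step = start_poss[i], 1
            i += 1
        else:
            pos, step = end_poss[j], -1
            j += 1
        if pos > prev:
            # Close the interval (prev, pos] with the depth seen so far.
            value = max(depth * scale_factor, baseline_value)
            if v and v[-1] == value:
                p[-1] = pos  # extend the previous run instead of duplicating it
            else:
                p.append(pos)
                v.append(value)
            prev = pos
        depth += step
    return [p, v]


print(sweep_pileup_sketch([10, 50, 100], [40, 90, 140]))
```

Because the starts and ends are each consumed once, the sweep runs in time linear in the number of fragments and never needs a per-base array.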

Examples

import numpy as np
from MACS3.Signal.Pileup import quick_pileup

starts = np.array([10, 50, 100], dtype="i4")
ends = np.array([40, 90, 140], dtype="i4")
p, v = quick_pileup(starts, ends, scale_factor=1.0, baseline_value=0.0)

MACS3.Signal.Pileup.se_all_in_one_pileup(plus_tags, minus_tags, five_shift, three_shift, rlength, scale_factor, baseline_value)

Return the pileup for a single-end library, given the 5’ ends of fragments on the plus and minus strands separately and the shifts in both directions needed to recover each fragment. This function is for single-end sequencing libraries only; for paired-end libraries, use quick_pileup directly.

It uses a fast, simple algorithm proposed by Jie Wang, which takes sorted start and end positions and computes the pileup values from them.

The result has a structure similar to bedGraph and consists of two arrays, [end positions, values], referred to elsewhere in MACS3 as a “[p, v] array”.

The two arrays have the same length and are matched by index: the value v[x] applies to the interval from p[x-1] to p[x].

Parameters:
  • plus_tags (_fake_callable) – Sorted 5’ end positions on the plus strand.

  • minus_tags (_fake_callable) – Sorted 5’ end positions on the minus strand.

  • five_shift (typedef) – Shift applied toward the 5’ direction.

  • three_shift (typedef) – Shift applied toward the 3’ direction.

  • rlength (typedef) – Chromosome length; coordinates are clipped to [0, rlength].

  • scale_factor (typedef) – Scalar applied to pileup values.

  • baseline_value (typedef) – Minimum output value per bin.

Returns:

[p, v] where p is an int32 numpy array of end positions and v is a float32 numpy array of values.

Return type:

list
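
The way the two shifts recover a fragment from a 5’ end depends on the strand: on the plus strand the 5’ direction points toward lower coordinates, and on the minus strand it points toward higher coordinates. The sketch below derives fragment starts and ends under that reading of the parameter descriptions and clips them to [0, rlength]. The helper name fragment_bounds_sketch and the exact arithmetic are assumptions and should be checked against the MACS3 source.

```python
def fragment_bounds_sketch(plus_tags, minus_tags, five_shift, three_shift, rlength):
    """Derive sorted fragment starts and ends from 5' tags (hypothetical helper)."""
    # Plus strand: 5' is to the left, 3' is to the right.
    starts = [x - five_shift for x in plus_tags]
    ends = [x + three_shift for x in plus_tags]
    # Minus strand: 3' is to the left, 5' is to the right.
    starts += [x - three_shift for x in minus_tags]
    ends += [x + five_shift for x in minus_tags]

    def clip(x):
        # Keep coordinates inside the chromosome.
        return min(max(x, 0), rlength)

    # Starts and ends are sorted independently, which is all a sweep needs.
    return sorted(clip(s) for s in starts), sorted(clip(e) for e in ends)


print(fragment_bounds_sketch([10], [200], five_shift=0, three_shift=100, rlength=1000))
```

With five_shift=0 and three_shift=100, the plus-strand tag at 10 yields the fragment 10 to 110 and the minus-strand tag at 200 yields 100 to 200. A negative five_shift, as in the example that follows, moves the fragment start toward the 3’ side of the tag.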

Examples

import numpy as np
from MACS3.Signal.Pileup import se_all_in_one_pileup

plus = np.array([10, 50, 100], dtype="i4")
minus = np.array([30, 80], dtype="i4")
p, v = se_all_in_one_pileup(
    plus,
    minus,
    five_shift=-25,
    three_shift=75,
    rlength=1000,
    scale_factor=1.0,
    baseline_value=0.0,
)