MACS3.Signal.PileupV2 module

Module Description:

A new pileup algorithm based on the p-v array idea. It is modified from the original algorithm in MACS2 proposed by Jie Wang. It now allows different weights for different genomic ranges, and the approach is simplified. The basic idea is to remember, for each position, how the 'pileup' value changes while walking along the chromosome.

For a genomic range r_i covering genomic positions from s_i to e_i to be piled up, we assign the start position a value w_i, and the end -w_i, so we will use a tuple of position and weight to remember these operations: (s_i, w_i) and (e_i, -w_i). Then all N ranges will be made into an array (2D) of position and weights as:

PV = [ (s_0, w_0), (e_0, -w_0), (s_1, w_1), (e_1, -w_1), …, (s_i, w_i), (e_i, -w_i), …, (s_{N-1}, w_{N-1}), (e_{N-1}, -w_{N-1}) ]

The array PV is then sorted by its first dimension, the position, regardless of whether a position came from the start or the end of a range.

PV_sorted = [ (p_0, v_0), (p_1, v_1), …, (p_i, v_i), …, (p_{2N-1}, v_{2N-1}) ]
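The construction and sorting of the PV array can be sketched in plain Python as follows. This is only an illustration of the idea described above, not the MACS3 implementation; the helper name `build_sorted_pv` and the `(start, end, weight)` input layout are assumptions made for the example.

```python
# Hypothetical helper (not part of MACS3): turn weighted ranges into a
# position-sorted list of (position, weight change) pairs.
def build_sorted_pv(ranges):
    pv = []
    for start, end, weight in ranges:
        pv.append((start, weight))   # pileup rises by `weight` at the start
        pv.append((end, -weight))    # and drops by `weight` at the end
    # Sort on position only; sorted() is stable, so ties keep input order.
    return sorted(pv, key=lambda item: item[0])


# Two overlapping ranges with different weights.
ranges = [(10, 30, 1.0), (20, 40, 2.0)]
pv_sorted = build_sorted_pv(ranges)
```

For N input ranges the result always holds 2N pairs, one for each start and one for each end.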

The pileup algorithm to produce a bedGraph style pileup (another p-v array as in Pileup.py) can be simply described as:

Set the initial pileup z to 0 or a given value, set the start position s to 0, and leave the end position e undefined.

for i from 0 to 2N-1 in PV_sorted:

    1: e = p_i
    2: if e differs from s, save that the pileup from position s to e is z; in bedGraph style, only (e, z) is saved
    3: z = z + v_i
    4: s = e
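The walk over the sorted PV array can be sketched as below. It is a minimal illustration under stated assumptions, not the MACS3 code: the function name `pileup_sorted_pv` is hypothetical, the input is a plain list of `(position, weight change)` tuples, and merging adjacent intervals with equal values is a convenience added for the example. The sketch records the value for the interval ending at p before applying the change at p, so that each stored value is the pileup over the interval from s up to p.

```python
# Hypothetical sketch (not the MACS3 implementation) of the pileup walk.
def pileup_sorted_pv(pv_sorted, initial=0.0):
    z = initial      # running pileup value
    s = 0            # start of the interval currently being measured
    pileup = []      # bedGraph-style list of (end position, value)
    for p, v in pv_sorted:
        if p != s:
            # The interval [s, p) has value z. Extend the previous
            # entry if the value is unchanged, otherwise add a new one.
            if pileup and pileup[-1][1] == z:
                pileup[-1] = (p, z)
            else:
                pileup.append((p, z))
            s = p
        z += v       # apply the change that takes effect at position p
    return pileup


result = pileup_sorted_pv([(10, 1.0), (20, 2.0), (30, -1.0), (40, -2.0)])
```

Read with an implicit start at 0, `result` says the pileup is 0 up to position 10, 1 up to 20, 3 up to 30, and 2 up to 40.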

This code is free software; you can redistribute it and/or modify it under the terms of the BSD License (see the file LICENSE included with the distribution).

MACS3.Signal.PileupV2.clean_up_ndarray(x)

Clean up numpy array in two steps

MACS3.Signal.PileupV2.make_PV_from_LR(LR_array, mapping_func=<function mapping_function_always_1>)

Make a sorted PV array from an LR array for a given chromosome in a PETrackI object. The V/weight is assigned as mapping_func( L, R ), or simply 1 if mapping_func is the default.

LR array is an np.ndarray with dtype=[('l','i4'),('r','i4')] and length N

PV array is an np.ndarray with dtype=[('p','i4'),('v','f4')] and length 2N
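The relationship between the two dtypes can be sketched with NumPy structured arrays. This is an illustrative reimplementation under assumptions, not the MACS3 code: `make_pv_sketch` and `always_one` are hypothetical names standing in for the documented function and its default mapping function.

```python
import numpy as np


def always_one(l, r):
    # Hypothetical stand-in for the default mapping: every range weighs 1.
    return 1.0


def make_pv_sketch(lr_array, mapping_func=always_one):
    # Hypothetical sketch: build a position-sorted PV array from an LR array.
    n = lr_array.shape[0]
    pv = np.zeros(2 * n, dtype=[("p", "i4"), ("v", "f4")])
    for i in range(n):
        l = int(lr_array["l"][i])
        r = int(lr_array["r"][i])
        w = mapping_func(l, r)
        pv[2 * i] = (l, w)        # weight added at the left position
        pv[2 * i + 1] = (r, -w)   # weight removed at the right position
    pv.sort(order="p")            # sort in place by position
    return pv


lr = np.array([(10, 30), (20, 40)], dtype=[("l", "i4"), ("r", "i4")])
pv = make_pv_sketch(lr)
```

Each LR record contributes exactly two PV records, which is why the output length is 2N.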


MACS3.Signal.PileupV2.make_PV_from_LRC(LRC_array, mapping_func=<function mapping_function_always_1>)

Make a sorted PV array from an LRC array for a given chromosome in a PETrackII object. The V/weight is assigned as mapping_func( L, R ), or simply 1 if mapping_func is the default.

LRC array is an np.ndarray with dtype=[('l','i4'),('r','i4'),('c','u2')] and length N

PV array is an np.ndarray with dtype=[('p','i4'),('v','f4')] and length 2N


MACS3.Signal.PileupV2.make_PV_from_PN(P_array, N_array, extsize)

Make a sorted PV array from two arrays for a given chromosome in a FWTrack object. P_array holds the 5' end positions on the plus strand, and N_array those on the minus strand. Weights are not supported in this case, since every position is extended by a fixed 'extsize'.

P_array or N_array is an np.ndarray with dtype='i4'

PV array is an np.ndarray with dtype=[('p','i4'),('v','f4')] with length of 2N
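A rough sketch of how fixed-size extension could feed the same PV layout is shown below. The extension rule is an assumption made for illustration, since the text above does not spell it out: plus-strand positions are extended downstream to p + extsize, minus-strand positions are extended upstream to n - extsize (clipped at 0), and every extended read gets weight 1. The function name `make_pv_from_pn_sketch` is hypothetical.

```python
import numpy as np


def make_pv_from_pn_sketch(p_array, n_array, extsize):
    # ASSUMPTION: plus-strand reads cover [p, p + extsize) and
    # minus-strand reads cover [max(n - extsize, 0), n).
    starts = np.concatenate([p_array, np.maximum(n_array - extsize, 0)])
    ends = np.concatenate([p_array + extsize, n_array])
    k = starts.shape[0]
    pv = np.zeros(2 * k, dtype=[("p", "i4"), ("v", "f4")])
    pv["p"] = np.concatenate([starts, ends])
    pv["v"] = np.concatenate([np.ones(k), -np.ones(k)])  # fixed weight of 1
    pv.sort(order="p")  # sort in place by position
    return pv


plus = np.array([100], dtype="i4")
minus = np.array([300], dtype="i4")
pv_pn = make_pv_from_pn_sketch(plus, minus, 50)
```

Under these assumptions, the read at 100 contributes the pair (100, +1), (150, -1) and the read at 300 contributes (250, +1), (300, -1).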


MACS3.Signal.PileupV2.mapping_function_always_1(L, R)

MACS3.Signal.PileupV2.pileup_PV(PV_array)

Pile up a sorted PV array into a bedGraph-style pileup (another p-v array, as in Pileup.py). The algorithm can be described as:

Set the initial pileup z to 0 or a given value, set the start position s to 0, and leave the end position e undefined.

for i from 0 to 2N-1 in PV_sorted:

    1: e = p_i
    2: if e differs from s, save that the pileup from position s to e is z; in bedGraph style, only (e, z) is saved
    3: z = z + v_i
    4: s = e

MACS3.Signal.PileupV2.pileup_from_LR(LR_array, mapping_func=<function mapping_function_always_1>)

This function piles up an ndarray containing left and right positions, typically taken from a PETrackI object. It is useful when only the pileup of a single chromosome is needed.

The user needs to provide a numpy array of left and right positions, with dtype=[('l','i4'),('r','i4')]. The user may also provide a mapping function that maps each pair of left and right positions to a weight; the default, mapping_function_always_1, assigns a weight of 1 to every range.
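As an illustration of what a custom mapping function might look like, the sketch below weights fragments by their length. The function name `short_fragment_weight`, the 150 bp cutoff, and the weights 1.0 and 0.5 are all hypothetical choices made for the example; only the `(L, R)` calling convention comes from the signature above.

```python
import numpy as np


def short_fragment_weight(l, r, cutoff=150):
    # Hypothetical mapping function: full weight for fragments no longer
    # than `cutoff` bases, half weight for longer ones.
    return 1.0 if (r - l) <= cutoff else 0.5


lr = np.array([(100, 220), (150, 400)], dtype=[("l", "i4"), ("r", "i4")])

# Weight that each fragment would contribute to the pileup.
weights = [short_fragment_weight(int(l), int(r))
           for l, r in zip(lr["l"], lr["r"])]

# With MACS3 installed, such a function would presumably be passed as:
#   pileup_from_LR(lr, mapping_func=short_fragment_weight)
```

Here the 120-base fragment gets weight 1.0 and the 250-base fragment gets weight 0.5.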


MACS3.Signal.PileupV2.pileup_from_LRC(LRC_array, mapping_func=<function mapping_function_always_1>)

This function piles up an ndarray containing left and right positions and counts, typically taken from a PETrackII object. It is useful when only the pileup of a single chromosome is needed.

The user needs to provide a numpy array of left and right positions and counts, with dtype=[('l','i4'),('r','i4'),('c','u2')]. The user may also provide a mapping function that maps each pair of left and right positions to a weight; the default is mapping_function_always_1.


MACS3.Signal.PileupV2.pileup_from_LR_hmmratac(LR_array, mapping_dict)

MACS3.Signal.PileupV2.pileup_from_PN(P_array, N_array, extsize)

This function piles up ndarrays containing the plus (positive) and minus (negative) strand positions of all reads, typically taken from a FWTrackI object. It is useful when only the pileup of a single chromosome is needed.
