MACS3.Signal.Region module

Regions class

class MACS3.Signal.Region.Regions

Bases: object

Container for genomic regions organized by chromosome.

This class stores genomic regions (intervals) organized by chromosome and provides operations for region manipulation including sorting, merging overlaps, expanding, and set operations like intersection and exclusion.

regions

Dictionary mapping chromosome names (bytes) to lists of region tuples (start, end). Regions are 0-based, half-open intervals [start, end).

Type:

dict

total

Total count of regions across all chromosomes.

Type:

int
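The layout described above can be pictured with a plain dictionary. This is a minimal sketch rather than MACS3 code; the names `layout`, `length`, and `n_regions` are illustrative only.

```python
# Illustrative stand-in for the `regions` attribute: bytes chromosome
# names mapped to lists of (start, end) tuples.
layout = {
    b"chr1": [(1000, 2000), (3000, 4000)],
    b"chr2": [(5000, 6000)],
}

# With 0-based, half-open intervals the length of a region is end - start.
start, end = layout[b"chr1"][0]
length = end - start  # 1000 bases: positions 1000 through 1999

# `total` corresponds to the number of tuples summed over all chromosomes.
n_regions = sum(len(intervals) for intervals in layout.values())
print(length, n_regions)  # 1000 3
```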

add_loc(chrom, start, end)

Add a region to a specific chromosome.

Adds a single region interval to the specified chromosome. The region is appended without re-sorting; call sort() afterwards to restore order.

Parameters:
  • chrom (bytes) – Chromosome name.

  • start (int) – Start position (0-based, inclusive).

  • end (int) – End position (0-based, exclusive).

Returns:

None

exclude(regions_object2)

Remove regions that overlap with another Regions object.

Removes all regions from this object that overlap with regions in regions_object2. Regions present only in this object are kept. The object is modified in place.

Parameters:

regions_object2 (Regions) – Another Regions object defining regions to exclude.

Returns:

Modifies the object in place.

Return type:

None

Examples

from MACS3.Signal.Region import Regions
r1 = Regions()
r1.add_loc(b'chr1', 1000, 3000)
r1.add_loc(b'chr1', 5000, 7000)

r2 = Regions()
r2.add_loc(b'chr1', 500, 3500)  # Covers the first region entirely

r1.exclude(r2)
print(r1.regions[b'chr1'])
# Output: [(5000, 7000)]
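The description above can be sketched on plain dictionaries. This is not the MACS3 implementation: `overlaps` and `exclude_sketch` are hypothetical helpers, and the sketch assumes the reading in which an interval is dropped whole as soon as it overlaps anything in the second set. It uses a simple quadratic scan for clarity.

```python
def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def exclude_sketch(regions1, regions2):
    """Return regions1 without any interval that overlaps regions2."""
    kept = {}
    for chrom, intervals in regions1.items():
        blockers = regions2.get(chrom, [])
        kept[chrom] = [iv for iv in intervals
                       if not any(overlaps(iv, b) for b in blockers)]
    return kept

r1 = {b"chr1": [(1000, 3000), (5000, 7000)], b"chr2": [(10, 20)]}
r2 = {b"chr1": [(500, 3500)]}
result = exclude_sketch(r1, r2)
print(result)  # {b'chr1': [(5000, 7000)], b'chr2': [(10, 20)]}
```

Unlike the method documented here, the sketch returns a new dictionary instead of modifying its input.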

expand(flanking)

Expand all regions by a fixed distance in both directions.

Extends each region by ‘flanking’ base pairs on both sides: start positions move left and end positions move right. A start position that would become negative is set to 0. The regions are automatically re-sorted after expansion.

Parameters:

flanking (int) – Number of base pairs to expand in each direction.

Returns:

Modifies the object in place.

Return type:

None

Examples

from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.expand(100)
print(regions.regions[b'chr1'])  # Output: [(900, 2100)]

# With capping at 0 for start positions
regions2 = Regions()
regions2.add_loc(b'chr1', 50, 150)
regions2.expand(100)
print(regions2.regions[b'chr1'])
# Output: [(0, 250)]  # Start capped at 0
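The arithmetic behind both examples can be sketched for a single chromosome's interval list. `expand_sketch` is a hypothetical helper, not part of MACS3. The end position is left unclamped because the description only mentions a lower bound.

```python
def expand_sketch(intervals, flanking):
    """Widen each (start, end) by `flanking`, never letting start drop below 0."""
    widened = [(max(0, start - flanking), end + flanking)
               for start, end in intervals]
    return sorted(widened)

print(expand_sketch([(1000, 2000)], 100))  # [(900, 2100)]
print(expand_sketch([(50, 150)], 100))     # [(0, 250)]
```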

get_chr_names()

Get all chromosome names present in the Regions object.

Returns:

Set of chromosome names (bytes) present in regions.

Return type:

set

Examples

from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 3000, 4000)
regions.add_loc(b'chr2', 5000, 6000)
print(regions.get_chr_names())
# Output: {b'chr1', b'chr2'}

init_from_PeakIO(peaks)

Initialize the object with a PeakIO object.

Note: I intentionally forgot to check if peakio is actually a PeakIO…

intersect(regions_object2)

Get the intersection with another Regions object.

Returns a new Regions object containing only the regions that overlap between this object and regions_object2. For chromosomes present only in this object, all regions are included unchanged.

Parameters:

regions_object2 (Regions) – Another Regions object to intersect with.

Returns:

New object containing intersecting regions.

Return type:

Regions

Examples

from MACS3.Signal.Region import Regions
r1 = Regions()
r1.add_loc(b'chr1', 1000, 3000)

r2 = Regions()
r2.add_loc(b'chr1', 2000, 4000)

result = r1.intersect(r2)
print(result.regions[b'chr1'])  # Output: [(2000, 3000)]
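One common way to compute such an intersection is a two-pointer sweep over sorted interval lists. The sketch below is an assumption about how the behavior could be reproduced, not the MACS3 implementation; `intersect_sorted` and `intersect_sketch` are hypothetical names. It follows the description above in carrying over chromosomes that appear only in the first object.

```python
def intersect_sorted(a, b):
    """Intersect two sorted, non-overlapping lists of half-open intervals."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two intervals share at least one base
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def intersect_sketch(regions1, regions2):
    result = {}
    for chrom, intervals in regions1.items():
        if chrom not in regions2:
            # Per the description: chromosomes present only in the first
            # object are included unchanged.
            result[chrom] = list(intervals)
        else:
            result[chrom] = intersect_sorted(intervals, regions2[chrom])
    return result

r1 = {b"chr1": [(1000, 3000)], b"chr2": [(10, 20)]}
r2 = {b"chr1": [(2000, 4000)]}
print(intersect_sketch(r1, r2))
# {b'chr1': [(2000, 3000)], b'chr2': [(10, 20)]}
```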

merge_overlap()

Merge overlapping or adjacent regions within each chromosome.

Combines regions that overlap or are adjacent. After merging, regions are re-sorted. This operation is idempotent and modifies the object in place. The total count is updated.

Returns:

True if merge was performed, False if already merged.

Return type:

bool

Examples

from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 1500, 3000)  # Overlaps with first
result = regions.merge_overlap()
print(regions.regions[b'chr1'])  # Output: [(1000, 3000)]
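The merge can be sketched for one chromosome's list as a single pass over sorted intervals. `merge_sketch` is a hypothetical helper, not MACS3 code, and it assumes "adjacent" means that one interval ends exactly where the next begins.

```python
def merge_sketch(intervals):
    """Merge overlapping or adjacent half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping (start < previous end) or adjacent (start == previous end):
            # extend the previous interval instead of adding a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged

once = merge_sketch([(1500, 3000), (1000, 2000), (3000, 3500), (5000, 6000)])
print(once)                        # [(1000, 3500), (5000, 6000)]
print(merge_sketch(once) == once)  # True: merging again changes nothing
```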

pop(n)

Remove and return the first n regions.

Removes the first n regions from the Regions object across chromosomes in sorted order and returns them as a new Regions object. The current object is modified.

Parameters:

n (int) – Number of regions to pop.

Returns:

New Regions object containing the first n regions.

Return type:

Regions

Examples

from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 3000, 4000)
regions.add_loc(b'chr2', 5000, 6000)
regions.sort()
first_two = regions.pop(2)
print(first_two.total)  # Output: 2
print(regions.total)    # Output: 1
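The same bookkeeping can be sketched on a plain dictionary. `pop_sketch` is a hypothetical helper, and walking chromosomes in sorted name order is an assumption about what "across chromosomes in sorted order" means, not something the description confirms.

```python
def pop_sketch(regions, n):
    """Remove the first n intervals, visiting chromosomes in sorted name order."""
    popped = {}
    remaining = n
    for chrom in sorted(regions):
        if remaining == 0:
            break
        take = min(remaining, len(regions[chrom]))
        if take:
            popped[chrom] = regions[chrom][:take]
            regions[chrom] = regions[chrom][take:]  # shrink the source in place
            remaining -= take
    return popped

regions = {b"chr1": [(1000, 2000), (3000, 4000)], b"chr2": [(5000, 6000)]}
first_two = pop_sketch(regions, 2)
print(sum(len(v) for v in first_two.values()))  # 2
print(sum(len(v) for v in regions.values()))    # 1
```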

sort()

Sort all regions within each chromosome by start position.

Sorts regions for each chromosome. This method is idempotent: calling it multiple times has the same effect as calling it once.

Returns:

None
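Idempotence can be checked on a plain dictionary. `sort_sketch` is a hypothetical stand-in, not MACS3 code. It relies on Python comparing tuples element by element, so intervals are ordered by start and ties are broken by end.

```python
def sort_sketch(regions):
    """Sort each chromosome's interval list in place."""
    for intervals in regions.values():
        intervals.sort()  # tuples compare by start first, then end

regions = {b"chr1": [(3000, 4000), (1000, 2000)], b"chr2": [(5000, 6000)]}
sort_sketch(regions)
after_first = {chrom: list(v) for chrom, v in regions.items()}
sort_sketch(regions)  # a second call leaves the data unchanged
print(regions == after_first)  # True
```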

total_length()

Calculate the total length covered by all regions.

Returns the sum of lengths of all regions across all chromosomes. Overlapping regions are first merged to avoid double-counting.

Returns:

Total nucleotide length covered by all regions.

Return type:

int
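The effect of merging before summing can be sketched as follows. `total_length_sketch` is a hypothetical helper, not MACS3 code. Instead of building merged intervals, it tracks how far each chromosome has already been covered and counts only the newly covered bases.

```python
def total_length_sketch(regions):
    """Count covered bases across all chromosomes, counting overlaps once."""
    total = 0
    for intervals in regions.values():
        covered_until = None
        for start, end in sorted(intervals):
            if covered_until is None or start >= covered_until:
                total += end - start          # no overlap with earlier intervals
                covered_until = end
            elif end > covered_until:
                total += end - covered_until  # count only the uncovered tail
                covered_until = end
    return total

regions = {b"chr1": [(1000, 2000), (1500, 3000)], b"chr2": [(0, 100)]}
print(total_length_sketch(regions))  # 2100, not 2600
```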

write_to_bed(fhd)

Write regions to a BED format file.

Parameters:

fhd – File handle opened for writing.

Examples

with open('output.bed', 'w') as f:
    regions.write_to_bed(f)
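The shape of the output can be sketched with an in-memory text buffer. `write_bed_sketch` is a hypothetical helper. The three-column, tab-separated BED layout, the decoding of bytes chromosome names, and the sorted chromosome order are assumptions, not details stated above.

```python
import io

def write_bed_sketch(regions, fhd):
    """Write one 'chrom<TAB>start<TAB>end' line per interval to a text handle."""
    for chrom in sorted(regions):
        for start, end in regions[chrom]:
            fhd.write(f"{chrom.decode()}\t{start}\t{end}\n")

buf = io.StringIO()
write_bed_sketch({b"chr1": [(1000, 2000)], b"chr2": [(5000, 6000)]}, buf)
print(buf.getvalue(), end="")
```

BED coordinates are also 0-based and half-open, so the (start, end) tuples are written without conversion.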