MACS3.Signal.Region module
Region class
- class MACS3.Signal.Region.Regions
Bases:
object
Container for genomic regions organized by chromosome.
This class stores genomic regions (intervals) organized by chromosome and provides operations for region manipulation including sorting, merging overlaps, expanding, and set operations like intersection and exclusion.
- regions
Dictionary mapping chromosome names (bytes) to lists of region tuples (start, end). Regions are 0-based, half-open intervals [start, end).
- Type:
dict
- total
Total count of regions across all chromosomes.
- Type:
int
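The layout described above can be mimicked with plain Python objects. The sketch below does not use the MACS3 class; `regions` and `total` are ordinary local variables standing in for the two attributes, and it only shows how 0-based, half-open coordinates behave.

```python
# Stand-in for the documented layout: chromosome name (bytes) -> list of (start, end).
# These are plain local variables, not attributes of a real Regions instance.
regions = {
    b"chr1": [(1000, 2000), (3000, 4000)],
    b"chr2": [(5000, 6000)],
}

# Total number of regions across all chromosomes.
total = sum(len(intervals) for intervals in regions.values())

# With half-open intervals the length is simply end - start.
lengths = [end - start for start, end in regions[b"chr1"]]

# Position `start` belongs to the region; position `end` does not.
start, end = regions[b"chr1"][0]
contains_start = start <= 1000 < end   # True: start is inclusive
contains_end = start <= 2000 < end     # False: end is exclusive

print(total, lengths, contains_start, contains_end)
```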
- add_loc(chrom, start, end)
Add a region to a specific chromosome.
Adds a single region interval to the specified chromosome. The region is added without immediate sorting. Call sort() to maintain order.
- Parameters:
chrom (bytes) – Chromosome name.
start (int) – Start position (0-based, inclusive).
end (int) – End position (0-based, exclusive).
- Returns:
None
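As a rough illustration of "added without immediate sorting", the sketch below appends to a plain dictionary. The function `add_loc` here is a hypothetical stand-alone helper operating on a dict, not the MACS3 method, and it assumes nothing about how the library stores the data internally.

```python
def add_loc(regions, chrom, start, end):
    """Append one half-open interval [start, end) to a chromosome's list.

    Hypothetical helper on a plain dict; no sorting is done here.
    """
    regions.setdefault(chrom, []).append((start, end))

regions = {}
add_loc(regions, b"chr1", 3000, 4000)
add_loc(regions, b"chr1", 1000, 2000)   # added later, but starts earlier

# Insertion order is preserved until an explicit sort.
print(regions[b"chr1"])                 # [(3000, 4000), (1000, 2000)]

regions[b"chr1"].sort()
print(regions[b"chr1"])                 # [(1000, 2000), (3000, 4000)]
```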
- exclude(regions_object2)
Remove regions that overlap with another Regions object.
Removes all regions from this object that overlap with regions in regions_object2. Regions present only in this object are kept. The object is modified in place.
- Parameters:
regions_object2 (Regions) – Another Regions object defining regions to exclude.
- Returns:
Modifies the object in place.
- Return type:
None
Examples
from MACS3.Signal.Region import Regions
r1 = Regions()
r1.add_loc(b'chr1', 1000, 3000)
r1.add_loc(b'chr1', 5000, 7000)
r2 = Regions()
r2.add_loc(b'chr1', 2000, 6000)  # Overlaps both regions
r1.exclude(r2)
print(r1.regions[b'chr1'])
# Output: [(1000, 2000), (6000, 7000)]
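The output shown in the example corresponds to trimming away the overlapped part of each region. The sketch below reproduces that result on plain lists of tuples. It is an illustration under that assumption only, not the MACS3 implementation, and `subtract` is a hypothetical helper name.

```python
def subtract(intervals, blockers):
    """Return the parts of `intervals` not covered by `blockers`.

    Both inputs are lists of half-open (start, end) tuples sorted by start.
    Hypothetical helper; assumes trimming semantics as in the example output.
    """
    result = []
    for start, end in intervals:
        cursor = start                      # left edge of the piece still uncovered
        for b_start, b_end in blockers:
            if b_end <= cursor:
                continue                    # blocker is entirely to the left
            if b_start >= end:
                break                       # blocker is entirely to the right
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result

kept = subtract([(1000, 3000), (5000, 7000)], [(2000, 6000)])
print(kept)   # [(1000, 2000), (6000, 7000)]
```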
- expand(flanking)
Expand all regions by a fixed distance in both directions.
Extends the start and end positions of each region by ‘flanking’ base pairs. Start positions are expanded leftward but capped at 0. The regions are automatically re-sorted after expansion.
- Parameters:
flanking (int) – Number of base pairs to expand in each direction.
- Returns:
Modifies the object in place.
- Return type:
None
Examples
from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.expand(100)
print(regions.regions[b'chr1'])
# Output: [(900, 2100)]
# With capping at 0 for start positions
regions2 = Regions()
regions2.add_loc(b'chr1', 50, 150)
regions2.expand(100)
print(regions2.regions[b'chr1'])
# Output: [(0, 250)]  # Start capped at 0
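The arithmetic behind the expansion can be sketched on a plain list of tuples. The function `expand` below is a hypothetical stand-alone helper that returns a new list rather than modifying an object in place; only the start is clamped, as described above.

```python
def expand(intervals, flanking):
    """Widen each half-open interval by `flanking` bp on both sides.

    Hypothetical helper: the start is clamped at 0, the end is left uncapped,
    and the result is returned sorted.
    """
    expanded = [(max(0, start - flanking), end + flanking)
                for start, end in intervals]
    return sorted(expanded)

print(expand([(1000, 2000)], 100))   # [(900, 2100)]
print(expand([(50, 150)], 100))      # [(0, 250)]
```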
- get_chr_names()
Get all chromosome names present in the Regions object.
- Returns:
Set of chromosome names (bytes) present in regions.
- Return type:
set
Examples
from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 3000, 4000)
regions.add_loc(b'chr2', 5000, 6000)
print(regions.get_chr_names())
# Output: {b'chr1', b'chr2'}
- init_from_PeakIO(peaks)
Initialize the object with a PeakIO object.
Note: I intentionally forgot to check if peakio is actually a PeakIO…
- intersect(regions_object2)
Get the intersection with another Regions object.
Returns a new Regions object containing only the regions that overlap between this object and regions_object2. For chromosomes present only in this object, all regions are included unchanged.
- Parameters:
regions_object2 (Regions) – Another Regions object to intersect with.
- Returns:
New object containing intersecting regions.
- Return type:
Regions
Examples
from MACS3.Signal.Region import Regions
r1 = Regions()
r1.add_loc(b'chr1', 1000, 3000)
r2 = Regions()
r2.add_loc(b'chr1', 2000, 4000)
result = r1.intersect(r2)
print(result.regions[b'chr1'])
# Output: [(2000, 3000)]
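For a single chromosome, the overlap computation can be sketched as a two-pointer sweep over two sorted lists. This is one common way to intersect intervals, shown on plain tuples; it is not necessarily how MACS3 does it, and `intersect` is a hypothetical helper name.

```python
def intersect(a, b):
    """Intersect two sorted lists of half-open (start, end) intervals."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:                 # non-empty overlap
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

print(intersect([(1000, 3000)], [(2000, 4000)]))   # [(2000, 3000)]
```

Because the intervals are half-open, two intervals that merely touch, such as (0, 10) and (10, 20), produce no overlap in this sketch.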
- merge_overlap()
Merge overlapping or adjacent regions within each chromosome.
Combines regions that overlap or are adjacent. After merging, regions are re-sorted. This operation is idempotent and modifies the object in place. The total count is updated.
- Returns:
True if merge was performed, False if already merged.
- Return type:
bool
Examples
from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 1500, 3000)  # Overlaps with first
result = regions.merge_overlap()
print(regions.regions[b'chr1'])
# Output: [(1000, 3000)]
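The merging step for one chromosome can be sketched as a single pass over the sorted intervals. The helper below is hypothetical and works on plain tuples; it assumes "adjacent" means that one interval ends exactly where the next begins.

```python
def merge_overlap(intervals):
    """Merge overlapping or touching half-open intervals.

    Hypothetical helper; returns a new sorted list instead of editing in place.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

once = merge_overlap([(1000, 2000), (1500, 3000)])
print(once)                          # [(1000, 3000)]
print(merge_overlap(once) == once)   # True: merging again changes nothing
```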
- pop(n)
Remove and return the first n regions.
Removes the first n regions from the Regions object across chromosomes in sorted order and returns them as a new Regions object. The current object is modified.
- Parameters:
n (int) – Number of regions to pop.
- Returns:
New Regions object containing the first n regions.
- Return type:
Regions
Examples
from MACS3.Signal.Region import Regions
regions = Regions()
regions.add_loc(b'chr1', 1000, 2000)
regions.add_loc(b'chr1', 3000, 4000)
regions.add_loc(b'chr2', 5000, 6000)
regions.sort()
first_two = regions.pop(2)
print(first_two.total)  # Output: 2
print(regions.total)  # Output: 1
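One way to picture popping across chromosomes is sketched below on a plain dictionary. It assumes chromosomes are visited in sorted name order and that chromosomes left empty are dropped; both are choices made for this sketch rather than confirmed library behavior, and `pop` is a hypothetical helper.

```python
def pop(regions, n):
    """Remove and return the first n intervals from a chrom -> list mapping.

    Hypothetical helper: walks chromosomes in sorted name order and
    modifies `regions` in place.
    """
    popped = {}
    for chrom in sorted(regions):
        if n <= 0:
            break
        taken = regions[chrom][:n]
        if taken:
            popped[chrom] = taken
        regions[chrom] = regions[chrom][len(taken):]
        n -= len(taken)
    # Drop chromosomes that became empty (a choice made for this sketch).
    for chrom in [c for c, ivs in regions.items() if not ivs]:
        del regions[chrom]
    return popped

regions = {b"chr1": [(1000, 2000), (3000, 4000)], b"chr2": [(5000, 6000)]}
first_two = pop(regions, 2)
print(sum(len(v) for v in first_two.values()))   # 2
print(sum(len(v) for v in regions.values()))     # 1
```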
- regions = None
- sort()
Sort all regions within each chromosome by start position.
Sorts regions for each chromosome. This method is idempotent: calling it multiple times has the same effect as calling it once.
- Returns:
None
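The idempotence claim can be checked on a plain dictionary. The helper `sort_regions` below is hypothetical and sorts each chromosome's list by start position with Python's stable `list.sort`; it is a sketch, not the library method.

```python
def sort_regions(regions):
    """Sort each chromosome's intervals by start position, in place."""
    for intervals in regions.values():
        intervals.sort(key=lambda interval: interval[0])

regions = {b"chr1": [(3000, 4000), (1000, 2000)], b"chr2": [(5000, 6000)]}

sort_regions(regions)
after_first = {chrom: list(ivs) for chrom, ivs in regions.items()}

sort_regions(regions)              # second call on already sorted data
print(regions[b"chr1"])            # [(1000, 2000), (3000, 4000)]
print(regions == after_first)      # True
```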
- total = None
- total_length()
Calculate the total length covered by all regions.
Returns the sum of lengths of all regions across all chromosomes. Overlapping regions are first merged to avoid double-counting.
- Returns:
Total nucleotide length covered by all regions.
- Return type:
int
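The effect of merging before summing can be seen in a small sketch on plain data. The function `total_length` below is a hypothetical stand-alone helper that folds the merge into the summation; it illustrates why overlapping bases are counted once, not how MACS3 computes the value.

```python
def total_length(regions):
    """Sum covered bases over all chromosomes, counting overlaps once."""
    total = 0
    for intervals in regions.values():
        current_start = current_end = None
        for start, end in sorted(intervals):
            if current_end is None or start > current_end:
                # Close the previous merged block, if any, and open a new one.
                if current_end is not None:
                    total += current_end - current_start
                current_start, current_end = start, end
            else:
                # Overlapping or touching: extend the current block.
                current_end = max(current_end, end)
        if current_end is not None:
            total += current_end - current_start
    return total

regions = {b"chr1": [(1000, 2000), (1500, 3000)], b"chr2": [(0, 100)]}
naive = sum(end - start for ivs in regions.values() for start, end in ivs)
covered = total_length(regions)
print(naive)     # 2600: the 500 overlapping bases are counted twice
print(covered)   # 2100
```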
- write_to_bed(fhd)
Write regions to a BED format file.
- Parameters:
fhd – File handle opened for writing.
Examples
# regions is an existing Regions object
with open('output.bed', 'w') as f:
    regions.write_to_bed(f)
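BED uses 0-based, half-open coordinates, so intervals stored as [start, end) can be written without adjustment. The sketch below writes to an in-memory buffer. It assumes a three-column, tab-separated layout, chromosome names decoded to text, and chromosomes written in sorted order; the exact output of the MACS3 method is not confirmed here, and this `write_to_bed` is a hypothetical stand-alone function.

```python
import io

def write_to_bed(regions, fhd):
    """Write chrom, start, end as tab-separated lines to a text file handle.

    Hypothetical helper; assumes a minimal three-column BED layout.
    """
    for chrom in sorted(regions):
        for start, end in regions[chrom]:
            fhd.write(f"{chrom.decode()}\t{start}\t{end}\n")

buf = io.StringIO()
write_to_bed({b"chr1": [(1000, 2000)], b"chr2": [(5000, 6000)]}, buf)
bed_text = buf.getvalue()
print(bed_text, end="")
```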