cool_seq_tool.mappers.feature_overlap
Module for getting feature (gene/exon) overlap
- class cool_seq_tool.mappers.feature_overlap.FeatureOverlap(seqrepo_access, mane_refseq_genomic_path=None, from_local=False)
The class for getting feature overlap
- __init__(seqrepo_access, mane_refseq_genomic_path=None, from_local=False)
Initialize the FeatureOverlap class. Loads the RefSeq data and stores it as a DataFrame (df).
- Parameters:
seqrepo_access (SeqRepoAccess) – Client for accessing SeqRepo data
mane_refseq_genomic_path (Optional[Path]) – Path to MANE RefSeq Genomic GFF data
from_local (bool) – If True, don't check for or acquire the latest version; just provide the most recent locally available file if possible, and raise an error otherwise
- get_grch38_mane_gene_cds_overlap(start, end, chromosome=None, identifier=None, coordinate_type=CoordinateType.RESIDUE)
Given GRCh38 genomic data, find the overlapping MANE features (gene and cds). The genomic data is specified as a sequence location by chromosome, start, end. All CDS regions with which the input sequence location has nonzero base pair overlap will be returned.
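The "nonzero base pair overlap" rule can be illustrated with a small, self-contained sketch. It does not call cool_seq_tool and is not the library's implementation: the `CDS_REGIONS` table, the `find_cds_overlap` helper, the gene names, and the coordinates are all hypothetical, and intervals are assumed to be 0-based and half-open.

```python
from collections import defaultdict

# Hypothetical CDS records: (gene, chromosome, start, end).
# Assumption: coordinates are 0-based, half-open intervals.
CDS_REGIONS = [
    ("GENE_A", "1", 100, 200),
    ("GENE_A", "1", 300, 400),
    ("GENE_B", "1", 350, 500),
    ("GENE_C", "2", 100, 200),
]


def find_cds_overlap(chromosome, start, end):
    """Return {gene: [(overlap_start, overlap_end), ...]}, or None if nothing overlaps."""
    overlaps = defaultdict(list)
    for gene, chrom, cds_start, cds_end in CDS_REGIONS:
        if chrom != chromosome:
            continue
        # The shared region is bounded by the later start and the earlier end.
        ov_start = max(start, cds_start)
        ov_end = min(end, cds_end)
        if ov_start < ov_end:  # at least one shared base pair
            overlaps[gene].append((ov_start, ov_end))
    return dict(overlaps) or None


result = find_cds_overlap("1", 150, 360)
touching_only = find_cds_overlap("1", 200, 300)
```

Here `result` is keyed by the two genes on chromosome 1 whose CDS regions intersect the query, while `touching_only` is `None` because the query only abuts the neighbouring regions without sharing a base pair.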
- Parameters:
start (int) – GRCh38 start position
end (int) – GRCh38 end position
chromosome (Optional[str]) – Chromosome. 1..22, X, or Y. If not provided, must provide identifier. If both chromosome and identifier are provided, chromosome will be used.
identifier (Optional[str]) – Genomic identifier on GRCh38 assembly. If not provided, must provide chromosome. If both chromosome and identifier are provided, chromosome will be used.
coordinate_type (CoordinateType) – Coordinate type for start and end
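How the coordinate type changes the meaning of start and end can be sketched as follows. This is a generic illustration of the usual residue versus inter-residue conventions, not the library's code: `CoordKind` is a hypothetical stand-in enum, `to_inter_residue` and `span_length` are hypothetical helpers, and the assumption is that residue coordinates are 1-based and inclusive while inter-residue coordinates are 0-based and count the positions between bases.

```python
from enum import Enum


class CoordKind(Enum):
    """Hypothetical stand-in for a coordinate-type enum; not the library's class."""

    RESIDUE = "residue"              # assumed 1-based, both ends inclusive
    INTER_RESIDUE = "inter-residue"  # assumed 0-based, positions between bases


def to_inter_residue(start, end, kind):
    """Normalize a (start, end) pair to inter-residue coordinates."""
    if kind is CoordKind.RESIDUE:
        # Only the start shifts: inclusive base N begins at inter-residue N - 1.
        return start - 1, end
    return start, end


def span_length(start, end, kind):
    """Number of bases covered by the span under the given convention."""
    norm_start, norm_end = to_inter_residue(start, end, kind)
    return norm_end - norm_start


residue_span = to_inter_residue(10, 12, CoordKind.RESIDUE)
inter_residue_span = to_inter_residue(9, 12, CoordKind.INTER_RESIDUE)
```

Under these assumptions both calls describe the same three bases, which is why the coordinate type has to be stated alongside start and end.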
- Raises:
FeatureOverlapError – If required fields are missing or the associated GA4GH identifier cannot be found
- Return type:
Optional[dict[str, list[CdsOverlap]]]
- Returns:
MANE feature (gene/cds) overlap data represented as a dict. The dictionary will be keyed by genes which overlap the input sequence location. Each gene contains a list of the overlapping CDS regions with the beginning and end of the input sequence location’s overlap with each