cool_seq_tool.sources.mane_transcript_mappings
Provide fast tabular access to MANE summary file. Enables retrieval of associated MANE transcripts for gene symbols, genomic positions, or transcript accessions.
- class cool_seq_tool.sources.mane_transcript_mappings.ManeTranscriptMappings(mane_data_path=None, from_local=False)
Provide fast tabular access to MANE summary file.
By default, acquires data from the NCBI FTP server if it is unavailable locally. The local data location can be passed as an argument or given under the environment variable MANE_SUMMARY_PATH.
See the NCBI MANE page for more information.
- __init__(mane_data_path=None, from_local=False)
Initialize the MANE Transcript mappings class.
- Parameters:
mane_data_path (Optional[Path]) – Path to RefSeq MANE summary data
from_local (bool) – if True, don’t check for or acquire latest version – just provide most recent locally available file, if possible, and raise error otherwise
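The lookup of a local data location described above can be sketched as follows. This is an illustration only, not the library's code: `resolve_mane_path` is a hypothetical helper, and the priority order (explicit argument first, then the `MANE_SUMMARY_PATH` environment variable) is an assumption.

```python
import os
from pathlib import Path
from typing import Optional

ENV_VAR = "MANE_SUMMARY_PATH"  # environment variable named in the docs above


def resolve_mane_path(mane_data_path: Optional[Path] = None) -> Optional[Path]:
    """Hypothetical helper: pick a local MANE summary path, if one is configured.

    Assumption: an explicit argument takes priority over the environment
    variable. Returns None when neither is set, which a caller could treat
    as "no local location configured".
    """
    if mane_data_path is not None:
        return Path(mane_data_path)
    env_value = os.environ.get(ENV_VAR)
    if env_value:
        return Path(env_value)
    return None
```

A caller could pass the result to `ManeTranscriptMappings(mane_data_path=...)`; when it is `None`, the class falls back to its default behavior described above.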
- get_gene_mane_data(gene_symbol)
Return MANE Transcript data for a gene.
>>> from cool_seq_tool.sources import ManeTranscriptMappings
>>> m = ManeTranscriptMappings()
>>> braf_mane = m.get_gene_mane_data("BRAF")
>>> braf_mane[0]["RefSeq_nuc"], braf_mane[0]["MANE_status"]
('NM_004333.6', 'MANE Select')
>>> braf_mane[1]["RefSeq_nuc"], braf_mane[1]["MANE_status"]
('NM_001374258.1', 'MANE Plus Clinical')
- Parameters:
gene_symbol (str) – HGNC Gene Symbol
- Return type:
list[dict]
- Returns:
List of MANE Transcript data (Transcript accessions, gene, and location information). The list is sorted so that a MANE Select entry comes first, followed by a MANE Plus Clinical entry, if available.
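The ordering described in the return value (MANE Select first, then MANE Plus Clinical) can be reproduced on plain dictionaries. The sketch below is not the library's implementation: `sort_by_mane_status` and `STATUS_RANK` are hypothetical names, and the records carry only the two keys shown in the BRAF example above.

```python
# Rank used for ordering; statuses not listed here sort last (an assumption).
STATUS_RANK = {"MANE Select": 0, "MANE Plus Clinical": 1}


def sort_by_mane_status(records: list[dict]) -> list[dict]:
    """Hypothetical helper: return records with MANE Select entries first."""
    return sorted(
        records,
        key=lambda r: STATUS_RANK.get(r["MANE_status"], len(STATUS_RANK)),
    )


# Accessions and statuses taken from the BRAF doctest above, in reverse order.
records = [
    {"RefSeq_nuc": "NM_001374258.1", "MANE_status": "MANE Plus Clinical"},
    {"RefSeq_nuc": "NM_004333.6", "MANE_status": "MANE Select"},
]
ordered = sort_by_mane_status(records)
print([r["RefSeq_nuc"] for r in ordered])  # ['NM_004333.6', 'NM_001374258.1']
```

Because `sorted` is stable, records that share a status keep their original relative order.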
- get_genomic_mane_genes(ac, start, end)
Get MANE gene(s) for a genomic location.
- Parameters:
ac (str) – RefSeq genomic accession
start (int) – Genomic start position. Assumes residue coordinates.
end (int) – Genomic end position. Assumes residue coordinates.
- Return type:
list[ManeGeneData]
- Returns:
Unique MANE gene(s) found for a genomic location
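To illustrate what "unique MANE gene(s) for a genomic location" can mean, here is a minimal sketch over toy rows. Everything in it is hypothetical: `GeneHit` is a stand-in and not the library's `ManeGeneData`, the row keys and coordinates are made up, and the matching rule (a row matches only if it fully contains the query interval) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneHit:
    """Hypothetical per-gene result; not the library's ManeGeneData."""

    symbol: str
    statuses: tuple


def genes_in_region(rows, ac, start, end):
    """Collect one entry per gene whose row on `ac` contains [start, end].

    Assumption: containment, not partial overlap, decides a match.
    """
    by_gene = {}
    for row in rows:
        if row["ac"] == ac and row["start"] <= start and end <= row["end"]:
            by_gene.setdefault(row["symbol"], []).append(row["status"])
    return [
        GeneHit(symbol, tuple(sorted(set(statuses))))
        for symbol, statuses in sorted(by_gene.items())
    ]


# Toy rows with made-up accessions, coordinates, and gene names.
rows = [
    {"symbol": "GENE_A", "status": "MANE Select", "ac": "NC_TOY.1", "start": 100, "end": 500},
    {"symbol": "GENE_A", "status": "MANE Plus Clinical", "ac": "NC_TOY.1", "start": 100, "end": 500},
    {"symbol": "GENE_B", "status": "MANE Select", "ac": "NC_TOY.1", "start": 450, "end": 900},
    {"symbol": "GENE_C", "status": "MANE Select", "ac": "NC_TOY.2", "start": 100, "end": 500},
]
hits = genes_in_region(rows, "NC_TOY.1", 460, 480)
print([h.symbol for h in hits])  # ['GENE_A', 'GENE_B']
```

GENE_A appears once even though two of its rows match, which is the de-duplication the return description refers to.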
- get_mane_data_from_chr_pos(alt_ac, start, end)
Get MANE data given a GRCh38 genomic position.
- Parameters:
alt_ac (str) – NC Accession
start (int) – Start genomic position. Assumes residue coordinates.
end (int) – End genomic position. Assumes residue coordinates.
- Return type:
list[dict]
- Returns:
List of MANE data, sorted with MANE Select first, then MANE Plus Clinical.
- get_mane_from_transcripts(transcripts)
Get MANE transcripts from a list of transcripts.
- Parameters:
transcripts (List[str]) – RefSeq transcripts on c. coordinates
- Return type:
list[dict]
- Returns:
MANE data