cool_seq_tool.mappers.alignment
Module containing alignment methods for translating to and from different reference sequences.
- class cool_seq_tool.mappers.alignment.AlignmentMapper(seqrepo_access, transcript_mappings, uta_db)
Class for translating from protein (p) to cDNA (c) to genomic (g) reference sequences.
- __init__(seqrepo_access, transcript_mappings, uta_db)
Initialize the AlignmentMapper class.
- Parameters:
  - seqrepo_access (SeqRepoAccess) – Access to seqrepo queries
  - transcript_mappings (TranscriptMappings) – Access to transcript accession mappings and conversions
  - uta_db (UtaDatabase) – UtaDatabase instance used to query the UTA database
- async c_to_g(c_ac, c_start_pos, c_end_pos, cds_start=None, residue_mode=ResidueMode.RESIDUE, target_genome_assembly=Assembly.GRCH38)
Translate cDNA representation to genomic representation.
- Parameters:
  - c_ac (str) – cDNA RefSeq accession
  - c_start_pos (int) – cDNA start position for codon
  - c_end_pos (int) – cDNA end position for codon
  - cds_start – Coding start site. If not provided, this will be computed.
  - residue_mode (ResidueMode) – Residue mode for c_start_pos and c_end_pos
  - target_genome_assembly (Assembly) – Genome assembly to get genomic data for
- Return type:
  Tuple[Optional[Dict], Optional[str]]
- Returns:
  Tuple containing:
  - Genomic representation (ac, positions) if able to translate. Will return positions as inter-residue coordinates. Else None.
  - Warning, if unable to translate to genomic representation. Else None.
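The residue and inter-residue coordinate systems that `residue_mode` and the return value refer to can be illustrated with a minimal sketch. It assumes the common convention that residue coordinates are 1-based and inclusive, while inter-residue coordinates are 0-based and count the boundaries between residues. The helper `to_inter_residue` is hypothetical and is not part of cool_seq_tool.

```python
from typing import Tuple


def to_inter_residue(start: int, end: int, is_residue: bool) -> Tuple[int, int]:
    """Return (start, end) expressed as inter-residue coordinates.

    Hypothetical helper: assumes residue input is 1-based and inclusive.
    """
    if is_residue:
        # Only the start shifts: residues 1..3 span boundaries 0..3.
        return start - 1, end
    # Already inter-residue, so nothing to convert.
    return start, end


# Residues 1 through 3 occupy the inter-residue span (0, 3).
print(to_inter_residue(1, 3, is_residue=True))
```

Under this convention the length of a span is simply `end - start`, which is one reason results are often reported as inter-residue positions.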
- async p_to_c(p_ac, p_start_pos, p_end_pos, residue_mode=ResidueMode.RESIDUE)
Translate protein representation to cDNA representation.
- Parameters:
  - p_ac (str) – Protein RefSeq accession
  - p_start_pos (int) – Protein start position
  - p_end_pos (int) – Protein end position
  - residue_mode (ResidueMode) – Residue mode for p_start_pos and p_end_pos
- Return type:
  Tuple[Optional[Dict], Optional[str]]
- Returns:
  Tuple containing:
  - cDNA representation (accession, codon range positions for corresponding change, cds start site) if able to translate. Will return positions as inter-residue coordinates. If unable to translate, returns None.
  - Warning, if unable to translate to cDNA representation. Else None.
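The "codon range positions" in the return value follow from the fact that each amino acid is encoded by three bases. The arithmetic can be sketched as below, assuming 1-based inclusive protein positions as input and inter-residue positions relative to the coding start as output. The function `protein_to_codon_range` is a hypothetical illustration, not necessarily how the library computes the range.

```python
from typing import Tuple


def protein_to_codon_range(p_start_pos: int, p_end_pos: int) -> Tuple[int, int]:
    """Map 1-based inclusive protein positions to an inter-residue codon range.

    Hypothetical helper: positions are relative to the start of the coding
    sequence, so no cds start offset is applied here.
    """
    # Residue n is encoded by coding bases 3n-2 .. 3n (1-based), which is
    # the inter-residue span (3 * (n - 1), 3 * n).
    c_start = (p_start_pos - 1) * 3
    c_end = p_end_pos * 3
    return c_start, c_end


# Residue 600 maps to the three bases between boundaries 1797 and 1800.
print(protein_to_codon_range(600, 600))
```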
- async p_to_g(p_ac, p_start_pos, p_end_pos, residue_mode=ResidueMode.INTER_RESIDUE, target_genome_assembly=Assembly.GRCH38)
Translate protein representation to genomic representation, by way of intermediary conversion into cDNA coordinates.
- Parameters:
  - p_ac (str) – Protein RefSeq accession
  - p_start_pos (int) – Protein start position
  - p_end_pos (int) – Protein end position
  - residue_mode (ResidueMode) – Residue mode for p_start_pos and p_end_pos
  - target_genome_assembly (Assembly) – Genome assembly to get genomic data for
- Return type:
  Tuple[Optional[Dict], Optional[str]]
- Returns:
  Tuple containing:
  - Genomic representation (ac, positions) if able to translate. Will return positions as inter-residue coordinates. Else None.
  - Warnings, if conversion to cDNA or genomic coordinates fails.
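All three translation methods are coroutines that return a `(result, warning)` tuple, so callers need to await them and check which element is set. The sketch below shows one way to handle that shape. It uses a hypothetical stand-in, `stub_p_to_g`, rather than the real method, so that it runs without a UTA database. The dictionary keys and values are placeholders and are not the library's actual output.

```python
import asyncio
from typing import Dict, Optional, Tuple


async def stub_p_to_g(
    p_ac: str, p_start_pos: int, p_end_pos: int
) -> Tuple[Optional[Dict], Optional[str]]:
    """Hypothetical stand-in that mimics only the documented return shape."""
    if not p_ac.startswith("NP_"):
        # Failure: no result, warning message set.
        return None, f"Unable to translate {p_ac}"
    # Success: placeholder keys and values, not real genomic coordinates.
    return {"ac": "NC_PLACEHOLDER.1", "positions": (100, 103)}, None


async def describe(p_ac: str) -> str:
    """Await the translation and report either the result or the warning."""
    result, warning = await stub_p_to_g(p_ac, 600, 600)
    if result is None:
        return f"failed: {warning}"
    return f"ok: {result['ac']} {result['positions']}"


print(asyncio.run(describe("NP_004324.2")))
print(asyncio.run(describe("bad")))
```

Checking `result is None` before reading the dictionary keeps the failure path explicit, since a failed translation is reported through the warning string rather than an exception.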