cool_seq_tool.mappers.alignment
Module containing alignment methods for translating to and from different reference sequences.
- class cool_seq_tool.mappers.alignment.AlignmentMapper(seqrepo_access, transcript_mappings, uta_db)
Class for translating between reference sequences in the direction protein (p) -> cDNA (c) -> genomic (g).
- __init__(seqrepo_access, transcript_mappings, uta_db)
Initialize the AlignmentMapper class.
- Parameters:
seqrepo_access (SeqRepoAccess) – Access to seqrepo queries
transcript_mappings (TranscriptMappings) – Access to transcript accession mappings and conversions
uta_db (UtaDatabase) – UtaDatabase instance used to query the UTA database
- async c_to_g(c_ac, c_start_pos, c_end_pos, cds_start=None, coordinate_type=CoordinateType.RESIDUE, target_genome_assembly=Assembly.GRCH38)
Translate cDNA representation to genomic representation.
- Parameters:
c_ac (str) – cDNA RefSeq accession
c_start_pos (int) – cDNA start position for codon
c_end_pos (int) – cDNA end position for codon
cds_start – Coding start site. If not provided, this will be computed.
coordinate_type (CoordinateType) – Coordinate type for c_start_pos and c_end_pos
target_genome_assembly (Assembly) – Genome assembly to get genomic data for
- Return type:
tuple[Optional[dict], Optional[str]]
- Returns:
Tuple containing:
Genomic representation (ac, positions) if able to translate, else None. Positions are returned as inter-residue coordinates.
Warning if unable to translate to genomic representation, else None.
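The role of cds_start can be illustrated with a small sketch. It assumes that cds_start is the 0-based offset of the coding start site within the transcript, which this page does not state, and the function name and the offset value of 100 are hypothetical. It is not the library's implementation, only the arithmetic for moving CDS-relative cDNA positions onto transcript coordinates before any genomic alignment is looked up.

```python
def cdna_to_transcript_range(
    c_start: int, c_end: int, cds_start: int, residue: bool = True
) -> tuple[int, int]:
    """Shift CDS-relative cDNA positions onto transcript coordinates.

    Assumption: cds_start is the 0-based offset of the coding start site
    within the transcript. Returns an inter-residue (start, end) pair.
    """
    if residue:
        # A 1-based residue start becomes a 0-based inter-residue start;
        # the end value is the same in both systems.
        c_start -= 1
    return c_start + cds_start, c_end + cds_start


# Hypothetical offset of 100 bases before the coding start site.
print(cdna_to_transcript_range(1798, 1800, cds_start=100))  # (1897, 1900)
```

Passing residue=False with (1797, 1800) gives the same result, since those are the inter-residue equivalents of residue positions 1798 to 1800.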
- async p_to_c(p_ac, p_start_pos, p_end_pos, coordinate_type=CoordinateType.RESIDUE)
Translate protein representation to cDNA representation.
- Parameters:
p_ac (str) – Protein RefSeq accession
p_start_pos (int) – Protein start position
p_end_pos (int) – Protein end position
coordinate_type (CoordinateType) – Coordinate type for p_start_pos and p_end_pos
- Return type:
tuple[Optional[dict], Optional[str]]
- Returns:
Tuple containing:
cDNA representation (accession, codon range positions for corresponding change, cds start site) if able to translate, else None. Positions are returned as inter-residue coordinates.
Warning if unable to translate to cDNA representation, else None.
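The "codon range positions" mentioned above follow from each protein residue corresponding to one three-base codon. A minimal sketch of that arithmetic, using a hypothetical helper name and not taken from the library's source, might look like this:

```python
def protein_to_codon_range(
    p_start: int, p_end: int, residue: bool = True
) -> tuple[int, int]:
    """Map a protein position range to the cDNA codon range it spans.

    Returns an inter-residue (start, end) pair relative to the coding
    start site. This is an illustrative helper, not part of cool_seq_tool.
    """
    if residue:
        # Convert a 1-based residue start to a 0-based inter-residue start.
        p_start -= 1
    # One amino acid corresponds to three cDNA bases.
    return p_start * 3, p_end * 3


# Residue 600 spans cDNA bases 1798 to 1800, i.e. inter-residue (1797, 1800).
print(protein_to_codon_range(600, 600))  # (1797, 1800)
```

The same range results from the inter-residue input (599, 600) with residue=False.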
- async p_to_g(p_ac, p_start_pos, p_end_pos, coordinate_type=CoordinateType.INTER_RESIDUE, target_genome_assembly=Assembly.GRCH38)
Translate protein representation to genomic representation, by way of intermediary conversion into cDNA coordinates.
- Parameters:
p_ac (str) – Protein RefSeq accession
p_start_pos (int) – Protein start position
p_end_pos (int) – Protein end position
coordinate_type (CoordinateType) – Coordinate type for p_start_pos and p_end_pos
target_genome_assembly (Assembly) – Genome assembly to get genomic data for
- Return type:
tuple[Optional[dict], Optional[str]]
- Returns:
Tuple containing:
Genomic representation (ac, positions) if able to translate, else None. Positions are returned as inter-residue coordinates.
Warnings if conversion to cDNA or genomic coordinates fails, else None.
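The two-stage conversion and its (result, warning) return shape can be sketched as a chain of async steps that stops at the first failure. Everything below is a stand-in: the chain helper, the stub steps, and the "pos" dictionary key are hypothetical and do not reflect the library's internals or the keys of its returned dictionaries.

```python
import asyncio
from typing import Awaitable, Callable, Optional

StepResult = tuple[Optional[dict], Optional[str]]
Step = Callable[[dict], Awaitable[StepResult]]


async def chain(steps: list[Step], data: dict) -> StepResult:
    """Run steps in order, stopping at the first one that returns no data."""
    warning: Optional[str] = None
    for step in steps:
        data, warning = await step(data)
        if data is None:
            # Propagate the warning from the failing step.
            return None, warning
    return data, warning


async def stub_p_to_c(data: dict) -> StepResult:
    # Hypothetical stand-in for the protein -> cDNA step.
    start, end = data["pos"]
    return {"pos": (start * 3, end * 3)}, None


async def stub_failing_c_to_g(data: dict) -> StepResult:
    # Hypothetical stand-in for a cDNA -> genomic step that cannot translate.
    return None, "Unable to translate cDNA positions to genomic positions"


ok = asyncio.run(chain([stub_p_to_c], {"pos": (599, 600)}))
failed = asyncio.run(chain([stub_p_to_c, stub_failing_c_to_g], {"pos": (599, 600)}))
print(ok)      # ({'pos': (1797, 1800)}, None)
print(failed)  # (None, 'Unable to translate cDNA positions to genomic positions')
```

A caller of the real methods would follow the same pattern: await the coroutine, unpack the tuple, and check the first element for None before using it.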