cool_seq_tool.app
Provides core CoolSeqTool class, which non-redundantly initializes all Cool-Seq-Tool data handler and mapping resources for straightforward access.
- class cool_seq_tool.app.CoolSeqTool(transcript_file_path=None, lrg_refseqgene_path=None, mane_data_path=None, uta_connection_pool=None, sr=None, force_local_files=False)
Non-redundantly initialize all Cool-Seq-Tool data resources, available under the following attribute names:
self.seqrepo_access: SeqRepoAccess
self.transcript_mappings: TranscriptMappings
self.mane_transcript_mappings: ManeTranscriptMappings
self.uta_db: UtaDatabase
self.alignment_mapper: AlignmentMapper
self.liftover: LiftOver
self.mane_transcript: ManeTranscript
self.ex_g_coords_mapper: ExonGenomicCoordsMapper
- __init__(transcript_file_path=None, lrg_refseqgene_path=None, mane_data_path=None, uta_connection_pool=None, sr=None, force_local_files=False)
Initialize CoolSeqTool class.
Initialization with default resource locations is straightforward:
>>> from cool_seq_tool import CoolSeqTool
>>> cst = CoolSeqTool()
By default, this will attempt to fetch the latest versions of static resources, which means brief FTP and HTTPS requests to NCBI servers upon initialization. To suppress this check and simply rely on the most recent locally-available data:
>>> cst = CoolSeqTool(force_local_files=True)
Note that this will raise a FileNotFoundError if no locally-available data exists.
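The FileNotFoundError behavior noted above suggests a simple fallback pattern: try local-only initialization first, and permit the network check only if no local data is found. The following is a minimal sketch, not part of Cool-Seq-Tool itself. The helper names `init_with_local_fallback` and `make_cool_seq_tool` are hypothetical, and the sketch assumes the FileNotFoundError is raised while the constructor runs.

```python
from typing import Any, Callable


def init_with_local_fallback(factory: Callable[..., Any]) -> Any:
    """Call ``factory`` with local files only, falling back to the default check.

    ``factory`` is any callable that accepts a ``force_local_files`` keyword,
    such as the CoolSeqTool class. This helper is hypothetical.
    """
    try:
        # First attempt: rely only on data that is already on disk.
        return factory(force_local_files=True)
    except FileNotFoundError:
        # No local data was found, so allow the default fetch of the latest files.
        return factory(force_local_files=False)


def make_cool_seq_tool() -> Any:
    """Build a CoolSeqTool instance using the fallback above."""
    # Imported lazily so the helper above can be exercised without the package.
    from cool_seq_tool import CoolSeqTool

    return init_with_local_fallback(CoolSeqTool)
```

Passing the constructor in as an argument keeps the fallback logic independent of the library, so it can be checked with a stub in place of CoolSeqTool.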
Paths to the static data files can also be passed explicitly to skip these checks:
>>> from pathlib import Path
>>> cst = CoolSeqTool(
...     lrg_refseqgene_path=Path("lrg_refseqgene_20240625.tsv"),
...     mane_data_path=Path("ncbi_mane_summary_1.3.txt"),
... )
If explicit arguments are not passed, these locations can also be set via environment variables. See the configuration section of the docs for more information.
- Parameters:
transcript_file_path (Optional[Path]) – The path to transcript_mapping.tsv
lrg_refseqgene_path (Optional[Path]) – The path to the LRG_RefSeqGene file
mane_data_path (Optional[Path]) – Path to RefSeq MANE summary data
uta_connection_pool (Optional[AsyncConnectionPool]) – psycopg connection pool to UTA instance. If not provided, a lazy UTA connection will be used, meaning the connection won’t be initiated until the first attempted UTA query, and will use environment configs/library defaults
sr (Optional[SeqRepo]) – SeqRepo instance. If this is not provided, a new instance will be created
force_local_files (bool) – if True, don’t check for or try to acquire the latest versions of static data files; just use the most recently available, if any
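As a sketch of how the explicit path parameters might be filled in from a local directory, the helpers below pick the newest matching file for each resource and build keyword arguments from whatever is found. The function names, the directory layout, and the glob patterns are assumptions, modeled only on the example file names shown above (`lrg_refseqgene_20240625.tsv`, `ncbi_mane_summary_1.3.txt`). They are not part of the Cool-Seq-Tool API.

```python
from pathlib import Path
from typing import Dict, Optional


def newest_match(data_dir: Path, pattern: str) -> Optional[Path]:
    """Return the lexicographically last file matching ``pattern``, if any.

    Lexicographic order matches date order for YYYYMMDD suffixes. It is only
    an approximation for dotted version numbers (for example, 1.10 sorts
    before 1.3), so treat the result as a best guess.
    """
    matches = sorted(data_dir.glob(pattern))
    return matches[-1] if matches else None


def explicit_path_kwargs(data_dir: Path) -> Dict[str, Path]:
    """Build keyword arguments for the path parameters that have local files."""
    candidates = {
        "lrg_refseqgene_path": newest_match(data_dir, "lrg_refseqgene_*.tsv"),
        "mane_data_path": newest_match(data_dir, "ncbi_mane_summary_*.txt"),
    }
    # Omit missing resources so those parameters keep their default of None.
    return {name: path for name, path in candidates.items() if path is not None}
```

With the package installed, the result could be unpacked into the constructor, for example `CoolSeqTool(**explicit_path_kwargs(Path("data")))`, where `data` is a hypothetical directory name.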