Usage
Cool-Seq-Tool provides easy access to, and useful operations on, a selection of important genomic resources. Modules are divided into three groups:
Data sources, for basic acquisition and setup for a data source via Python
Data handlers, for additional operations on top of existing sources
Data mappers, for functions that incorporate multiple sources/handlers to produce output
The core CoolSeqTool class encapsulates all of their functions and can be used for easy initialization and access:
>>> from cool_seq_tool.app import CoolSeqTool
>>> cst = CoolSeqTool()
>>> cst.seqrepo_access.translate_alias("NM_002529.3")[0][-1]
'ga4gh:SQ.RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA'
>>> cst.transcript_mappings.ensembl_protein_for_gene_symbol["BRAF"][0]
'ENSP00000419060'
>>> await cst.uta_db.get_ac_from_gene("BRAF")
['NC_000007.14', 'NC_000007.13']
Descriptions and examples of functions can be found in the API Reference section.
Note
Many component classes in CoolSeqTool, including UtaDatabase, ExonGenomicCoordsMapper, and ManeTranscript, define public methods as async. Calling one of these methods returns a coroutine rather than a result, so inside another async function the call must be awaited:
from cool_seq_tool.app import CoolSeqTool

async def do_thing():
    mane_mapper = CoolSeqTool().mane_transcript
    result = mane_mapper.g_to_grch38("NC_000001.11", 100, 200)
    print(type(result))
    # <class 'coroutine'>
    awaited_result = await result
    print(awaited_result)
    # {'ac': 'NC_000001.11', 'pos': (100, 200)}
In a REPL, asyncio.run() can be used to call coroutines outside of functions. Many of our docstring examples will use this pattern.
>>> import asyncio
>>> from cool_seq_tool.app import CoolSeqTool
>>> mane_mapper = CoolSeqTool().mane_transcript
>>> result = asyncio.run(mane_mapper.g_to_grch38("NC_000001.11", 100, 200))
>>> print(result)
{'ac': 'NC_000001.11', 'pos': (100, 200)}
See the asyncio module documentation for more information.
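When several independent lookups are needed, the same coroutine pattern allows them to be awaited together instead of one at a time. The following is a minimal sketch of that idea using only the standard library. `FakeMapper` and its `lookup` method are hypothetical stand-ins, not part of Cool-Seq-Tool, and whether a given Cool-Seq-Tool component can safely serve concurrent calls is an assumption that should be checked against the API Reference.

```python
import asyncio


class FakeMapper:
    """Hypothetical stand-in for an async Cool-Seq-Tool component."""

    async def lookup(self, ac: str, start: int, end: int) -> dict:
        # Simulate a non-blocking round trip to a data source.
        await asyncio.sleep(0.01)
        return {"ac": ac, "pos": (start, end)}


async def lookup_many(mapper: FakeMapper, queries: list) -> list:
    # gather() runs the coroutines concurrently and returns their
    # results in the same order as the inputs.
    return await asyncio.gather(*(mapper.lookup(*q) for q in queries))


queries = [("NC_000001.11", 100, 200), ("NC_000007.14", 5, 10)]
results = asyncio.run(lookup_many(FakeMapper(), queries))
print(results)
```

Because `asyncio.gather()` preserves input order, each entry in `results` lines up with the query at the same index.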
REST server
Core Cool-Seq-Tool functions can also be performed via a REST HTTP interface, provided via FastAPI. Use the uvicorn shell command to start a server instance:
uvicorn cool_seq_tool.api:app
By default, uvicorn serves on port 8000. Once the server has started, go to http://localhost:8000/cool_seq_tool in a web browser for OpenAPI docs describing the available endpoints.
REST routes are defined using the FastAPI APIRouter class, meaning that they can also be mounted to other FastAPI applications:
from fastapi import FastAPI
from cool_seq_tool.routers import mane
app = FastAPI()
app.include_router(mane.router)
Environment configuration
Individual classes will accept arguments upon initialization to set parameters regarding data sources. In general, these parameters are also configurable via environment variables, e.g. in a cloud deployment.
| Variable | Description |
|---|---|
| | Path to LRG_RefSeqGene file. Used in |
| | Path to transcript mapping file generated from Ensembl BioMart. Used in |
| | Path to MANE Summary file. Used in |
| | Path to SeqRepo directory (i.e. contains |
| | A libpq connection string, i.e. of the form |
| | A path to a chainfile for lifting from GRCh37 to GRCh38. Used by the |
| | A path to a chainfile for lifting from GRCh38 to GRCh37. Used by the |
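As an illustration of how an initialization argument and an environment variable can configure the same parameter, here is a minimal sketch of one possible resolution order: explicit argument first, then environment variable, then a default. This order is an assumption for the example, not a statement of how Cool-Seq-Tool resolves its settings. The function `resolve_data_path`, the variable name `EXAMPLE_DATA_PATH`, and the file paths are all hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_data_path(explicit: Optional[str], env_var: str, default: str) -> Path:
    """Pick a path from an argument, then an environment variable, then a default."""
    if explicit is not None:
        return Path(explicit)
    # Fall back to the environment, and finally to the built-in default.
    return Path(os.environ.get(env_var, default))


# EXAMPLE_DATA_PATH is a made-up variable name used only for this sketch.
os.environ["EXAMPLE_DATA_PATH"] = "/data/from_env.tsv"
from_env = resolve_data_path(None, "EXAMPLE_DATA_PATH", "/data/default.tsv")
from_arg = resolve_data_path("/data/from_arg.tsv", "EXAMPLE_DATA_PATH", "/data/default.tsv")

del os.environ["EXAMPLE_DATA_PATH"]
from_default = resolve_data_path(None, "EXAMPLE_DATA_PATH", "/data/default.tsv")
```

In a cloud deployment, this kind of lookup lets the same code run unchanged while the deployment environment supplies the file locations.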
Schema support
Many genomic data objects produced by Cool-Seq-Tool are structured in conformance with the Variation Representation Specification, courtesy of the VRS-Python library.