Usage
Cool-Seq-Tool provides easy access to, and useful operations on, a selection of important genomic resources. Modules are divided into three groups:
Data sources, for basic acquisition and setup for a data source via Python
Data handlers, for additional operations on top of existing sources
Data mappers, for functions that incorporate multiple sources/handlers to produce output
The core CoolSeqTool class encapsulates all of their functions and can be used for easy initialization and access:
>>> from cool_seq_tool.app import CoolSeqTool
>>> cst = CoolSeqTool()
>>> cst.seqrepo_access.translate_alias("NM_002529.3")[0][-1]
'ga4gh:SQ.RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA'
>>> cst.transcript_mappings.ensembl_protein_for_gene_symbol["BRAF"][0]
'ENSP00000419060'
>>> await cst.uta_db.get_ac_from_gene("BRAF")
['NC_000007.14', 'NC_000007.13']
Descriptions and examples of functions can be found in the API Reference section.
Note
Many component classes in CoolSeqTool, including UtaDatabase, ExonGenomicCoordsMapper, and ManeTranscript, define public methods as async. This means that, when used inside another function, they must be called with await:
from cool_seq_tool.app import CoolSeqTool

async def do_thing():
    mane_mapper = CoolSeqTool().mane_transcript
    result = mane_mapper.g_to_grch38("NC_000001.11", 100, 200)
    print(type(result))
    # <class 'coroutine'>
    awaited_result = await result
    print(awaited_result)
    # {'ac': 'NC_000001.11', 'pos': (100, 200)}
In a REPL, asyncio.run() can be used to call coroutines outside of functions. Many of our docstring examples will use this pattern.
>>> import asyncio
>>> from cool_seq_tool.app import CoolSeqTool
>>> mane_mapper = CoolSeqTool().mane_transcript
>>> result = asyncio.run(mane_mapper.g_to_grch38("NC_000001.11", 100, 200))
>>> print(result)
{'ac': 'NC_000001.11', 'pos': (100, 200)}
See the asyncio module documentation for more information.
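When several lookups are needed, they can be awaited together rather than one at a time. The following is a minimal sketch of that pattern using only the standard library. The fake_lookup coroutine is a hypothetical stand-in, not a Cool-Seq-Tool method; it only mimics the shape of the result shown above.

```python
import asyncio


async def fake_lookup(accession: str, start: int, end: int) -> dict:
    """Hypothetical stand-in for an async Cool-Seq-Tool method."""
    await asyncio.sleep(0)  # yield control, as a real database query would
    return {"ac": accession, "pos": (start, end)}


async def lookup_many() -> list:
    # gather() runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(
        fake_lookup("NC_000001.11", 100, 200),
        fake_lookup("NC_000007.14", 300, 400),
    )


# asyncio.run() drives the top-level coroutine to completion
results = asyncio.run(lookup_many())
print(results)
```

Grouping the calls in a single coroutine keeps to one asyncio.run() call, which creates and closes a fresh event loop each time it is invoked.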
REST server
Core Cool-Seq-Tool functions can also be performed via a REST HTTP interface, provided via FastAPI. Use the uvicorn shell command to start a server instance:
uvicorn cool_seq_tool.api:app
By default, uvicorn serves on port 8000. Once the server is running, go to http://localhost:8000/cool_seq_tool in a web browser for OpenAPI docs describing available endpoints.
REST routes are defined using the FastAPI APIRouter class, meaning that they can also be mounted to other FastAPI applications:
from fastapi import FastAPI
from cool_seq_tool.routers import mane
app = FastAPI()
app.include_router(mane.router)
Environment configuration
Individual classes will accept arguments upon initialization to set parameters regarding data sources. In general, these parameters are also configurable via environment variables, e.g. in a cloud deployment.
| Variable | Description |
|---|---|
| | Path to LRG_RefSeqGene file. Used in |
| | Path to transcript mapping file generated from Ensembl BioMart. Used in |
| | Path to MANE Summary file. Used in |
| | Path to SeqRepo directory (i.e. contains |
| | A libpq connection string, i.e. of the form |
| | A path to a chainfile for lifting from GRCh37 to GRCh38. Used by |
| | A path to a chainfile for lifting from GRCh38 to GRCh37. Used by |
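As a rough sketch of how an initialization argument and an environment variable might interact, the following uses a hypothetical variable name (EXAMPLE_DATA_PATH) and a hypothetical helper (resolve_data_path); neither is part of Cool-Seq-Tool. It assumes that an explicit argument takes precedence over the environment, which should be checked against the class in question.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical variable name, used only for this illustration
EXAMPLE_ENV_VAR = "EXAMPLE_DATA_PATH"


def resolve_data_path(
    explicit: Optional[str] = None, default: str = "data/example.tsv"
) -> Path:
    """Pick a path: explicit argument, then environment variable, then default."""
    if explicit is not None:
        return Path(explicit)
    # Assumed fallback order; real classes may resolve settings differently
    return Path(os.environ.get(EXAMPLE_ENV_VAR, default))


# Simulate a deployment that configures the path through the environment
os.environ[EXAMPLE_ENV_VAR] = "/tmp/from_env.tsv"
from_env = resolve_data_path()
from_arg = resolve_data_path("/tmp/from_arg.tsv")
print(from_env, from_arg)
```

In a cloud deployment the variable would normally be set by the container or service configuration rather than from within Python.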
Schema support
Many genomic data objects produced by Cool-Seq-Tool are structured in conformance with the Variation Representation Specification, courtesy of the VRS-Python library.