Transcript Selection

One of the core uses of Cool-Seq-Tool is to acquire consensus-based, representative transcripts for use in genomic analysis. Here, we describe the selection process, implemented in the ManeTranscript class, for choosing the best available transcripts that are compatible with the requested data.

We rely heavily on transcripts annotated under the Matched Annotation from NCBI and EMBL-EBI (MANE) Transcripts project. For more information on the MANE project, see the NCBI MANE page.

Transcript compatibility

The following validation checks are performed to determine compatibility of a transcript position:

  • The position exists on an accession

  • The sequence matches the expected reference sequence for a given accession or position

  • Exon numbering matches known exon structure

A transcript that fails any one of these checks is discarded as incompatible.
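The filtering step above can be sketched as follows. This is a minimal illustration, not the ManeTranscript implementation: the Candidate type, its field names, and both functions are hypothetical, and each check is reduced to a precomputed boolean rather than a real lookup against sequence or exon data.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a transcript position under evaluation."""

    accession: str
    position_exists: bool    # the position exists on the accession
    sequence_matches: bool   # the sequence matches the expected reference
    exon_structure_ok: bool  # exon numbering matches known exon structure


def is_compatible(candidate: Candidate) -> bool:
    """Return True only when every validation check passes."""
    return all(
        (
            candidate.position_exists,
            candidate.sequence_matches,
            candidate.exon_structure_ok,
        )
    )


def filter_compatible(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Discard any candidate that fails at least one check."""
    return [c for c in candidates if is_compatible(c)]
```

Because the checks are combined with all(), a single failure is enough to drop a candidate, which mirrors the rule stated above.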

Representative transcript priority

All compatible transcripts are evaluated and ordered against the criteria below. The candidate transcript that meets the earliest criterion is chosen as the representative.

  1. Transcript is annotated as a MANE Select transcript

  2. Transcript is annotated as a MANE Plus Clinical transcript

  3. Transcript is the longest compatible remaining transcript

  4. Transcript is the first-published (lowest-numbered RefSeq/Ensembl accession) remaining transcript

Note

We always prefer the most recent version of a transcript associated with an assembly.
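One way to picture the priority order is as a sort key over the compatible transcripts. The sketch below is an assumption-laden illustration, not the library's code: the Transcript type, its fields, the status strings, and the helper functions are all hypothetical. It also assumes that "lowest-numbered" means the smallest integer formed from the digits of the unversioned accession, and that the version preference acts as a final tie-breaker, since the text does not say how the note interacts with the numbered criteria.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for a compatible transcript."""

    accession: str                     # unversioned accession, e.g. "NM_000100"
    version: int                       # accession version number
    length: int                        # transcript length
    mane_status: Optional[str] = None  # "MANE Select", "MANE Plus Clinical", or None


def accession_number(accession: str) -> int:
    """Numeric part of an accession (assumed meaning of 'lowest-numbered')."""
    digits = "".join(ch for ch in accession if ch.isdigit())
    return int(digits) if digits else 0


def priority_key(tx: Transcript) -> Tuple[int, int, int, int]:
    """Smaller keys sort first, so the best candidate has the minimum key."""
    if tx.mane_status == "MANE Select":
        tier = 0
    elif tx.mane_status == "MANE Plus Clinical":
        tier = 1
    else:
        tier = 2
    # Negate length and version so that longer and newer sort earlier.
    return (tier, -tx.length, accession_number(tx.accession), -tx.version)


def select_representative(transcripts: Iterable[Transcript]) -> Optional[Transcript]:
    """Return the highest-priority transcript, or None if there are none."""
    return min(transcripts, key=priority_key, default=None)
```

Encoding the criteria as a tuple keeps the ordering in one place: the tuple is compared element by element, so a later criterion only matters when all earlier ones tie.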