[ensembl-dev] Single-exon transcript problem in ensembl-analysis GeneBuilder Runnable

Alison Lee alacrity8 at gmail.com
Thu Nov 25 01:38:17 GMT 2010


Hi,

I have been inspecting the genes obtained from an in-house genebuild
and found an odd case: for one gene (TLR8), a full-length single-exon
transcript was rejected in favor of partial multi-exon transcripts. I
have traced the problem to the prune_Transcripts method of the
GeneBuilder Runnable in ensembl-analysis.

This is what I understand so far from the code; correct me if I am wrong:

For genes that have a combination of single-exon and multi-exon transcripts:

1. By default every transcript is marked as "already seen", because of
"$found = 1".
2. A single-exon transcript never enters the EXONS loop, so $found
remains 1 and its exon is assumed to be present in another transcript.
3. A single-exon transcript is therefore always marked as rejected,
regardless of whether it is longer than the other transcripts, because
of "($found == 1 && $#exons == 0) {$single_exon_rejects{$tran} =
$tran;}"  // possible bug?
4. As a result, the single-exon transcript never makes it past
prune_Transcripts.

If the above is indeed a bug, is there a solution to this?

Alison



