[ensembl-dev] CDS Incomplete status

Will McLaren wm2 at ebi.ac.uk
Tue May 13 10:00:03 BST 2014


Hi Konrad,

Correct, these attributes are not found on the transcripts in the VEP cache.

The transcript objects are stripped down to a "need to know" state, such
that only the attributes required for calculating the consequences (and a
few key properties such as biotype) remain.

Some things are cached on the transcript under the catchily named:

$tr->{_variation_effect_feature_cache}

There are further details here:

http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#technical

I can look at adding in those particular attributes for the next VEP
release.

I'm assuming you're writing a plugin or similar? In the meantime, you could
trick the transcript into loading them from the database (assuming you're
not using --offline) with something like the code below.

Hope that helps

Will McLaren
Ensembl Variation

my $tr = $tva->transcript();

# delete existing cache
delete $tr->{attributes};

# copy adaptor from config
$tr->{adaptor} = $self->{config}->{ta};

# get attributes from DB
my @attributes = @{$tr->get_all_Attributes()};
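
Once the attributes are loaded, the incomplete-CDS flags mentioned earlier in the thread (cds_start_NF, cds_end_NF) can be picked out by attribute code. A minimal sketch, assuming each element of @attributes has a code() method, as Bio::EnsEMBL::Attribute objects do; incomplete_cds_flags is a hypothetical helper name, not part of the Ensembl API:

```perl
use strict;
use warnings;

# Return the sorted, de-duplicated incomplete-CDS flags found in a list
# of attribute objects. Assumption: every object passed in responds to
# code(), which is true of Bio::EnsEMBL::Attribute.
sub incomplete_cds_flags {
    my @attributes = @_;

    # The two attribute codes that mark an incomplete CDS
    my %wanted = map { $_ => 1 } qw(cds_start_NF cds_end_NF);

    my %seen;
    $seen{ $_->code } = 1 for grep { $wanted{ $_->code } } @attributes;

    my @flags = sort keys %seen;
    return @flags;
}

# In a plugin, following on from the snippet above:
#   my @flags = incomplete_cds_flags(@attributes);
#   my $cds_incomplete = @flags ? 1 : 0;
```

An empty list means neither flag is set on the transcript.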


On 12 May 2014 21:22, Konrad Karczewski <konradk at broadinstitute.org> wrote:

> Hello!
>
> Thanks! This has helped quite a bit, though at an understandable cost of
> performance.
>
> In particular, the attributes do not seem to be in the VEP cache - are
> there any plans to add these? It seems like they shouldn't take up too much
> space in the scheme of things.
>
> -Konrad
>
> On Mon, Apr 28, 2014 at 8:27 AM, mag <mr6 at ebi.ac.uk> wrote:
>
>>  Hi Konrad,
>>
>> The transcript object supports a method called 'translateable_seq'.
>> This will return the translateable part of a transcript.
>> You can then apply the length method on the returned string.
>>
>> The CDS incomplete status is only available for manually annotated genes
>> (coming from Havana).
>> They know there is a protein-coding transcript, but the evidence
>> available does not allow for full annotation of the sequence.
>> These transcripts are flagged with the attribute 'cds_start_NF' or
>> 'cds_end_NF'.
>> You can retrieve this for each transcript using
>> $transcript->get_all_Attributes('cds_start_NF')
>> or retrieve all the transcripts for a given attribute using
>> $attribute_adaptor->fetch_all_by_Transcript(undef, 'cds_start_NF')
>>
>>
>> Hope that helps,
>> Magali
>>
>>
>> On 24/04/2014 21:47, Konrad Karczewski wrote:
>>
>> Hello!
>>
>>  I've been using the API to get the length of the coding portion of a
>> transcript and I think I figured out the best way is:
>>
>>  my $transcript_cds_length =
>> $transcript_variation->transcript->cdna_coding_end -
>> $transcript_variation->transcript->cdna_coding_start + 1;
>>
>>  However, for some transcripts, this number is not a multiple of 3
>> (though it is approximately correct - within 2 bp of the number of AA's of
>> the transcript * 3). It seems to happen when there is a "CDS Incomplete"
>> status on the transcript (e.g.
>> http://www.ensembl.org/Homo_sapiens/Transcript/Summary?g=ENSG00000162458;r=1:16084441-16091522;t=ENST00000510929).
>> I would have thought if the CDS were incomplete then "cdna_coding_end" or
>> "cdna_coding_start" would be undefined - is there another way to get the
>> CDS Incomplete status?
>>
>>  And side note, is this the best way to get transcript length? I
>> couldn't seem to find a direct reference to the length.
>>
>>   Thanks!
>> -Konrad
>>
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>