[ensembl-dev] Prediction of consequence type for novel variants

Pontus Larsson Pontus.Larsson at ebi.ac.uk
Tue Dec 14 15:17:24 GMT 2010


Hi Sung,

That would depend on the size of the deletion. A 1 bp deletion has start 
= end. Having start 1 less than end would mean a 2 bp deletion.
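
The two conventions discussed in this thread can be written down as a small
sketch. It is plain Python, not the Ensembl Perl API; the function names
deletion_coords and insertion_coords are made up for illustration, and
1-based inclusive coordinates are assumed.

```python
def deletion_coords(first_deleted, length):
    """Return (start, end) for a deletion of `length` bases.

    Coordinates are 1-based and inclusive, so a 1 bp deletion has
    start == end and a 2 bp deletion has end == start + 1.
    """
    if length < 1:
        raise ValueError("a deletion removes at least one base")
    return first_deleted, first_deleted + length - 1


def insertion_coords(base_before):
    """Return (start, end) for an insertion placed right after `base_before`.

    The inserted length and the strand play no role: start is always end + 1.
    """
    return base_before + 1, base_before


print(deletion_coords(1522, 1))   # (1522, 1522)
print(deletion_coords(1522, 2))   # (1522, 1523)
print(insertion_coords(1521))     # (1522, 1521)
```

The last call reproduces the start/end pair of the 2 bp insertion example
quoted further down (-start 1522, -end 1521).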

/Pontus


On 14/12/2010 15:10, Sung Gong wrote:
> Start 1 smaller than end for a deletion?
>
>
> On 14 December 2010 15:03, Will McLaren<wm2 at ebi.ac.uk>  wrote:
>> Hi Sung,
>>
>> The coordinates would be the same regardless of the strand.
>>
>> Start is _always_ 1 greater than end for an insertion, regardless of
>> strand or the size of the insertion.
>>
>> Will
>>
>> On 14 December 2010 14:58, Sung Gong<sung at bio.cc>  wrote:
>>> Hi Will,
>>>
>>> One more question about start/end positions in case of indels.
>>>
>>> In the API document
>>> (http://www.ensembl.org/info/docs/Pdoc/ensembl-variation/modules/Bio/EnsEMBL/Variation/VariationFeature.html),
>>> it says:
>>>     # Variation feature representing a 2bp insertion
>>>     $vf = Bio::EnsEMBL::Variation::VariationFeature->new
>>>        (-start   =>  1522,
>>>         -end     =>  1521, # end = start-1 for insert
>>>         -strand  =>  -1,
>>>         -slice   =>  $slice,
>>>         -allele_string =>  '-/AA',
>>>         -variation_name =>  'rs12111',
>>>         -map_weight  =>  1,
>>>         -variation =>  $v2);
>>>
>>> Is the example above only for the -1 strand?
>>> How should I set -start and -end in general?
>>>
>>> Cheers,
>>> Sung
>>>
>>> On 10 December 2010 11:41, Will McLaren<wm2 at ebi.ac.uk>  wrote:
>>>> Hi Sung
>>>>
>>>> The codons() method will work; it returns the codon something like:
>>>>
>>>> aGa/aCa
>>>>
>>>> where the base changed is in capital letters.
>>>>
>>>> Will
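
Given a string in the form described above, the position of the changed base
within the codon (0, 1 or 2) can be recovered with plain string handling. This
is a minimal Python sketch, not part of the Ensembl API; parse_codon_change is
a hypothetical helper, and it assumes exactly the "aGa/aCa" layout for a
single-base substitution.

```python
def parse_codon_change(codons):
    """Split 'aGa/aCa' into (ref_codon, alt_codon, position).

    `position` is the 0-based index of the capitalised base in the
    reference codon. Anything other than a single-base change between
    two 3-base codons is rejected.
    """
    ref, alt = codons.split("/")
    changed = [i for i, base in enumerate(ref) if base.isupper()]
    if len(ref) != 3 or len(alt) != 3 or len(changed) != 1:
        raise ValueError(f"not a single-base codon change: {codons!r}")
    return ref.upper(), alt.upper(), changed[0]


print(parse_codon_change("aGa/aCa"))  # ('AGA', 'ACA', 1)
```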
>>>>
>>>> On 10 December 2010 11:26, Sung Gong<sung at bio.cc>  wrote:
>>>>> Hi Will,
>>>>>
>>>>> Thanks for the paper. I appreciate your work.
>>>>>
>>>>> Before I was aware of your script, I used the core API to get the
>>>>> corresponding codon and the position (0, 1 or 2) at which a single
>>>>> DNA variant occurs.
>>>>> Is there a way to get the same information now?
>>>>>
>>>>> I found a 'codons' method in 'TranscriptVariation', but is it actually
>>>>> a method of ConsequenceType?
>>>>>
>>>>> I thought it better to ask you before going further.
>>>>>
>>>>> Cheers,
>>>>> Sung
>>>>>
>>>>> On 9 December 2010 14:02, Will McLaren<wm2 at ebi.ac.uk>  wrote:
>>>>>> Hi Sung,
>>>>>>
>>>>>> There is a publication referring to the system, but it does not go
>>>>>> into great detail on the internal workings:
>>>>>>
>>>>>> http://bioinformatics.oxfordjournals.org/content/26/16/2069.abstract
>>>>>>
>>>>>> Here's an approximate flow of what happens in the API. The vast
>>>>>> majority of the code used is in the Core module
>>>>>> Bio::EnsEMBL::Utils::TranscriptAlleles.pm, mainly the methods
>>>>>> type_variation() and apply_aa_change():
>>>>>>
>>>>>> - find overlapping transcripts (using $vf->feature_Slice and
>>>>>> $slice->get_all_Transcripts), then for each transcript:
>>>>>>
>>>>>> - get transcript mapper and map variation's coordinates to cDNA, CDS and peptide
>>>>>>
>>>>>> - any variants that don't fall in the coding sequence are classified
>>>>>> here (e.g. INTRONIC, UPSTREAM) and the flow ends
>>>>>>
>>>>>> - if the variation falls in a coding exon (i.e. has defined CDS
>>>>>> coordinates), generate alternative codon(s) and resulting translation
>>>>>>
>>>>>> - compare translation to reference; classify as e.g.
>>>>>> SYNONYMOUS_CODING, NON_SYNONYMOUS_CODING
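
The steps listed above can be illustrated with a toy version in plain Python.
This is not the code in TranscriptAlleles: classify_snv, the genome string and
the exon list are all invented for the example, and it assumes a
forward-strand transcript whose exons are entirely coding, single-base
substitutions only, and no distance limit for UPSTREAM. DOWNSTREAM is used by
analogy, and a change to a stop codon is simply reported as
NON_SYNONYMOUS_CODING.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snv(pos, alt, genome, exons):
    """Classify a substitution at 1-based genomic position `pos`.

    `exons` is a sorted list of 1-based inclusive (start, end) pairs.
    """
    # Variants outside the transcript are classified first.
    if pos < exons[0][0]:
        return "UPSTREAM"
    if pos > exons[-1][1]:
        return "DOWNSTREAM"

    # Map the genomic position to a 0-based CDS index, if it is exonic.
    cds_index = None
    offset = 0
    for start, end in exons:
        if start <= pos <= end:
            cds_index = offset + (pos - start)
        offset += end - start + 1
    if cds_index is None:
        return "INTRONIC"

    # Build the reference and alternative codons and compare translations.
    cds = "".join(genome[start - 1:end] for start, end in exons)
    phase = cds_index % 3
    ref_codon = cds[cds_index - phase:cds_index - phase + 3]
    alt_codon = ref_codon[:phase] + alt + ref_codon[phase + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "SYNONYMOUS_CODING"
    return "NON_SYNONYMOUS_CODING"


# Toy locus: exon 1 = ATGAGA (positions 3-8), exon 2 = CTTTAA (13-18).
genome = "GGATGAGAGTAGCTTTAACC"
exons = [(3, 8), (13, 18)]

print(classify_snv(1, "A", genome, exons))   # UPSTREAM
print(classify_snv(10, "A", genome, exons))  # INTRONIC
print(classify_snv(7, "C", genome, exons))   # NON_SYNONYMOUS_CODING (AGA -> ACA)
print(classify_snv(8, "G", genome, exons))   # SYNONYMOUS_CODING (AGA -> AGG)
```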
>>>>>>
>>>>>> We are currently working on an overhaul to this system which should
>>>>>> make it easier to comprehend by following the code.
>>>>>>
>>>>>> I would recommend trying to follow through the code in Perl's
>>>>>> debugger, using the "perl -d" option.
>>>>>>
>>>>>> Hope this helps
>>>>>>
>>>>>> Will McLaren
>>>>>> Ensembl Variation
>>>>>>
>>>>>> On 9 December 2010 13:19, Sung Gong<sung at bio.cc>  wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was thrilled to find that the Ensembl API provides a nice script
>>>>>>> (ftp://ftp.ensembl.org/pub/misc-scripts/) which can predict the
>>>>>>> consequence types of novel variations.
>>>>>>> It is also good to see a demonstration of how to use the API for that purpose:
>>>>>>> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html
>>>>>>>
>>>>>>> Before realising that the variation API can help predict the consequence
>>>>>>> type of novel variants, I used only the core API to map the
>>>>>>> positions of my variants and see whether they fall within a coding
>>>>>>> region, intron, exon and so on.
>>>>>>> Now I wonder how the variation API does this. I looked
>>>>>>> at the source code, but found it somewhat overwhelming.
>>>>>>>
>>>>>>> Can anybody explain how the prediction for novel variants works under the hood?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Sung
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Dev mailing list
>>>>>>> Dev at ensembl.org
>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>




