[ensembl-dev] Prediction of consequence type for novel variants

Sung Gong sung at bio.cc
Tue Dec 14 15:10:14 GMT 2010


So is start 1 smaller than end for a deletion?


On 14 December 2010 15:03, Will McLaren <wm2 at ebi.ac.uk> wrote:
> Hi Sung,
>
> The coordinates would be the same regardless of the strand.
>
> Start is _always_ 1 greater than end for an insertion, regardless of
> strand or the size of the insertion.
>
> Will
>
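A minimal sketch of the coordinate rule above, in plain Perl with no Ensembl modules. The helper name variant_coords is hypothetical and not part of the API. The insertion case follows Will's statement directly; the substitution/deletion case assumes the usual convention that start..end spans the affected reference bases, which this thread does not confirm.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): returns (start, end)
# given the first affected reference base and the number of reference
# bases that are replaced or deleted.
sub variant_coords {
    my ($pos, $ref_len) = @_;

    # Insertion: no reference bases are affected. $pos is the base just
    # after the insertion point, so end = start - 1, whatever the strand
    # and however many bases are inserted.
    return ($pos, $pos - 1) if $ref_len == 0;

    # Assumption: substitutions and deletions span the affected
    # reference bases, so start <= end.
    return ($pos, $pos + $ref_len - 1);
}

# The 2bp insertion ('-/AA') from the API documentation example.
my ($start, $end) = variant_coords(1522, 0);
print "insertion: start=$start end=$end\n";   # start=1522 end=1521

# A 2bp deletion beginning at the same base, under the assumption above.
($start, $end) = variant_coords(1522, 2);
print "deletion:  start=$start end=$end\n";   # start=1522 end=1523
```

The resulting values would be passed as -start and -end to VariationFeature->new, with -strand set independently.
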
> On 14 December 2010 14:58, Sung Gong <sung at bio.cc> wrote:
>> Hi Will,
>>
>> One more question about start/end positions in the case of indels.
>>
>> In the API document
>> (http://www.ensembl.org/info/docs/Pdoc/ensembl-variation/modules/Bio/EnsEMBL/Variation/VariationFeature.html),
>> it says:
>>    # Variation feature representing a 2bp insertion
>>    $vf = Bio::EnsEMBL::Variation::VariationFeature->new
>>       (-start   => 1522,
>>        -end     => 1521, # end = start-1 for insert
>>        -strand  => -1,
>>        -slice   => $slice,
>>        -allele_string => '-/AA',
>>        -variation_name => 'rs12111',
>>        -map_weight  => 1,
>>        -variation => $v2);
>>
>> Is the example above only for the -1 strand?
>> How can I generalise the setting of -start and -end?
>>
>> Cheers,
>> Sung
>>
>> On 10 December 2010 11:41, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>> Hi Sung
>>>
>>> The codons() method will work; it returns the codons as something like:
>>>
>>> aGa/aCa
>>>
>>> where the base changed is in capital letters.
>>>
>>> Will
>>>
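A small sketch of how the codon string described above could be turned back into a codon plus a 0-based position, in plain Perl. The helper name parse_codons is hypothetical, and the sketch assumes the string always has the form of two 3-base codons separated by '/', as in the example; strings for indels may look different.

```perl
use strict;
use warnings;

# Hypothetical helper: given a string such as 'aGa/aCa', return the
# reference codon, the alternative codon and an array ref holding the
# 0-based positions of the capitalised (changed) bases.
sub parse_codons {
    my ($codons) = @_;
    my ($ref, $alt) = split m{/}, $codons;

    # Assumption: exactly two codons of three bases each.
    die "unexpected codon string '$codons'\n"
        unless defined $alt && length($ref) == 3 && length($alt) == 3;

    # A changed base is written in upper case in the reference codon.
    my @changed = grep { substr($ref, $_, 1) =~ /[A-Z]/ } 0 .. 2;
    return (uc $ref, uc $alt, \@changed);
}

my ($ref, $alt, $positions) = parse_codons('aGa/aCa');
print "$ref to $alt, changed position(s): @$positions\n";
# prints "AGA to ACA, changed position(s): 1"
```
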
>>> On 10 December 2010 11:26, Sung Gong <sung at bio.cc> wrote:
>>>> Hi Will,
>>>>
>>>> Thanks for the paper. I appreciate your work.
>>>>
>>>> Before I was aware of your script, I used the core API to get the
>>>> corresponding codon and the position within it (0, 1 or 2) where a
>>>> single DNA variant occurs.
>>>> Is there an equivalent way to do this with the variation API?
>>>>
>>>> I found a 'codons' method in 'TranscriptVariation', but is it really a
>>>> method of ConsequenceType?
>>>>
>>>> I thought it better to ask you before going further.
>>>>
>>>> Cheers,
>>>> Sung
>>>>
>>>> On 9 December 2010 14:02, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>>>> Hi Sung,
>>>>>
>>>>> There is a publication referring to the system, but it does not go
>>>>> into great detail on the internal workings:
>>>>>
>>>>> http://bioinformatics.oxfordjournals.org/content/26/16/2069.abstract
>>>>>
>>>>> Here's an approximate flow of what happens in the API. The vast
>>>>> majority of the code used is in the Core module
>>>>> Bio::EnsEMBL::Utils::TranscriptAlleles, mainly the methods
>>>>> type_variation() and apply_aa_change():
>>>>>
>>>>> - find overlapping transcripts (using $vf->feature_Slice and
>>>>> $slice->get_all_Transcripts), then for each transcript:
>>>>>
>>>>> - get transcript mapper and map variation's coordinates to cDNA, CDS and peptide
>>>>>
>>>>> - any variants that don't fall in the coding sequence are classified
>>>>> here (e.g. INTRONIC, UPSTREAM) and the flow ends
>>>>>
>>>>> - if the variation falls in the coding sequence (i.e. has defined CDS
>>>>> coordinates), generate alternative codon(s) and resulting translation
>>>>>
>>>>> - compare translation to reference; classify as e.g.
>>>>> SYNONYMOUS_CODING, NON_SYNONYMOUS_CODING
>>>>>
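The last three steps of the flow above can be sketched for a single-base substitution in plain Perl. This is a toy illustration under stated assumptions, not the code in TranscriptAlleles: the helper classify_snv and the placeholder label NON_CODING are hypothetical, the CDS is assumed to contain only A, C, G and T with a length that is a multiple of 3, and the CDS position is assumed to have been obtained already (from the transcript mapper in the real API). Changes to or from a stop codon are simply reported as non-synonymous here.

```perl
use strict;
use warnings;

# Standard genetic code, built from the usual TCAG-ordered string.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Hypothetical helper: $cds_pos is the 1-based position in the CDS, or
# undef when the variant has no CDS coordinates.
sub classify_snv {
    my ($cds, $cds_pos, $alt_base) = @_;

    # No CDS coordinates: the real code assigns a location-based class
    # such as INTRONIC or UPSTREAM; this sketch uses one placeholder.
    return 'NON_CODING' unless defined $cds_pos;

    # Build the reference codon and the alternative codon.
    my $offset      = $cds_pos - 1;
    my $codon_start = $offset - ($offset % 3);
    my $ref_codon   = uc substr($cds, $codon_start, 3);
    my $alt_codon   = $ref_codon;
    substr($alt_codon, $offset % 3, 1, uc $alt_base);

    # Compare the translations of the two codons.
    my $ref_aa = $codon_table{$ref_codon};
    my $alt_aa = $codon_table{$alt_codon};
    return $ref_aa eq $alt_aa ? 'SYNONYMOUS_CODING' : 'NON_SYNONYMOUS_CODING';
}

my $cds = 'ATGAGATAA';                         # ATG AGA TAA
print classify_snv($cds, 5, 'C'), "\n";        # AGA to ACA: NON_SYNONYMOUS_CODING
print classify_snv($cds, 6, 'G'), "\n";        # AGA to AGG: SYNONYMOUS_CODING
print classify_snv($cds, undef, 'G'), "\n";    # NON_CODING
```
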
>>>>> We are currently working on an overhaul to this system which should
>>>>> make it easier to comprehend by following the code.
>>>>>
>>>>> I would recommend trying to follow through the code in Perl's
>>>>> debugger, using the "perl -d" option.
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Will McLaren
>>>>> Ensembl Variation
>>>>>
>>>>> On 9 December 2010 13:19, Sung Gong <sung at bio.cc> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I was thrilled to find that the Ensembl API provides a nice script
>>>>>> (ftp://ftp.ensembl.org/pub/misc-scripts/) which can predict the
>>>>>> consequence types of novel variations.
>>>>>> It is also good to see a clear demonstration of how to use the API for
>>>>>> that purpose:
>>>>>> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html
>>>>>>
>>>>>> Before realising that the variation API can help predict the
>>>>>> consequence types of novel variants, I used only the core API to map
>>>>>> the positions of my variants and see whether they fall within a coding
>>>>>> region, intron, exon and so on.
>>>>>> Now I wonder how the variation API does this. I looked at the source
>>>>>> code, but found it somewhat overwhelming.
>>>>>>
>>>>>> Can anybody explain how the prediction for novel variants works under the hood?
>>>>>>
>>>>>> Cheers,
>>>>>> Sung
>>>>>>
>>>>>> _______________________________________________
>>>>>> Dev mailing list
>>>>>> Dev at ensembl.org
>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>
>>>>>
>>>>
>>>
>>
>



