[ensembl-dev] VEP ClinVar information

Will McLaren wm2 at ebi.ac.uk
Mon Mar 16 15:16:42 GMT 2015


Hi Guillermo,

To get the rest of the data in that table you need to access the additional
attributes of the PhenotypeFeature object, something like:

# $pfs is assumed to be the array ref returned by fetch_all_by_object_id()
my $attr = $pfs->[0]->get_all_attributes;
print "$_:".$attr->{$_}."\t" for keys %$attr;
print "\n";
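
To show the same idea across every ClinVar entry rather than just the first, here is a minimal sketch that runs without a database. MockPF is a hypothetical stand-in for PhenotypeFeature; only get_all_attributes() mirrors the real API, and the attribute keys and values are invented placeholders. The filter on the source field follows the hash access used earlier in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PhenotypeFeature so the sketch runs offline.
# Only get_all_attributes() mirrors the real API; the data are invented.
package MockPF;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub get_all_attributes { return $_[0]->{attribs} }

package main;

# Format one feature's attributes as sorted, tab-separated "key:value" pairs.
sub format_attributes {
    my ($pf) = @_;
    my $attr = $pf->get_all_attributes;
    return join("\t", map { "$_:$attr->{$_}" } sort keys %$attr);
}

# Placeholder data; in a plugin this would come from fetch_all_by_object_id().
my $pfs = [
    MockPF->new(
        source  => 'dbSNP_ClinVar',
        attribs => { example_key_a => 'value 1', example_key_b => 'value 2' },
    ),
    MockPF->new(
        source  => 'OtherSource',
        attribs => { example_key_a => 'value 3' },
    ),
];

# Keep only the ClinVar entries, then print one line per feature.
for my $pf (grep { $_->{source} eq 'dbSNP_ClinVar' } @$pfs) {
    print format_attributes($pf), "\n";
}
```

Sorting the keys just keeps the output order stable between runs; the real attribute names depend on the data source.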

Regards

Will

More info: these data are stored as attributes because we import such
diverse data sources and types into our phenotype schema; creating a
database column and corresponding API method for each data type (p-value,
review status, risk allele, external ID, etc.) would be cumbersome and
inefficient. Instead, we provide a few methods that shortcut the attribute
approach for the most common data types; everything else must be accessed
through the attributes method. This is a common theme across the Ensembl
API.
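
To illustrate that pattern, here is a rough sketch of a class with one generic attribute store and a single shortcut method. The Feature class, its constructor and the values are hypothetical and are not the actual Ensembl implementation; only the general shape (a get_all_attributes() hashref plus a few convenience accessors) is taken from the description above.

```perl
use strict;
use warnings;

# Hypothetical class illustrating the pattern; not the real Ensembl code.
package Feature;

sub new {
    my ($class, %attribs) = @_;
    return bless { attribs => {%attribs} }, $class;
}

# Generic accessor: works for any data type a source happens to supply.
sub get_all_attributes { return $_[0]->{attribs} }

# Shortcut for one commonly used attribute; returns undef when absent.
sub p_value { return $_[0]->{attribs}{p_value} }

package main;

my $f = Feature->new(p_value => 0.01, review_status => 'example status');

# Common data type: use the shortcut method.
print "p-value: ", $f->p_value, "\n";

# Everything else: go through the attribute hash.
print "review status: ", $f->get_all_attributes->{review_status}, "\n";
```

Adding a new data type then needs no schema or API change: it simply appears as another key in the attribute hash.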

On 13 March 2015 at 12:03, Guillermo Marco Puche <
guillermo.marco at sistemasgenomicos.com> wrote:

>  Hi,
>
> I'm trying to retrieve ClinVar information with the code example you
> provided.
>
>     my $self = shift;
>     my $tva = shift;
>     my $vf = $tva->variation_feature;
>     my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>
>     foreach my $known_var(@{$vf->{existing} || []}) {
>         foreach my $pf(@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>             if ($pf->{'source'} eq "dbSNP_ClinVar"){
>                 print "$pf->{'source'}\t$pf->{'external_id'}\t$pf->{'is_significant'}\t$pf->{'phenotype'}\n";
>             }
>         }
>     }
>
> As you can see I'm "filtering" the results to only output phenotype
> feature when source is dbSNP_ClinVar. I don't know why but I guess
> filtering should be done when doing the "fetch_all".
>
> On the other hand I'm trying to retrieve Disease, Source and Clinical
> Significance from this example table:
> http://www.ensembl.org/Homo_sapiens/Variation/Phenotype?db=core;r=8:19955518-19956518;v=rs268;vdb=variation;vf=266
>
> I think I'm doing something wrong; I got totally lost in PhenotypeFeature.
>
> Regards,
> Guillermo.
>
>
> On 02/03/15 16:05, Will McLaren wrote:
>
> If you enable the --check_existing flag when you run the VEP, you'll be
> able to see any known co-located variants attached to the VariationFeature
> object in your plugin:
>
>  sub run {
>   my $self = shift;
>   my $tva = shift;
>   my $vf = $tva->variation_feature;
>
>    foreach my $known_var(@{$vf->{existing} || []}) {
>      # do stuff
>   }
> }
>
>  The $known_var is not an API object but a simple hashref with a number
> of fields; you're probably interested in $known_var->{clin_sig}
>
>  However, as I mentioned, this is the only data that is stored in the
> cache. To access the rating and the specific disease association, you'll
> need to make calls to the database by getting an adaptor, something like:
>
>  sub run {
>   my $self = shift;
>   my $tva = shift;
>   my $vf = $tva->variation_feature;
>   my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>
>    foreach my $known_var(@{$vf->{existing} || []}) {
>      foreach my $pf(@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>        # do stuff
>      }
>   }
> }
>
>  Be aware that this will access the database, so unless you have a local
> copy please don't run this sort of code on genome-wide VCFs using our
> public DB server.
>
>  Regards
>
>  Will
>
> On 2 March 2015 at 14:47, Guillermo Marco Puche <
> guillermo.marco at sistemasgenomicos.com> wrote:
>
>>  Hi Will,
>>
>> Indeed I'm looking to retrieve this information from VEP plugin.
>>
>> Regards,
>> Guillermo.
>>
>>
>> On 02/03/15 15:25, Will McLaren wrote:
>>
>> Hi Guillermo,
>>
>>  The detailed ClinVar information is stored against PhenotypeFeature
>> objects (each SNP/disease pairing gets its own entry in ClinVar, e.g.
>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019691.2,
>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019692.2/,
>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019693.2/ for rs699).
>>
>>  The rating (and indeed the clinical significance) is stored as an
>> attribute on the PhenotypeFeature object; you can retrieve this with the
>> get_all_attributes() method.
>>
>>  See
>> http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html
>> and
>> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#phenotype
>> for more info.
>>
>>  Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() is an internal
>> method that you should not use.
>>
>>  The VEP cache contains the list of clinical significance states for
>> each variant, but neither the disease association nor the rating. If you
>> want help getting access to this data via a plugin, let me know as it's a
>> little more involved than the API methods above (though it is faster as no
>> database access is required).
>>
>>  Regards
>>
>>  Will McLaren
>> Ensembl Variation
>>
>> On 2 March 2015 at 14:06, Guillermo Marco Puche <
>> guillermo.marco at sistemasgenomicos.com> wrote:
>>
>>>  Dear devs,
>>>
>>> I'd like to retrieve ClinVar information and add it to VEP
>>> annotation. From my understanding I should be able to retrieve "Clinical
>>> significance" and "ClinVar Rating".
>>>
>>> I've been looking at the Variation API, and I'm confused. I guess for
>>> significance I should use
>>> Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() or
>>> Bio::EnsEMBL::Variation::VariationFeature::get_all_clinical_significance_states().
>>>
>>> What about ClinVar rating? Is it possible to retrieve it from API?
>>>
>>> Thanks!
>>>
>>> Regards,
>>> Guillermo.
>>>
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>
>>
>>
>>  --
>>  ------------------------------
>>
>> *Guillermo Marco Puche*
>>
>> Bioinformatician, Computer Science Engineer
>> Sistemas Genómicos S.L.
>> Phone: +34 902 364 669 (Ext.777)
>> Fax: +34 902 364 670
>> www.sistemasgenomicos.com
>>
>>  <https://www.sistemasgenomicos.com/web_sg/web/areas-bioinformatica.php>
>>    ------------------------------
>>
>
>