[ensembl-dev] VEP ClinVar information

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Fri Mar 13 12:03:15 GMT 2015


Hi,

I'm trying to retrieve ClinVar information with the code example you 
provided.

     my $self = shift;
     my $tva = shift;
     my $vf = $tva->variation_feature;
     my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');

     foreach my $known_var (@{$vf->{existing} || []}) {
         foreach my $pf (@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
             if ($pf->{'source'} eq "dbSNP_ClinVar") {
                 print "$pf->{'source'}\t$pf->{'external_id'}\t$pf->{'is_significant'}\t$pf->{'phenotype'}\n";
             }
         }
     }

As you can see, I'm filtering the results so that a phenotype feature is
only printed when its source is dbSNP_ClinVar. I'm not sure this is the
right approach; I guess the filtering should be done in the "fetch_all"
call itself.
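
For reference, a minimal sketch of that client-side filter, pulled out
into a helper. It uses plain hashrefs as stand-ins for PhenotypeFeature
objects, and filter_by_source and the sample records are hypothetical
names and data (not Ensembl API calls or real database output), so it
runs without the Ensembl libraries:

```perl
use strict;
use warnings;

# Keep only the records whose source matches $wanted.
# Each record is a plain hashref here; a real PhenotypeFeature is an object.
sub filter_by_source {
    my ($records, $wanted) = @_;
    my @kept = grep {
        defined($_->{source}) && $_->{source} eq $wanted
    } @{ $records || [] };
    return \@kept;
}

# Hypothetical stand-in data, not real database output.
my $records = [
    { source => 'dbSNP_ClinVar', external_id => 'EXAMPLE_ID_1' },
    { source => 'OTHER_SOURCE',  external_id => 'EXAMPLE_ID_2' },
    { source => 'dbSNP_ClinVar', external_id => 'EXAMPLE_ID_3' },
];

for my $rec (@{ filter_by_source($records, 'dbSNP_ClinVar') }) {
    print join("\t", $rec->{source}, $rec->{external_id}), "\n";
}
```

Keeping the filter in one helper means the source name is written once,
and records with no source are skipped instead of raising warnings.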

I'm also trying to retrieve the Disease, Source and Clinical
Significance columns shown in this example table:
http://www.ensembl.org/Homo_sapiens/Variation/Phenotype?db=core;r=8:19955518-19956518;v=rs268;vdb=variation;vf=266

I think I'm doing something wrong; I got totally lost in PhenotypeFeature.
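
The earlier advice in this thread was to read the rating and clinical
significance from get_all_attributes(). Below is a hedged sketch of
listing whatever attributes a feature carries, assuming the method
returns a hash reference of name/value pairs. MockPhenotypeFeature,
describe_attributes and the attribute names are hypothetical stand-ins,
so the snippet runs without the Ensembl API:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PhenotypeFeature; only get_all_attributes()
# is mimicked, and the attribute names and values below are made up.
package MockPhenotypeFeature;

sub new {
    my ($class, %args) = @_;
    return bless({ attribs => $args{attribs} || {} }, $class);
}

sub get_all_attributes {
    my ($self) = @_;
    return $self->{attribs};
}

package main;

# Turn the attribute hashref into sorted "key=value" strings, so every
# attribute is visible without knowing the key names in advance.
sub describe_attributes {
    my ($pf) = @_;
    my $attribs = $pf->get_all_attributes() || {};
    my @pairs;
    for my $key (sort keys %$attribs) {
        my $value = defined $attribs->{$key} ? $attribs->{$key} : '';
        push @pairs, "$key=$value";
    }
    return @pairs;
}

my $pf = MockPhenotypeFeature->new(
    attribs => {
        example_rating       => 'placeholder',
        example_significance => 'placeholder',
    },
);
print join("\t", describe_attributes($pf)), "\n";
```

Dumping every key first is a quick way to find out which attribute
names the database actually uses before hard-coding any of them.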

Regards,
Guillermo.

On 02/03/15 16:05, Will McLaren wrote:
> If you enable the --check_existing flag when you run the VEP, you'll 
> be able to see any known co-located variants attached to the 
> VariationFeature object in your plugin:
>
> sub run {
>   my $self = shift;
>   my $tva = shift;
>   my $vf = $tva->variation_feature;
>
>   foreach my $known_var(@{$vf->{existing} || []}) {
>      # do stuff
>   }
> }
>
> The $known_var is not an API object but a simple hashref with a number 
> of fields; you're probably interested in $known_var->{clin_sig}
>
> However, as I mentioned, this is the only data that is stored in the 
> cache. To access the rating and the specific disease association, 
> you'll need to make calls to the database by getting an adaptor, 
> something like:
>
> sub run {
>   my $self = shift;
>   my $tva = shift;
>   my $vf = $tva->variation_feature;
>   my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>
>   foreach my $known_var(@{$vf->{existing} || []}) {
>      foreach my $pf(@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>        # do stuff
>      }
>   }
> }
>
> Be aware that this will access the database, so unless you have a 
> local copy please don't run this sort of code on genome-wide VCFs 
> using our public DB server.
>
> Regards
>
> Will
>
> On 2 March 2015 at 14:47, Guillermo Marco Puche
> <guillermo.marco at sistemasgenomicos.com> wrote:
>
>     Hi Will,
>
>     Indeed, I'm looking to retrieve this information from a VEP plugin.
>
>     Regards,
>     Guillermo.
>
>
>     On 02/03/15 15:25, Will McLaren wrote:
>>     Hi Guillermo,
>>
>>     The detailed ClinVar information is stored against
>>     PhenotypeFeature objects (each SNP/disease pairing gets its own
>>     entry in ClinVar, e.g.
>>     http://www.ncbi.nlm.nih.gov/clinvar/RCV000019691.2,
>>     http://www.ncbi.nlm.nih.gov/clinvar/RCV000019692.2/,
>>     http://www.ncbi.nlm.nih.gov/clinvar/RCV000019693.2/ for rs699).
>>
>>     The rating (and indeed the clinical significance) is stored as an
>>     attribute on the PhenotypeFeature object; you can retrieve this
>>     with the get_all_attributes() method.
>>
>>     See
>>     http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html
>>     and
>>     http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#phenotype
>>     for more info.
>>
>>     Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() is an
>>     internal method that you should not use.
>>
>>     The VEP cache contains the list of clinical significance states
>>     for each variant, but neither the disease association nor the
>>     rating. If you want help getting access to this data via a
>>     plugin, let me know as it's a little more involved than the API
>>     methods above (though it is faster as no database access is
>>     required).
>>
>>     Regards
>>
>>     Will McLaren
>>     Ensembl Variation
>>
>>     On 2 March 2015 at 14:06, Guillermo Marco Puche
>>     <guillermo.marco at sistemasgenomicos.com> wrote:
>>
>>         Dear devs,
>>
>>         I'd like to retrieve ClinVar information and add
>>         it to VEP annotation. From my understanding I should be able
>>         to retrieve "Clinical significance" and "ClinVar Rating".
>>
>>         I've been looking at the Variation API, and I'm confused. I guess
>>         for significance I should use
>>         Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() or
>>         Bio::EnsEMBL::Variation::VariationFeature::get_all_clinical_significance_states().
>>
>>         What about ClinVar rating? Is it possible to retrieve it from
>>         API?
>>
>>         Thanks!
>>
>>         Regards,
>>         Guillermo.
>>
>>
>>
>>         _______________________________________________
>>         Dev mailing list Dev at ensembl.org
>>         Posting guidelines and subscribe/unsubscribe info:
>>         http://lists.ensembl.org/mailman/listinfo/dev
>>         Ensembl Blog: http://www.ensembl.info/
>>
>
>     -- 
>     ------------------------------------------------------------------------
>
>     *Guillermo Marco Puche*
>
>     Bioinformatician, Computer Science Engineer
>     Sistemas Genómicos S.L.
>     Phone: +34 902 364 669 (Ext.777)
>     Fax: +34 902 364 670
>     www.sistemasgenomicos.com
>
