[ensembl-dev] VEP ClinVar information
Guillermo Marco Puche
guillermo.marco at sistemasgenomicos.com
Wed Mar 25 17:35:51 GMT 2015
Hello Will,
Following your explanation, I'm now calling the phenotype() method (as
you said, I was accessing the hashref directly).
I'm using the input set you linked. However, my local Ensembl
installation is v75.
This is the code of the plugin:
https://github.com/guillermomarco/vep/blob/master/Clinvar.pm
I'm getting absolutely no output and no errors. I've no idea if this is
an issue with my database/API version or with the plugin code itself.
Regards,
Guillermo.
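
For readers following the thread, here is a minimal, self-contained sketch of the hashref-versus-method distinction discussed above. It does not use the Ensembl API: My::MockPhenotypeFeature, %PHENOTYPE_BY_ID and the attribute keys and values are all hypothetical stand-ins. It also assumes the field is filled in lazily by the accessor, which is one plausible reason a direct hash lookup can return undef; the real PhenotypeFeature class may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PhenotypeFeature; NOT the Ensembl class.
package My::MockPhenotypeFeature;

# Pretend lookup table: phenotype id -> description (invented values).
my %PHENOTYPE_BY_ID = (1 => 'example phenotype');

sub new {
    my ($class, %args) = @_;
    # Only the id is stored at construction time, not the description.
    return bless {
        _phenotype_id => $args{phenotype_id},
        attribs       => $args{attribs} || {},
    }, $class;
}

# Lazy accessor: resolves the description on first call and caches it.
sub phenotype {
    my $self = shift;
    $self->{phenotype} //= $PHENOTYPE_BY_ID{ $self->{_phenotype_id} };
    return $self->{phenotype};
}

# Returns a hashref of free-form key/value attributes.
sub get_all_attributes {
    my $self = shift;
    return $self->{attribs};
}

package main;

my $pf = My::MockPhenotypeFeature->new(
    phenotype_id => 1,
    attribs      => { external_id => 'EXAMPLE_ID', review_status => 'example' },
);

# Direct hash access before any method call: the key is not populated yet.
my $direct = $pf->{phenotype};
print "direct: ", (defined $direct ? $direct : 'undef'), "\n";

# Method call: triggers the lookup and returns the value.
my $via_method = $pf->phenotype();
print "method: $via_method\n";

# Everything else is read from the attributes hashref.
my $attr = $pf->get_all_attributes();
print "$_: $attr->{$_}\n" for sort keys %$attr;
```

The point of the sketch is only that an object's internal hash is an implementation detail: a key may be absent until the accessor has run, so plugin code should go through the method calls.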
On 16/03/15 17:50, Will McLaren wrote:
> The "is_significant" field is an internal flag that doesn't
> necessarily have the meaning you expect; it is used to distinguish
> between genuine reported associations and e.g. non-significant
> associations reported from genome-wide studies.
>
> You should not see undef for phenotype; I suspect you are accessing
> the hashref directly ($pf->{phenotype}) rather than making the method
> call ($pf->phenotype()).
>
> You could try
> ftp://ftp.ensembl.org/pub/release-79/variation/vcf/homo_sapiens/Homo_sapiens_clinically_associated.vcf.gz
> as a test input set.
>
> Will
>
> On 16 March 2015 at 16:39, Guillermo Marco Puche
> <guillermo.marco at sistemasgenomicos.com
> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
> Hi Will,
>
> Thank you for your quick response! Very clarifying.
>
> I guess that the way to retrieve ClinVar data I posted is correct.
> With my test dataset I've only seen "is_significant" values of "1"
> and undef 'phenotype' values. I think I need a synthetic VCF with
> ClinVar-annotated variants to verify that the plugin is working.
>
> I've been looking on the Ensembl website for a test dataset. I think
> you don't provide one, right? Correct me if I'm wrong.
>
> Thanks!
>
> Regards,
> Guillermo.
>
>
> On 16/03/15 16:16, Will McLaren wrote:
>> Hi Guillermo,
>>
>> To get the rest of that data in the table you need to access the
>> additional attributes of the PhenotypeFeature object, something like:
>>
>> my $attr = $pfs->[0]->get_all_attributes;
>> print "$_:".$attr->{$_}."\t" for keys %$attr;
>> print "\n";
>>
>> Regards
>>
>> Will
>>
>> More info: the reason these data are stored as attributes is due
>> to the diverse data sources and types that we import into our
>> phenotype schema; to create a database column and corresponding
>> API method for each data type (p-value, review status, risk
>> allele, external ID etc etc) would be cumbersome and inefficient.
>> To this end we provide a few methods that shortcut the attribute
>> approach for the most common data types; everything else must be
>> accessed through the attributes method. This is a common theme
>> across the Ensembl API.
>>
>> On 13 March 2015 at 12:03, Guillermo Marco Puche
>> <guillermo.marco at sistemasgenomicos.com
>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>
>> Hi,
>>
>> I'm trying to retrieve ClinVar information with the code
>> example you provided.
>>
>> my $self = shift;
>> my $tva = shift;
>> my $vf = $tva->variation_feature;
>> my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>>
>> foreach my $known_var (@{$vf->{existing} || []}) {
>>   foreach my $pf (@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>>     if ($pf->{'source'} eq "dbSNP_ClinVar") {
>>       print "$pf->{'source'}\t$pf->{'external_id'}\t$pf->{'is_significant'}\t$pf->{'phenotype'}\n";
>>     }
>>   }
>> }
>>
>> As you can see I'm "filtering" the results to only output
>> phenotype feature when source is dbSNP_ClinVar. I don't know
>> why but I guess filtering should be done when doing the
>> "fetch_all".
>>
>> On the other hand I'm trying to retrieve Disease, Source and
>> Clinical Significance from this example table:
>> http://www.ensembl.org/Homo_sapiens/Variation/Phenotype?db=core;r=8:19955518-19956518;v=rs268;vdb=variation;vf=266
>>
>> I think I'm doing something wrong; I got totally lost in
>> PhenotypeFeature.
>>
>> Regards,
>> Guillermo.
>>
>>
>> On 02/03/15 16:05, Will McLaren wrote:
>>> If you enable the --check_existing flag when you run the
>>> VEP, you'll be able to see any known co-located variants
>>> attached to the VariationFeature object in your plugin:
>>>
>>> sub run {
>>>   my $self = shift;
>>>   my $tva = shift;
>>>   my $vf = $tva->variation_feature;
>>>
>>>   foreach my $known_var (@{$vf->{existing} || []}) {
>>>     # do stuff
>>>   }
>>> }
>>>
>>> The $known_var is not an API object but a simple hashref
>>> with a number of fields; you're probably interested in
>>> $known_var->{clin_sig}
>>>
>>> However, as I mentioned, this is the only data that is
>>> stored in the cache. To access the rating and the specific
>>> disease association, you'll need to make calls to the
>>> database by getting an adaptor, something like:
>>>
>>> sub run {
>>>   my $self = shift;
>>>   my $tva = shift;
>>>   my $vf = $tva->variation_feature;
>>>   my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>>>
>>>   foreach my $known_var (@{$vf->{existing} || []}) {
>>>     foreach my $pf (@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>>>       # do stuff
>>>     }
>>>   }
>>> }
>>>
>>> Be aware that this will access the database, so unless you
>>> have a local copy please don't run this sort of code on
>>> genome-wide VCFs using our public DB server.
>>>
>>> Regards
>>>
>>> Will
>>>
>>> On 2 March 2015 at 14:47, Guillermo Marco Puche
>>> <guillermo.marco at sistemasgenomicos.com
>>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>
>>> Hi Will,
>>>
>>> Indeed, I'm looking to retrieve this information from a VEP
>>> plugin.
>>>
>>> Regards,
>>> Guillermo.
>>>
>>>
>>> On 02/03/15 15:25, Will McLaren wrote:
>>>> Hi Guillermo,
>>>>
>>>> The detailed ClinVar information is stored against
>>>> PhenotypeFeature objects (each SNP/disease pairing gets
>>>> its own entry in ClinVar, e.g.
>>>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019691.2,
>>>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019692.2/,
>>>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019693.2/ for
>>>> rs699).
>>>>
>>>> The rating (and indeed the clinical significance) is
>>>> stored as an attribute on the PhenotypeFeature object;
>>>> you can retrieve this with the get_all_attributes() method.
>>>>
>>>> See
>>>> http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html
>>>> and
>>>> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#phenotype
>>>> for more info.
>>>>
>>>> Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() is
>>>> an internal method that you should not use.
>>>>
>>>> The VEP cache contains the list of clinical
>>>> significance states for each variant, but neither the
>>>> disease association nor the rating. If you want help
>>>> getting access to this data via a plugin, let me know
>>>> as it's a little more involved than the API methods
>>>> above (though it is faster as no database access is
>>>> required).
>>>>
>>>> Regards
>>>>
>>>> Will McLaren
>>>> Ensembl Variation
>>>>
>>>> On 2 March 2015 at 14:06, Guillermo Marco Puche
>>>> <guillermo.marco at sistemasgenomicos.com
>>>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>>
>>>> Dear devs,
>>>>
>>>> I'd like to retrieve ClinVar information
>>>> and add it to the VEP annotation. From my understanding
>>>> I should be able to retrieve "Clinical
>>>> significance" and "ClinVar Rating".
>>>>
>>>> I've been looking at the Variation API, and I'm
>>>> confused. I guess for significance I should use
>>>> Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig()
>>>> or
>>>> Bio::EnsEMBL::Variation::VariationFeature::get_all_clinical_significance_states().
>>>>
>>>> What about ClinVar rating? Is it possible to
>>>> retrieve it from API?
>>>>
>>>> Thanks!
>>>>
>>>> Regards,
>>>> Guillermo.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list Dev at ensembl.org
>>>> <mailto:Dev at ensembl.org>
>>>> Posting guidelines and subscribe/unsubscribe info:
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>>>
>>>>
>>>>
>