[ensembl-dev] VEP ClinVar information
Guillermo Marco Puche
guillermo.marco at sistemasgenomicos.com
Thu Mar 26 10:43:21 GMT 2015
Hello Will,
I already had "check_existing" enabled in my VEP config template;
however, I followed your advice and updated the plugin to force it on
in the new() method with your code.
I'm still getting no output from line 64:
print Dumper($pf->phenotype());
Are you getting any output printed? As I said, I get no errors, but
nothing is printed either. This Data::Dumper call should print the
result of the phenotype() method call.
Regards,
Guillermo.
On 26/03/15 11:05, Will McLaren wrote:
> I think perhaps you haven't enabled --check_existing; this is required
> for $vf->{existing} to get populated.
>
> You can force it on in the new() method of your plugin:
>
> $self->{config}->{check_existing} = 1;
>
> It then works for me on release/75 and release/79.
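
For reference, a minimal sketch of forcing the flag on in a plugin constructor. A real VEP plugin inherits from Bio::EnsEMBL::Variation::Utils::BaseVepPlugin; so that the snippet runs without an Ensembl installation, a hypothetical stand-in base class (ToyBasePlugin) is used here, and the plugin name ToyClinvar is equally illustrative. Only the check_existing line comes from the advice above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real VEP plugin base class; it only
# stores the config hashref, which is all this sketch needs.
package ToyBasePlugin;

sub new {
    my ($class, $config) = @_;
    return bless { config => $config || {} }, $class;
}

# Illustrative plugin; a real one would inherit from BaseVepPlugin.
package ToyClinvar;

our @ISA = ('ToyBasePlugin');

sub new {
    my $class = shift;
    my $self  = $class->SUPER::new(@_);

    # Force the co-located variant lookup so that $vf->{existing} gets
    # populated, whatever the command line or config file says.
    $self->{config}->{check_existing} = 1;

    return $self;
}

package main;

# Even with the flag explicitly off in the incoming config,
# the constructor switches it on.
my $plugin = ToyClinvar->new({ check_existing => 0 });
print "check_existing = $plugin->{config}->{check_existing}\n";
```

Setting the flag in new() rather than in run() means it is in place before any variant is processed.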
>
> Will
>
> On 25 March 2015 at 17:35, Guillermo Marco Puche
> <guillermo.marco at sistemasgenomicos.com
> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
> Hello Will,
>
> With your explanations I'm trying to call phenotype (as you said I
> was accessing the hashref directly).
> I'm using the input set you linked. However, my local Ensembl
> installation is v75.
>
> This is the code of the plugin:
> https://github.com/guillermomarco/vep/blob/master/Clinvar.pm
>
> I'm getting absolutely no output and no errors. I've no idea if this
> is an issue with my database/API version or with the plugin code
> itself.
>
> Regards,
> Guillermo.
>
>
>
> On 16/03/15 17:50, Will McLaren wrote:
>> The "is_significant" field is an internal flag that doesn't
>> necessarily have the meaning you expect; it is used to
>> distinguish between genuine reported associations and e.g.
>> non-significant associations reported from genome-wide studies.
>>
>> You should not see undef for phenotype; I suspect you are
>> accessing the hashref directly ($pf->{phenotype}) rather than
>> making the method call ($pf->phenotype()).
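
To illustrate why the hashref and the method call can disagree, here is a toy sketch. ToyPhenotypeFeature is a made-up class, not the Ensembl implementation; it only assumes that the getter fills in the value lazily on first call, which is one common reason a direct hash lookup returns undef.

```perl
use strict;
use warnings;

# Made-up class for illustration; NOT Bio::EnsEMBL::Variation::PhenotypeFeature.
package ToyPhenotypeFeature;

sub new {
    my ($class, %args) = @_;
    # Only an internal ID is stored at construction time.
    return bless { _phenotype_id => $args{phenotype_id} }, $class;
}

# Lazy getter: the 'phenotype' slot is filled on the first call only.
sub phenotype {
    my $self = shift;
    $self->{phenotype} //= "phenotype #$self->{_phenotype_id}";
    return $self->{phenotype};
}

package main;

my $pf = ToyPhenotypeFeature->new(phenotype_id => 42);

# Direct hash access before any method call: the slot is still empty.
my $direct = $pf->{phenotype};

# The method call triggers the lazy load and returns the value.
my $via_method = $pf->phenotype();

print "direct:     ", (defined $direct ? $direct : 'undef'), "\n";
print "via method: $via_method\n";
```

The same reasoning applies to any other field that is read through an accessor: going through the method is the safer habit.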
>>
>> You could try
>> ftp://ftp.ensembl.org/pub/release-79/variation/vcf/homo_sapiens/Homo_sapiens_clinically_associated.vcf.gz
>> as a test input set.
>>
>> Will
>>
>> On 16 March 2015 at 16:39, Guillermo Marco Puche
>> <guillermo.marco at sistemasgenomicos.com
>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>
>> Hi Will,
>>
>> Thank you for your quick response! Very clarifying.
>>
>> I guess that the way to retrieve ClinVar data I posted is
>> correct. With my test dataset I've only seen "is_significant"
>> values of "1" and undef 'phenotype' values. I think I need a
>> synthetic VCF with ClinVar-annotated variants to verify that
>> the plugin is working.
>>
>> I've been looking on the Ensembl website for a test dataset. I
>> don't think you provide one, right? Correct me if I'm wrong.
>>
>> Thanks!
>>
>> Regards,
>> Guillermo.
>>
>>
>> On 16/03/15 16:16, Will McLaren wrote:
>>> Hi Guillermo,
>>>
>>> To get the rest of that data in the table you need to access
>>> the additional attributes of the PhenotypeFeature object,
>>> something like:
>>>
>>> my $attr = $pfs->[0]->get_all_attributes;
>>> print "$_:".$attr->{$_}."\t" for keys %$attr;
>>> print "\n";
>>>
>>> Regards
>>>
>>> Will
>>>
>>> More info: the reason these data are stored as attributes is
>>> due to the diverse data sources and types that we import
>>> into our phenotype schema; to create a database column and
>>> corresponding API method for each data type (p-value, review
>>> status, risk allele, external ID etc etc) would be
>>> cumbersome and inefficient. To this end we provide a few
>>> methods that shortcut the attribute approach for the most
>>> common data types; everything else must be accessed through
>>> the attributes method. This is a common theme across the
>>> Ensembl API.
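
A small sketch of handling such an attributes hashref once it has been fetched. The hashref below is placeholder data standing in for the return value of get_all_attributes(); the key names and the 'example_value' entries are assumptions for illustration, not actual database content, and format_attributes is a hypothetical helper.

```perl
use strict;
use warnings;

# Placeholder data standing in for the hashref returned by
# get_all_attributes(); key names and values are illustrative only.
my $attr = {
    external_id      => 'RCV000019691',
    clinvar_clin_sig => 'example_value',
    review_status    => 'example_value',
};

# Build one tab-separated "key:value" line with keys in sorted order,
# so repeated runs produce the same column order.
sub format_attributes {
    my ($attributes) = @_;
    my @fields;
    for my $key (sort keys %$attributes) {
        # Undefined values are printed as empty strings, not warnings.
        my $value = defined $attributes->{$key} ? $attributes->{$key} : '';
        push @fields, "$key:$value";
    }
    return join "\t", @fields;
}

print format_attributes($attr), "\n";
```

Sorting the keys is a design choice for stable output; plain keys() returns them in no guaranteed order.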
>>>
>>> On 13 March 2015 at 12:03, Guillermo Marco Puche
>>> <guillermo.marco at sistemasgenomicos.com
>>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to retrieve ClinVar information with the code
>>> example you provided.
>>>
>>> my $self = shift;
>>> my $tva = shift;
>>> my $vf = $tva->variation_feature;
>>> my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>>>
>>> foreach my $known_var (@{$vf->{existing} || []}) {
>>>     foreach my $pf (@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>>>         if ($pf->{'source'} eq "dbSNP_ClinVar") {
>>>             print "$pf->{'source'}\t$pf->{'external_id'}\t$pf->{'is_significant'}\t$pf->{'phenotype'}\n";
>>>         }
>>>     }
>>> }
>>>
>>> As you can see, I'm "filtering" the results to only
>>> output a phenotype feature when its source is dbSNP_ClinVar.
>>> I'm not sure, but I guess the filtering should really be done
>>> as part of the "fetch_all" call.
>>>
>>> On the other hand I'm trying to retrieve Disease, Source
>>> and Clinical Significance from this example table:
>>> http://www.ensembl.org/Homo_sapiens/Variation/Phenotype?db=core;r=8:19955518-19956518;v=rs268;vdb=variation;vf=266
>>>
>>> I think I'm doing something wrong; I got totally lost in
>>> PhenotypeFeature.
>>>
>>> Regards,
>>> Guillermo.
>>>
>>>
>>> On 02/03/15 16:05, Will McLaren wrote:
>>>> If you enable the --check_existing flag when you run
>>>> the VEP, you'll be able to see any known co-located
>>>> variants attached to the VariationFeature object in
>>>> your plugin:
>>>>
>>>> sub run {
>>>>     my $self = shift;
>>>>     my $tva = shift;
>>>>     my $vf = $tva->variation_feature;
>>>>
>>>>     foreach my $known_var (@{$vf->{existing} || []}) {
>>>>         # do stuff
>>>>     }
>>>> }
>>>>
>>>> The $known_var is not an API object but a simple
>>>> hashref with a number of fields; you're probably
>>>> interested in $known_var->{clin_sig}
>>>>
>>>> However, as I mentioned, this is the only data that is
>>>> stored in the cache. To access the rating and the
>>>> specific disease association, you'll need to make calls
>>>> to the database by getting an adaptor, something like:
>>>>
>>>> sub run {
>>>>     my $self = shift;
>>>>     my $tva = shift;
>>>>     my $vf = $tva->variation_feature;
>>>>     my $pfa = $self->{config}->{reg}->get_adaptor('human','variation','phenotypefeature');
>>>>
>>>>     foreach my $known_var (@{$vf->{existing} || []}) {
>>>>         foreach my $pf (@{$pfa->fetch_all_by_object_id($known_var->{variation_name})}) {
>>>>             # do stuff
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> Be aware that this will access the database, so unless
>>>> you have a local copy please don't run this sort of
>>>> code on genome-wide VCFs using our public DB server.
>>>>
>>>> Regards
>>>>
>>>> Will
>>>>
>>>> On 2 March 2015 at 14:47, Guillermo Marco Puche
>>>> <guillermo.marco at sistemasgenomicos.com
>>>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>>
>>>> Hi Will,
>>>>
>>>> Indeed I'm looking to retrieve this information
>>>> from VEP plugin.
>>>>
>>>> Regards,
>>>> Guillermo.
>>>>
>>>>
>>>> On 02/03/15 15:25, Will McLaren wrote:
>>>>> Hi Guillermo,
>>>>>
>>>>> The detailed ClinVar information is stored against
>>>>> PhenotypeFeature objects (each SNP/disease pairing
>>>>> gets its own entry in ClinVar, e.g.
>>>>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019691.2, http://www.ncbi.nlm.nih.gov/clinvar/RCV000019692.2/,
>>>>> http://www.ncbi.nlm.nih.gov/clinvar/RCV000019693.2/ for
>>>>> rs699).
>>>>>
>>>>> The rating (and indeed the clinical significance)
>>>>> is stored as an attribute on the PhenotypeFeature
>>>>> object; you can retrieve this with the
>>>>> get_all_attributes() method.
>>>>>
>>>>> See
>>>>> http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html
>>>>> and
>>>>> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#phenotype
>>>>> for more info.
>>>>>
>>>>> Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig() is
>>>>> an internal method that you should not use.
>>>>>
>>>>> The VEP cache contains the list of clinical
>>>>> significance states for each variant, but neither
>>>>> the disease association nor the rating. If you want
>>>>> help getting access to this data via a plugin, let
>>>>> me know as it's a little more involved than the
>>>>> API methods above (though it is faster as no
>>>>> database access is required).
>>>>>
>>>>> Regards
>>>>>
>>>>> Will McLaren
>>>>> Ensembl Variation
>>>>>
>>>>> On 2 March 2015 at 14:06, Guillermo Marco Puche
>>>>> <guillermo.marco at sistemasgenomicos.com
>>>>> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>>>
>>>>> Dear devs,
>>>>>
>>>>> I'm looking to retrieve ClinVar
>>>>> information and add it to VEP annotation. From
>>>>> my understanding I should be able to retrieve
>>>>> "Clinical significance" and "ClinVar Rating".
>>>>>
>>>>> I've been looking at the Variation API, and I'm
>>>>> confused. I guess for significance I should
>>>>> use
>>>>> Bio::EnsEMBL::Variation::Utils::VEP::get_clin_sig()
>>>>> or
>>>>> Bio::EnsEMBL::Variation::VariationFeature::get_all_clinical_significance_states().
>>>>>
>>>>> What about ClinVar rating? Is it possible to
>>>>> retrieve it from API?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Regards,
>>>>> Guillermo.
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Dev mailing list Dev at ensembl.org
>>>>> <mailto:Dev at ensembl.org>
>>>>> Posting guidelines and subscribe/unsubscribe
>>>>> info:
>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>
>>>>>
>>>>>
>>