[ensembl-dev] Issues with phenotypeFeature
Sarah Hunt
seh at ebi.ac.uk
Fri Feb 26 17:23:17 GMT 2016
Hi Nathalie,
There are duplicates because every reported variant association is
returned. If you look at the table given by the 'Show' option on 'ALL
variants with a phenotype annotation' on the web page, you will see that
several variants sometimes have reported associations with the same
trait. In one case, multiple associations between a variant and a trait
are reported at different levels of significance.
Full HGMD data is only available to registered users, so Ensembl can
only report that a variant location is present in the HGMD database.
ClinVar accepts submissions with different levels of information, so
sometimes a variant is reported as pathogenic but no phenotype is given.
If you are not interested in significance levels or undescribed
phenotypes, these entries should be simple to filter out.
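As a rough sketch of that filtering, assuming you first collect each
result as a [source, description] pair: the helper name
filter_phenotype_rows and the placeholder patterns are hypothetical,
taken from the output posted in this thread, and are not part of the
Ensembl API.

```perl
use strict;
use warnings;

# Placeholder descriptions seen in the output quoted in this thread.
# Treating them as "undescribed" is an assumption; extend as needed.
my @PLACEHOLDERS = (
    qr/phenotype not specified/i,
    qr/Annotated by HGMD/i,
);

# Hypothetical helper: takes an array ref of [source, description] pairs
# and returns the pairs with placeholders dropped and repeats collapsed.
sub filter_phenotype_rows {
    my ($rows) = @_;
    my (%seen, @kept);
    foreach my $row (@{$rows}) {
        my ($source, $description) = @{$row};
        next if !defined $description;
        next if grep { $description =~ $_ } @PLACEHOLDERS;
        # Same source and same trait (ignoring case) counts as a duplicate.
        my $key = join "\t", $source, lc $description;
        next if $seen{$key}++;
        push @kept, [$source, $description];
    }
    return \@kept;
}

# Sample rows copied from the RND3 output quoted below.
my @rows = (
    ['ClinVar',                'ClinVar: phenotype not specified'],
    ['dbGaP',                  'Body Height'],
    ['dbGaP',                  'Body Height'],
    ['dbGaP',                  'BODY MASS INDEX'],
    ['NHGRI-EBI GWAS catalog', 'Endometriosis'],
);

print join("\t", 'Assoc', @{$_}), "\n" for @{ filter_phenotype_rows(\@rows) };
```

With the API, the pairs could be built from the calls already used in
this thread, for example
map { [ $_->source_name, $_->phenotype->description ] } @{$pfsvar}.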
Best wishes,
Sarah
On 26/02/2016 16:42, nconte wrote:
> Hi Sarah,
> Another question: when the script shows phenotypes there are some
> duplicates (see below). Why is this?
> And why do some have a phenotype that is not specified (ClinVar) or not
> available (as with HGMD)? How can I remove these, as on the website?
> http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000115963;r=2:150468195-150539011
>
>
> many thanks
>
>
> my $gene = $ga->fetch_by_stable_id('ENSG00000115963');
> this is an example with
> human Orthologue: RND3 ENSG00000115963 2 150468195
> 150539011 -1
> gene is RND3ENSG00000115963
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc ClinVar ClinVar: phenotype not specified
> Assoc dbGaP Blood pressure
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP Body Height
> Assoc dbGaP BODY MASS INDEX
> Assoc dbGaP BODY MASS INDEX
> Assoc dbGaP BODY MASS INDEX
> Assoc dbGaP BODY MASS INDEX
> Assoc dbGaP Calcium
> Assoc dbGaP Calcium
> Assoc dbGaP Cholesterol, HDL
> Assoc dbGaP Glucose
> Assoc dbGaP Respiration Disorders
> Assoc dbGaP Sleep
> Assoc dbGaP Stroke
> Assoc NHGRI-EBI GWAS catalog Endometriosis
> Assoc NHGRI-EBI GWAS catalog Type 2 diabetes
>
>
>
>
>> Hi Nathalie,
>>
>> You are extracting phenotype features for the input mouse gene, not
>> the human gene, which is why you are not seeing the human phenotype
>> feature. If you change the print statement to:
>>
>> print 'gene is '.$gene->external_name() ." ". $gene->stable_id();
>>
>> You will see the 'ENSMUS' prefix on the stable ID:
>>
>> human Orthologue: LEP ENSG00000174697 7 128241284
>> 128257628 1
>> gene is Lep ENSMUSG00000059201 scalar 0
>> no PHE
>>
>> We store associations as reported, so querying by both gene and
>> variant will return the most complete set of results.
>>
>> fetch_all_by_Gene() takes a gene object and returns associations
>> reported to the gene.
>> fetch_all_by_associated_gene() takes a gene name and returns variant
>> associations in which it is mentioned.
>>
>> For example:
>>
>> use Bio::EnsEMBL::Registry;
>>
>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>     -host       => 'ensembldb.ensembl.org',
>>     -user       => 'anonymous',
>>     -port       => '3306',
>>     -db_version => 83,
>> );
>> # will help with connection issues
>> Bio::EnsEMBL::Registry->set_reconnect_when_lost(1);
>>
>> my $ga = Bio::EnsEMBL::Registry->get_adaptor("homo_sapiens", "core", "gene");
>> my $pfh_adaptor =
>>     Bio::EnsEMBL::Registry->get_adaptor('human', 'variation', 'phenotypefeature');
>>
>> my $gene = $ga->fetch_by_stable_id('ENSG00000174697');
>>
>> if ($gene) {
>>     print "gene is " . $gene->external_name() . "\n";
>>
>>     # associations reported directly to the gene
>>     my $pfs = $pfh_adaptor->fetch_all_by_Gene($gene);
>>     foreach my $pm (@{$pfs}) {
>>         print "Direct\t", $pm->source_name, "\t", $pm->phenotype->description, "\n";
>>     }
>>
>>     # variant associations in which the gene name is mentioned
>>     my $pfsvar = $pfh_adaptor->fetch_all_by_associated_gene($gene->external_name());
>>     foreach my $pmv (@{$pfsvar}) {
>>         print "Assoc\t", $pmv->source_name, "\t", $pmv->phenotype->description, "\n";
>>     }
>> }
>>
>> Outputs:
>>
>> gene is LEP
>> Direct Orphanet Obesity due to congenital leptin deficiency
>> Direct MIM morbid LEPTIN DEFICIENCY OR DYSFUNCTION
>> Assoc ClinVar ClinVar: phenotype not specified
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc HGMD-PUBLIC Annotated by HGMD but no phenotype
>> description is publicly available
>> Assoc Uniprot Leptin deficiency
>> Assoc OMIM LEPTIN DYSFUNCTION
>> Assoc OMIM Leptin deficiency
>> Assoc ClinVar LEPTIN DYSFUNCTION
>> Assoc ClinVar Obesity, severe, due to leptin deficiency
>> Assoc dbGaP Blood pressure
>> Assoc dbGaP Erythrocyte Count
>> Assoc NHGRI-EBI GWAS catalog Type 2 diabetes
>> Assoc dbGaP Amyotrophic lateral sclerosis
>> Assoc ClinVar ClinVar: phenotype not specified
>> Assoc ClinVar ClinVar: phenotype not specified
>>
>> Best wishes,
>>
>> Sarah
>>
>> On 26/02/2016 12:50, nconte wrote:
>>
>>> Hello,
>>> I have an issue with the phenotypeFeature fetch_all_by_Gene($gene)
>>> method. I am trying to retrieve phenotypes from a gene object, but my
>>> script doesn't find any phenotype for this gene, whereas the Ensembl
>>> website shows some phenotypes linked with the gene:
>>> http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000174697;r=7:128241284-128257628;t=ENST00000308868
>>> [1]
>>>
>>> My starting point is a mouse gene, from which I retrieve the human
>>> orthologue, and then I query the phenotypes related to this orthologue.
>>> The script below runs but prints "no PHE" for the human LEP gene,
>>> whereas the website shows phenotypes linked to the gene.
>>> Output:
>>> human Orthologue: LEP ENSG00000174697 7 128241284
>>> 128257628 1
>>> gene is Lep scalar 0
>>> no PHE
>>>
>>> Any advice would be great. Also, is it better to find phenotypes
>>> relating to the variations of a gene rather than to the gene itself in
>>> order to get the full picture of the phenotypes related to a specific gene?
>>> Thanks,
>>> Nathalie
>>>
>>> script:
>>> #!/usr/local/bin/perl
>>> use strict;
>>> use warnings;
>>> use DBI;
>>> use Bio::EnsEMBL::Registry;
>>>
>>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>>     -host       => 'ensembldb.ensembl.org',
>>>     -user       => 'anonymous',
>>>     -port       => '3306',
>>>     -db_version => 83,
>>> );
>>> # will help with connection issues
>>> Bio::EnsEMBL::Registry->set_reconnect_when_lost(1);
>>>
>>> my $gene_member_adaptor =
>>>     Bio::EnsEMBL::Registry->get_adaptor("Multi", "compara", "GeneMember");
>>> my $homology_adaptor =
>>>     Bio::EnsEMBL::Registry->get_adaptor("Multi", "compara", "Homology");
>>> my $pfh_adaptor =
>>>     Bio::EnsEMBL::Registry->get_adaptor('human', 'variation', 'phenotypefeature');
>>>
>>> # get the ensembl object corresponding to this mouse stable ID
>>> my $gene_member =
>>>     $gene_member_adaptor->fetch_by_stable_id('ENSMUSG00000059201');
>>>
>>> if ($gene_member) {
>>>
>>>     # get orthologues in human
>>>     my $all_homologies = $homology_adaptor->fetch_all_by_Member(
>>>         $gene_member,
>>>         -TARGET_SPECIES   => 'homo_sapiens',
>>>         -METHOD_LINK_TYPE => 'ENSEMBL_ORTHOLOGUES',
>>>     );
>>>     if (!scalar(@{$all_homologies})) {
>>>         # print a message if there is no corresponding mouse orthologue
>>>         print "no orthologue with ensembl orthology method\n";    ##1
>>>     }
>>>     else {
>>>         # if there is/are orthologue(s), find them and print annotation about these
>>>         foreach my $this_homology (@{$all_homologies}) {
>>>             my $homologue_genes = $this_homology->gene_list();
>>>             my $homgene;
>>>             foreach $homgene (@{$homologue_genes}) {
>>>                 # get human orthologues only
>>>                 if (defined($homgene->genome_db->name)
>>>                     && $homgene->genome_db->name eq "homo_sapiens") {
>>>                     print join("\t",
>>>                         "human Orthologue: " . $homgene->display_label,
>>>                         $homgene->stable_id,
>>>                         $homgene->dnafrag()->name(),
>>>                         $homgene->dnafrag_start,
>>>                         $homgene->dnafrag_end,
>>>                         $homgene->dnafrag_strand,
>>>                         "\n");
>>>                 }
>>>             }
>>>         }
>>>     }
>>>
>>>     my $gene = $gene_member->get_Gene();
>>>
>>>     if ($gene) {
>>>         print 'gene is ' . $gene->external_name();
>>>         my $pfs = $pfh_adaptor->fetch_all_by_Gene($gene);
>>>         print "\t", 'scalar ' . scalar(@$pfs), "\n";
>>>         if (scalar(@$pfs) == 0) { print "no PHE\n"; }
>>>         else {
>>>             foreach my $pm (@{$pfs}) {
>>>                 if ($pm) {
>>>                     # fill with info about ensembl gene phenotype
>>>                     print "\t", $pm->source_name, "\t", $pm->phenotype->description, "\n";
>>>                 }
>>>                 else { next; }
>>>             }
>>>         }
>>>     }
>>>     else { next; }
>>> }
>>> else { print 'no gene member', "\n" }
>>>
>>> __END__
>>>
>>> _______________________________________________
>>> Dev mailing list Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev [2]
>>> Ensembl Blog: http://www.ensembl.info/ [3]
>>
>>
>>
>> Links:
>> ------
>> [1] http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000174697;r=7:128241284-128257628;t=ENST00000308868
>> [2] http://lists.ensembl.org/mailman/listinfo/dev
>> [3] http://www.ensembl.info/