[ensembl-dev] Issues with phenotypeFeature

Sarah Hunt seh at ebi.ac.uk
Fri Feb 26 17:23:17 GMT 2016


Hi Nathalie,

There are duplicates because all variant associations are reported. If 
you look at the table given by the 'Show' option on 'ALL variants with a 
phenotype annotation' on the web page you will see multiple variants 
sometimes have reported associations to the same trait. In one case 
multiple associations are reported at different levels of significance 
between a variant and a trait.

Full HGMD data is only available to registered users, so Ensembl can 
only report the presence of a variant location in the HGMD database. 
ClinVar accepts submissions with different levels of information, so it 
is sometimes only reported that a variant is pathogenic but no phenotype 
is available.

If you are not interested in significance levels and undescribed 
phenotypes, they should be simple to filter out.
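
As a rough illustration of that filtering (a sketch, not part of the 
Ensembl API): the filter_phenotypes helper and the placeholder patterns 
below are assumptions, based only on the descriptions shown in the 
output in this thread. It works on plain [source, description] pairs, 
which could be built from the source_name and phenotype->description 
calls used in the script. Collapsing repeated pairs also removes rows 
that differ only in significance level.

```perl
use strict;
use warnings;

# Placeholder descriptions as they appear in the output quoted in this
# thread (assumed patterns; adjust them if other sources use other wording).
my @placeholder_patterns = (
    qr/phenotype not specified/i,
    qr/^Annotated by HGMD but no phenotype/i,
);

# Hypothetical helper: takes an array ref of [source, description] pairs and
# returns an array ref with placeholder descriptions dropped and repeated
# source/description pairs collapsed to their first occurrence.
sub filter_phenotypes {
    my ($records) = @_;
    my (%seen, @kept);
    foreach my $rec (@{$records}) {
        my ($source, $description) = @{$rec};
        next unless defined $source && defined $description;
        # Skip entries that carry no real phenotype description
        next if grep { $description =~ $_ } @placeholder_patterns;
        # Case-insensitive key so repeated associations collapse
        my $key = lc("$source\t$description");
        next if $seen{$key}++;
        push @kept, [$source, $description];
    }
    return \@kept;
}

# With the API, the pairs could be built from the phenotype features, e.g.
#   my @records = map { [$_->source_name, $_->phenotype->description] } @{$pfsvar};
my @records = (
    ['ClinVar',     'ClinVar: phenotype not specified'],
    ['HGMD-PUBLIC', 'Annotated by HGMD but no phenotype description is publicly available'],
    ['dbGaP',       'Body Height'],
    ['dbGaP',       'Body Height'],
    ['dbGaP',       'Blood pressure'],
    ['NHGRI-EBI GWAS catalog', 'Type 2 diabetes'],
);

my $filtered = filter_phenotypes(\@records);
print join("\t", 'Assoc', @{$_}), "\n" foreach @{$filtered};
```

For the sample rows above this keeps Body Height, Blood pressure and 
Type 2 diabetes once each and drops the ClinVar and HGMD placeholders.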

Best wishes,

Sarah

On 26/02/2016 16:42, nconte wrote:
> Hi Sarah,
> Another question: when the script shows phenotypes there are some 
> duplicates (see below). Why is this?
> And why do some have no phenotype specified (ClinVar) or not 
> available (as with HGMD)? And how can I remove these, as on the website?
> http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000115963;r=2:150468195-150539011 
>
>
> many thanks
>
>
>  my $gene = $ga->fetch_by_stable_id('ENSG00000115963');
> this is an example with
> human Orthologue: RND3    ENSG00000115963    2    150468195 
> 150539011    -1
> gene is RND3ENSG00000115963
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    ClinVar    ClinVar: phenotype not specified
> Assoc    dbGaP    Blood pressure
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    Body Height
> Assoc    dbGaP    BODY MASS INDEX
> Assoc    dbGaP    BODY MASS INDEX
> Assoc    dbGaP    BODY MASS INDEX
> Assoc    dbGaP    BODY MASS INDEX
> Assoc    dbGaP    Calcium
> Assoc    dbGaP    Calcium
> Assoc    dbGaP    Cholesterol, HDL
> Assoc    dbGaP    Glucose
> Assoc    dbGaP    Respiration Disorders
> Assoc    dbGaP    Sleep
> Assoc    dbGaP    Stroke
> Assoc    NHGRI-EBI GWAS catalog    Endometriosis
> Assoc    NHGRI-EBI GWAS catalog    Type 2 diabetes
>
>
>
>
>>  Hi Nathalie,
>>
>>  You are extracting phenotype features for the input mouse gene, not
>> the human gene, which is why you are not seeing the human phenotype
>> feature. If you change the print statement to:
>>
>>  print 'gene is    '.$gene->external_name() ."  ". $gene->stable_id();
>>
>>  You will see the 'ENSMUS' prefix on the gene name:
>>
>>  human Orthologue: LEP    ENSG00000174697    7    128241284
>> 128257628    1
>>  gene is    Lep  ENSMUSG00000059201    scalar  0
>>  no PHE
>>
>>  We store associations as reported, so querying by both gene and
>> variant will return the most complete set of results
>>
>>  fetch_all_by_Gene takes a gene object and returns associations
>> reported to the gene
>>  fetch_all_by_associated_gene() takes a gene name and returns variant
>> associations in which it is mentioned
>>
>>  For example:
>>
>>  use Bio::EnsEMBL::Registry;
>>
>>  Bio::EnsEMBL::Registry->load_registry_from_db(
>>      -host=>  'ensembldb.ensembl.org', -user=>'anonymous',
>>      -port=>'3306', 'db_version' => 83,);
>>  Bio::EnsEMBL::Registry->set_reconnect_when_lost(1); # will help with connection issues
>>
>>  my $ga = Bio::EnsEMBL::Registry->get_adaptor("homo_sapiens", "core", 
>> "gene");
>>  my $pfh_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human',
>> 'variation', 'phenotypefeature');
>>
>>  my $gene = $ga->fetch_by_stable_id('ENSG00000174697');
>>
>>  if ($gene){
>>    print "gene is " .$gene->external_name() ."\n";
>>
>>    my $pfs = $pfh_adaptor->fetch_all_by_Gene($gene);
>>    foreach my $pm(@{$pfs}){
>>      print "Direct\t",$pm->source_name,"\t",$pm->phenotype->description, "\n";
>>    }
>>
>>    my $pfsvar =
>> $pfh_adaptor->fetch_all_by_associated_gene($gene->external_name());
>>    foreach my $pmv(@{$pfsvar}){
>>      print "Assoc\t",$pmv->source_name,"\t",$pmv->phenotype->description, "\n";
>>    }
>>  }
>>
>>  Outputs:
>>
>>  gene is LEP
>>  Direct    Orphanet    Obesity due to congenital leptin deficiency
>>  Direct    MIM morbid    LEPTIN DEFICIENCY OR DYSFUNCTION
>>  Assoc    ClinVar    ClinVar: phenotype not specified
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    HGMD-PUBLIC    Annotated by HGMD but no phenotype
>> description is publicly available
>>  Assoc    Uniprot    Leptin deficiency
>>  Assoc    OMIM    LEPTIN DYSFUNCTION
>>  Assoc    OMIM    Leptin deficiency
>>  Assoc    ClinVar    LEPTIN DYSFUNCTION
>>  Assoc    ClinVar    Obesity, severe, due to leptin deficiency
>>  Assoc    dbGaP    Blood pressure
>>  Assoc    dbGaP    Erythrocyte Count
>>  Assoc    NHGRI-EBI GWAS catalog    Type 2 diabetes
>>  Assoc    dbGaP    Amyotrophic lateral sclerosis
>>  Assoc    ClinVar    ClinVar: phenotype not specified
>>  Assoc    ClinVar    ClinVar: phenotype not specified
>>
>>  Best wishes,
>>
>>  Sarah
>>
>> On 26/02/2016 12:50, nconte wrote:
>>
>>> Hello
>>> I have an issue with the phenotypeFeature fetch_all_by_Gene($gene) 
>>> method. I am trying to retrieve phenotypes from a gene object, but my 
>>> script doesn't find any phenotype relating to this gene, 
>>> whereas the Ensembl website shows some phenotypes linked with the gene:
>>> http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000174697;r=7:128241284-128257628;t=ENST00000308868 
>>> [1]
>>>
>>> My starting point is a mouse gene, from which I retrieve the human 
>>> orthologue and then query the phenotypes related to this orthologue.
>>> The script below should work, but it prints "no PHE" for the human LEP 
>>> gene, whereas the website shows phenotypes linked to the gene.
>>> Output:
>>> human Orthologue: LEP    ENSG00000174697    7    128241284 
>>> 128257628    1
>>> gene is    Lep    scalar  0
>>> no PHE
>>>
>>> Any advice would be great. Also, is it better to find phenotypes 
>>> relating to the variants of a gene rather than to the gene itself in 
>>> order to get the full picture of phenotypes related to a specific gene?
>>> thanks
>>> Nathalie
>>>
>>> script:
>>> #!/usr/local/bin/perl
>>> use strict;
>>> use warnings;
>>> use DBI;
>>> use Bio::EnsEMBL::Registry;
>>>
>>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>>     -host=>  'ensembldb.ensembl.org', -user=>'anonymous',
>>>     -port=>'3306', 'db_version' => 83,);
>>> Bio::EnsEMBL::Registry->set_reconnect_when_lost(1); # will help with connection issues
>>>
>>> my $gene_member_adaptor= 
>>> Bio::EnsEMBL::Registry->get_adaptor("Multi", "compara", "GeneMember");
>>> my $homology_adaptor = Bio::EnsEMBL::Registry->get_adaptor("Multi", 
>>> "compara", "Homology");
>>> my $pfh_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human', 
>>> 'variation', 'phenotypefeature');
>>>
>>> my $gene_member = 
>>> $gene_member_adaptor->fetch_by_stable_id('ENSMUSG00000059201');
>>> #get the ensembl object corresponding to this mouse stable ID
>>>
>>>             if ($gene_member){
>>>
>>> #get orthologues in human
>>>             my $all_homologies = 
>>> $homology_adaptor->fetch_all_by_Member($gene_member, -TARGET_SPECIES 
>>> => 'homo_sapiens',-METHOD_LINK_TYPE => 'ENSEMBL_ORTHOLOGUES');
>>>                 if (!scalar(@{$all_homologies})){
>>>
>>>                 print   "no orthologue with ensembl orthology method\n";  ##1
>>>                    } # print a message if there is no corresponding human orthologue
>>>                     else{
>>>     #if there is/are orthologue(s), find them and print annotation about these
>>>                     foreach my $this_homology (@{$all_homologies}){
>>>                     my $homologue_genes = $this_homology->gene_list();
>>>                     my $homgene;
>>>                         foreach $homgene(@{$homologue_genes}){
>>>                         #get human orthologues only
>>>                             if (defined ($homgene->genome_db->name) 
>>> && $homgene->genome_db->name eq "homo_sapiens") {
>>>
>>>                             print  join("\t", "human Orthologue: ".$homgene->display_label,
>>>                                 $homgene->stable_id,$homgene->dnafrag()->name(),
>>>                                 $homgene->dnafrag_start,$homgene->dnafrag_end,$homgene->dnafrag_strand,
>>>                                 "\n" ) ;
>>>
>>>                             }
>>>
>>>                         }
>>>                     }
>>>                 }
>>>
>>>                     my $gene = $gene_member->get_Gene();
>>>
>>>                     if ($gene){
>>>                     print 'gene is '.$gene->external_name();
>>>                     my $pfs = $pfh_adaptor->fetch_all_by_Gene($gene);
>>>                     print "\t", 'scalar  '.scalar(@$pfs), "\n";
>>>                         if ( scalar(@$pfs)== 0 ){print "no PHE\n";}else{
>>>                         foreach my $pm(@{$pfs}){
>>>                             if ($pm){
>>>
>>>                             #fill with info about ensembl gene phenotype
>>>
>>>                             print "\t",$pm->source_name,"\t",$pm->phenotype->description, "\n";
>>>                             }else {next;}
>>>                         }
>>>                     }
>>>
>>>                     } else{next;}
>>>
>>>             }else {print 'no gene member',"\n"};
>>>
>>> __END__
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: 
>>> http://lists.ensembl.org/mailman/listinfo/dev [2]
>>> Ensembl Blog: http://www.ensembl.info/ [3]
>>
>>
>>
>> Links:
>> ------
>> [1]
>> http://www.ensembl.org/Homo_sapiens/Gene/Phenotype?db=core;g=ENSG00000174697;r=7:128241284-128257628;t=ENST00000308868 
>>
>> [2] http://lists.ensembl.org/mailman/listinfo/dev
>> [3] http://www.ensembl.info/
>>
