[ensembl-dev] Mismatch between database and website

mag mr6 at ebi.ac.uk
Mon Jul 17 15:32:52 BST 2017


Hi Mahmood,

The fetch_all_by_external_name method returns a list of all genes that 
have atxn3 as an associated external link, not only the gene whose 
display name is ATXN3.
For GRCh37, there are two genes which qualify, as can be seen on the 
search page:
http://grch37.ensembl.org/Homo_sapiens/Search/Results?q=atxn3;site=ensembl_all;page=1;facet_feature_type=Gene;facet_species=Human
If you check the second element of the list ($genes[1] in your code), 
you will get ENSG00000066427.

For ENSG00000259634, atxn3 is not the main display name, but it has a 
link to the corresponding NCBIgene entry for atxn3.
http://grch37.ensembl.org/Homo_sapiens/Gene/Matches?db=core;g=ENSG00000259634;r=14:92523341-92575863;t=ENST00000558190

If you are only interested in genes for which atxn3 is the chosen 
symbol, you can use the fetch_all_by_display_label method instead.
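
As a minimal, untested sketch of that call: the helper below takes a 
gene adaptor (assumed to be the same $gene_adaptor as in the quoted 
script) and a symbol, and prints one line per gene returned. The name 
report_by_display_label is made up for illustration; stable_id, 
external_name, seq_region_name, start and end are the usual accessors 
on gene objects in the Perl API.

```perl
use strict;
use warnings;

# Hypothetical helper: list every gene whose display label is $symbol.
# $gene_adaptor is assumed to be a Bio::EnsEMBL::DBSQL::GeneAdaptor.
sub report_by_display_label {
    my ($gene_adaptor, $symbol) = @_;

    # The method returns an array reference that may hold zero, one
    # or several genes, so every element is reported.
    my $genes = $gene_adaptor->fetch_all_by_display_label($symbol);

    my @lines;
    for my $gene (@{$genes}) {
        push @lines, sprintf(
            "%s\t%s\t%s:%d-%d",
            $gene->stable_id, $gene->external_name,
            $gene->seq_region_name, $gene->start, $gene->end,
        );
    }
    print "$_\n" for @lines;
    return @lines;
}

# Example call, given an existing adaptor:
#   report_by_display_label($gene_adaptor, 'ATXN3');
```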

However, please be aware that fetch_all_by_display_label still returns 
a list of genes, which could have more than one element.
For example, two genes can share the same name if one is on the 
reference while the other one is on a haplotype.
There are also cases where a name is misassigned to a gene, resulting in 
a duplication. This can happen when two genes are overlapping.

Because of this, I would recommend looping through the resulting list 
rather than assuming the first result is the one you want.
You can then check various gene attributes to make sure each gene is 
the one you expect.
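
Such a loop could look roughly like the sketch below. select_genes is a 
hypothetical helper, and the two checks it applies (display name equal 
to the requested symbol, gene located on a reference sequence rather 
than a haplotype) are only examples of attributes you might test. It 
relies on the external_name method of the gene and the is_reference 
method of its slice.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the genes that pass both checks.
# $genes is an array reference as returned by the fetch_all_by_* methods.
sub select_genes {
    my ($genes, $symbol) = @_;

    my @selected;
    for my $gene (@{$genes}) {
        # Skip genes that are only linked to the symbol but named otherwise.
        my $name = $gene->external_name;
        next unless defined $name && lc $name eq lc $symbol;

        # Skip copies located on a haplotype or other non-reference region.
        next unless $gene->slice->is_reference;

        push @selected, $gene;
    }
    return @selected;
}

# Example call, given an existing adaptor:
#   my @hits = select_genes(
#       $gene_adaptor->fetch_all_by_external_name('atxn3'), 'ATXN3');
#   print $_->stable_id, "\n" for @hits;
```

If more than one gene is left after filtering, it is probably safer to 
inspect them all than to pick one silently.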


Hope that helps,
Magali


On 15/07/2017 12:55, Mahmood Naderan wrote:
> I have an update that may shed some light, but I cannot figure it out.
> With the command in my previous email, I see that the stable ID is 
> ENSG00000259634. When I enter this ID on the website, I see
>
> Gene: RP11-529H20.5 (ENSG00000259634). Location: Chromosome 14: 
> 92,524,896-92,525,877, reverse strand.
>
> As you can see, the start and end numbers match those in my previous 
> email, and its name is not ATXN3, which I requested in the command. So 
> the question is why fetch_all_by_external_name("atxn3") returns that gene.
>
> In reply to my previous questions, Emily pointed out that the function 
> may return LRGs. That is hard for me to understand since I am not an 
> expert in that field. I want to get the main gene and nothing else.
>
> Regards,
> Mahmood
>
>
>
> On Sat, Jul 15, 2017 at 2:15 PM, Mahmood Naderan <mahmood.nt at gmail.com> wrote:
>
>     Hi,
>     With this code
>
>       my @genes = @{ $gene_adaptor->fetch_all_by_external_name("atxn3") };
>       my $gene  = $genes[0];
>       my $start = $gene->start();
>       my $end   = $gene->end();
>
>     I see that
>       start=92524896
>       end=92525877
>
>     However, from the website, I see
>       Chromosome 14: 92,524,896-92,572,965
>
>     As you can see, the end numbers are different.
>     http://grch37.ensembl.org/Homo_sapiens/Gene/Sequence?db=core;g=ENSG00000066427;r=14:92524896-92572965
>
>
>     Is there any reason for that?
>
>     Regards,
>     Mahmood
>
>
>
>
>
