[ensembl-dev] ensembl mirror search behavior isn't same as official page ?

Stephen Trevanion st3 at sanger.ac.uk
Fri Apr 20 15:47:35 BST 2012


Hi,

You are seeing this difference because your mirror is using 'Unisearch', 
which works by querying the databases directly using SQL, whereas the 
public Ensembl sites use a customised Lucene search. The advantages of 
the latter are speed, the ability to do multispecies searches, the 
option to index far more fields than we could with SQL, and the ability 
to search other parts of Ensembl, for example the documentation pages. 
At present it would be far from trivial for you to set up your own 
Lucene search engine, so I would recommend that, if possible, you 
continue with Unisearch. However, if you find that there are extra 
features that are not dealt with by our code, then the place to look is 
modules/EnsEMBL/Web/Factory/Search.pm.
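
To illustrate the general point about direct SQL searching, here is a 
minimal sketch in Python with an in-memory SQLite database. It is not the 
Unisearch code, and the two tables are a made-up toy schema rather than 
the real Ensembl core schema; the search() helper, its flag and the NULL 
gene description are assumptions for illustration only. It shows that a 
query only finds a term in the columns it names, and that a LIMIT caps 
the number of hits.

```python
import sqlite3

# Toy schema for illustration only; NOT the real Ensembl core schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (stable_id TEXT, description TEXT);
CREATE TABLE xref (stable_id TEXT, display_label TEXT, description TEXT);
""")
conn.executemany("INSERT INTO gene VALUES (?, ?)", [
    ("ENSG00000139618", "breast cancer 2, early onset"),
    ("ENSG00000107949", None),  # hypothetical: term absent at gene level
])
conn.executemany("INSERT INTO xref VALUES (?, ?, ?)", [
    ("ENSG00000139618", "BRCA2", "breast cancer 2, early onset"),
    ("ENSG00000107949", "BCCIP", "BRCA2 and CDKN1A interacting protein"),
])


def search(term, include_xref_description, limit=10):
    """Return gene IDs whose searched columns contain the term."""
    pattern = f"%{term}%"
    sql = (
        "SELECT DISTINCT g.stable_id FROM gene g "
        "JOIN xref x ON x.stable_id = g.stable_id "
        "WHERE (x.display_label LIKE ? OR g.description LIKE ?"
    )
    params = [pattern, pattern]
    if include_xref_description:
        # Widen the search to one more column.
        sql += " OR x.description LIKE ?"
        params.append(pattern)
    sql += ") ORDER BY g.stable_id LIMIT ?"
    params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]


narrow = search("BRCA2", include_xref_description=False)
wide = search("BRCA2", include_xref_description=True)
print("without xref description:", narrow)
print("with xref description:   ", wide)
```

In this toy data the second gene is only found once the xref description 
column is added to the WHERE clause.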

With regard to your specific question about BRCA2 - whereas Lucene 
returns just four Ensembl genes, Unisearch is also retrieving three Vega 
genes (excluding Vega genes from search is a deliberate decision on our 
part). ENSG00000107949 is not being returned by Unisearch since the 
query term is in the description of an xref, which is not a field that 
Unisearch looks at.
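
The difference can be checked mechanically from the two result listings 
quoted below. This is just a small Python sketch; the gene IDs are copied 
from your message (the transcript hit is left out), and the variable 
names are my own.

```python
# Gene IDs copied from the two result listings in the quoted message.
mirror_genes = {
    "ENSG00000139618",
    "ENSG00000170037",
    "ENSG00000083093",
    "OTTHUMG00000017411",
    "OTTHUMG00000163184",
    "OTTHUMG00000019237",
}
official_genes = {
    "ENSG00000139618",
    "ENSG00000083093",
    "ENSG00000107949",
    "ENSG00000170037",
}

# Set differences show which hits each search returns that the other does not.
only_mirror = sorted(mirror_genes - official_genes)
only_official = sorted(official_genes - mirror_genes)
shared = sorted(mirror_genes & official_genes)

print("only on the mirror:       ", only_mirror)
print("only on the official site:", only_official)
print("on both:                  ", shared)
```

The mirror-only hits are the three OTTHUMG identifiers, i.e. the Vega 
genes, and the only official-only hit is ENSG00000107949.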

Hope this helps,

Regards,

Steve

On 04/19/12 08:07, ?? wrote:
> Hi All
>
> My ensembl home page search does not work the same way as the official 
> ensembl home page. Can you make them identical ?
>
> 1. I must specify both a species name and a gene name to perform a 
> search, but the official page requires only a gene name.
> 2. If I search for the human BRCA2 gene, the results from the two pages 
> are different.
>
> mirror result:
>
> Your search for BRCA2 returned 7 hits.
> Please note that because this site uses a direct MySQL search, we 
> limit the search to 10 results per category and search term, in order 
> to avoid overloading the database server.
> Gene or Gene Product
>
> 7 entries matched your search strings.
> Gene: ENSG00000139618 [Region in detail]
> BRCA2 - breast cancer 2, early onset [Source:HGNC Symbol;Acc:1101]
> Gene: ENSG00000170037 [Region in detail]
> OTTHUMG00000172932 - centrobin, centrosomal BRCA2 interacting protein 
> [Source:HGNC Symbol;Acc:29616]
> Gene: ENSG00000083093 [Region in detail]
> OTTHUMG00000177097 - partner and localizer of BRCA2 [Source:HGNC 
> Symbol;Acc:26144]
> Transcript: ENST00000380152 [Region in detail]
> Gene: OTTHUMG00000017411 [Region in detail]
> BRCA2 - breast cancer 2, early onset
> Gene: OTTHUMG00000163184 [Region in detail]
> OTTHUMG00000163184 - BRCA1/BRCA2-containing complex, subunit 3 (BRCC3) 
> pseudogene
> Gene: OTTHUMG00000019237 [Region in detail]
> OTTHUMG00000019237 - BRCA2 and CDKN1A interacting protein
>
> official result:
>
> 4 Genes match your query ('brca2') in Human
>
> BRCA2 [ Ensembl/Havana merge: ENSG00000139618 ]
> Description
> breast cancer 2, early onset [Source:HGNC Symbol;Acc:1101] [Type: 
> protein coding Ensembl/Havana merge]
> Location
> 13:32889611-32973805:1
> Variations
> ENSG00000139618
> Source
> e66
> PALB2 [ Ensembl/Havana merge: ENSG00000083093 ]
> Description
> partner and localizer of BRCA2 [Source:HGNC Symbol;Acc:26144] [Type: 
> protein coding Ensembl/Havana merge]
> Location
> 16:23614488-23652631:-1
> Variations
> ENSG00000083093
> Source
> e66
> BCCIP [ Ensembl/Havana merge: ENSG00000107949 ]
> Description
> BRCA2 and CDKN1A interacting protein [Source:HGNC Symbol;Acc:978] 
> [Type: protein coding Ensembl/Havana merge]
> Location
> 10:127512115-127542264:1
> Variations
> ENSG00000107949
> Source
> e66
> CNTROB [ Ensembl/Havana merge: ENSG00000170037 ]
> Description
> centrobin, centrosomal BRCA2 interacting protein [Source:HGNC 
> Symbol;Acc:29616] [Type: protein coding Ensembl/Havana merge]
> Location
> 17:7835473-7852896:1
> Variations
> ENSG00000170037
> Source
> e66
>
> Cheers
> -- 
> Gang Chen
> TILSI
> Taicang Institute For Life Science Information
> Address: A2/162, Renmin South Road, Taicang, 215400, Jiangsu Province, 
> P.R.China
> Phone:(+86)512-82782588
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>   



