[ensembl-dev] question

Kathryn Beal kbeal at ebi.ac.uk
Fri Apr 1 07:55:43 BST 2011


Hi,
You need to download the Compara EMF files.

From the main download page:
http://www.ensembl.org/info/data/ftp/index.html

find "Multi-species" (near the bottom of the list) and then select "EMF". Select "ensembl-compara" and then
the alignment set you want. We have GERP scores for alignments containing human for the pecan_19_amniota and
epo_34_eutherian alignment sets. More information about the alignment sets can be found here:

http://www.ensembl.org/info/docs/compara/analyses.html

I hope that helps.

Cheers
Kathryn

> I need to obtain the conservation scores (GERP scores) and was able to
> follow the instructions on the FAQ page and download the EMF files for homo
> sapiens chr16 from the Ensembl ftp site.  However, I am having trouble
> understanding the data.  Here are the first 25 lines of the EMF file.  My
> questions are:
> 
> 1. What do the numbers on the first line mean?  I imagine that they have
> something to do with the position, but couldn't figure out how.
> 2. What is the difference between SCORE aligned Watson reads and Venter
> reads?  It seems that missing data gets assigned a score of 0, is this correct?
> There are instances where the Watson seq is the same as the Venter seq, yet they
> have different scores. What is the reason?
> 3. The scores: from the references that I've read, it seems that GERP scores have
> decimal points, yet the scores listed here are all integers.  How are these
> calculated?
> 
> Much thanks
> 
> SEQ human reference 16 60002 80290 1
> SEQ human Watson WGS
> SEQ human Venter WGS
> SCORE aligned Watson reads
> SCORE aligned Venter reads
> DATA
> A ~ A 0 1
> A ~ A 0 1
> C ~ C 0 1
> C ~ C 0 1
> C ~ C 0 1
> T ~ T 0 1
> A ~ A 0 1
> A ~ A 0 1
> C ~ C 0 1
> C ~ C 0 1
> C ~ C 0 1
> T ~ T 0 1
> A ~ A 0 1
> A ~ A 0 1
> C ~ C 0 1
> 
> 
> 
> 
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
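
For reading a file laid out like the sample quoted above, here is a minimal
Python sketch. It is not an official Ensembl parser, and the names EmfBlock and
parse_emf_block are made up for illustration. It assumes that each row after
DATA holds one whitespace-separated column per SEQ line followed by one column
per SCORE line, that the first SEQ line ends with chromosome, start, end and
strand, that each row advances the reference position by one base on the
forward strand, and that a line containing only "//" ends a block. The Compara
EMF files may be laid out differently, so check the README that ships with the
files before relying on this.

```python
from dataclasses import dataclass, field


@dataclass
class EmfBlock:
    """Hypothetical container for one parsed block."""
    seq_headers: list = field(default_factory=list)    # fields after "SEQ"
    score_headers: list = field(default_factory=list)  # fields after "SCORE"
    rows: list = field(default_factory=list)           # (position, bases, scores)


def parse_emf_block(lines):
    """Parse one block laid out like the quoted sample (assumed layout)."""
    block = EmfBlock()
    start = None      # start coordinate from the first SEQ line, if present
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue              # skip blank lines and comment/header lines
        if line == "//":
            break                 # assumed end-of-block marker
        if not in_data:
            if line.startswith("SEQ "):
                fields = line.split()[1:]
                block.seq_headers.append(fields)
                # Assumption: the first SEQ line ends with chr, start, end, strand.
                if len(block.seq_headers) == 1 and len(fields) >= 4:
                    try:
                        start = int(fields[-3])
                    except ValueError:
                        start = None
            elif line.startswith("SCORE "):
                block.score_headers.append(line.split()[1:])
            elif line == "DATA":
                in_data = True
            continue
        cols = line.split()
        n_seq = len(block.seq_headers)
        bases = cols[:n_seq]                        # one symbol per SEQ line
        scores = [float(x) for x in cols[n_seq:]]   # one value per SCORE line
        # Assumption: one reference base per row, forward strand.
        position = None if start is None else start + len(block.rows)
        block.rows.append((position, bases, scores))
    return block


if __name__ == "__main__":
    sample = [
        "SEQ human reference 16 60002 80290 1",
        "SEQ human Watson WGS",
        "SEQ human Venter WGS",
        "SCORE aligned Watson reads",
        "SCORE aligned Venter reads",
        "DATA",
        "A ~ A 0 1",
        "C ~ C 0 1",
    ]
    for position, bases, scores in parse_emf_block(sample).rows:
        print(position, bases, scores)
```

Under these assumptions the numbers on the first SEQ line are read as
chromosome 16, start 60002, end 80290, strand 1, and each data row is paired
with a position counted up from the start coordinate.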


-- 
Dr Kathryn Beal
EnsEMBL
EMBL-European Bioinformatics Institute       Tel. +44 (0)1223 494458
Wellcome Trust Genome Campus, Hinxton        Fax. +44 (0)1223 494468
Cambridge CB10 1SD, UK



