[ensembl-dev] Uniprot release / CRC64

Magali Ruffier mr6 at sanger.ac.uk
Tue Jul 5 12:58:56 BST 2011

Hi Sébastien,

You will not necessarily find it for every species, but in more recent 
releases we have tried to record which UniProt and RefSeq releases were 
used for each source of evidence.
If you query the databases, you can find this information in the 
analysis table, in the row for your analysis of interest.
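
As a rough sketch of such a query (not an official recipe): the snippet 
below assumes the core schema's analysis table with its logic_name, db 
and db_version columns, and assumes that db is left NULL when no source 
database was recorded. The function name evidence_releases is made up 
for illustration, and conn can be any Python DB-API connection opened 
on the core database you are interested in.

```python
# Minimal sketch: list the source database and release recorded
# for each analysis in an Ensembl core database.
# Assumption: `db` is NULL for analyses with no recorded source database.
QUERY = """
SELECT logic_name, db, db_version
FROM analysis
WHERE db IS NOT NULL
ORDER BY logic_name
"""


def evidence_releases(conn):
    """Return (logic_name, db, db_version) rows from an open DB-API connection.

    `conn` is supplied by the caller; which driver and which server
    it points at are deliberately left out of this sketch.
    """
    cur = conn.cursor()
    cur.execute(QUERY)
    rows = cur.fetchall()
    cur.close()
    return rows
```

Each returned row pairs an analysis with the database name and version 
string stored for it. As noted above, the version may be missing for 
older releases or for some species.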

We do not store a checksum for protein entries, because the way we 
retrieve the data varies and a checksum is not always available.
For more recent builds, though, you can find more detailed genebuild 
documentation, which tells you, for each source of evidence used, how 
many sequences were retrieved to start with, how many aligned, and how 
many were actually used in the final gene set.

Hope that helps.


Sébastien Moretti wrote:
> Hi
> do you store somewhere Uniprot release, and other db releases, you 
> used for each ensembl release ?
> Also, do you store the checksum (CRC64) for protein entries you used 
> to build an ensembl release ?
> Regards
