[ensembl-dev] discrepancies in number of alignment blocks between website and files, 25 eutherian mammals EPO

Christian Groß - EWI C.Gross at tudelft.nl
Mon Feb 12 14:55:49 GMT 2018


Dear Dev Team,

I am writing to you about a discrepancy between the number of alignment blocks stated on your website and the number of alignment blocks I find in the associated files.

On your website you list the details for the 25 eutherian mammals EPO alignment (https://www.ensembl.org/info/genome/compara/mlss.html?mlss=1102) and state that there is a total of 329,294 blocks.

After downloading the entire alignment (838 files) in .maf format from ftp://ftp.ensembl.org/pub/release-91/maf/ensembl-compara/multiple_alignments/epo_25_eutherian/ I unzipped the files and used awk to count the alignment blocks, that is, the lines whose first field is "a":

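# Count the lines whose first field is "a" (one per alignment block) in each file.
for file in 25_eutherian_mammals_EPO.* ; do awk '$1=="a"{count++}END{print count+0}' "$file" ; done > alignment_block_counts.txt
# Sum the per-file counts.
awk '{sum+=$1}END{print sum+0}' alignment_block_counts.txt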

The total comes to 165,214 alignment blocks, which is roughly half of the 329,294 stated on the website. Could you clarify what the website figure includes?
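
The counting step described above can also be sketched as a single pipeline that needs no intermediate file and reads compressed files directly. This is only a minimal sketch: `count_maf_blocks` is a hypothetical helper name, and it assumes that every alignment block in a file begins with a line whose first whitespace-separated field is "a", and that compressed files carry a `.gz` suffix.

```shell
# Hypothetical helper: print the total number of MAF alignment blocks
# found in the files given as arguments.
# Assumption: files ending in .gz are gzip-compressed, all others are plain text.
count_maf_blocks() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -cd -- "$f" ;;   # decompress to stdout, keep the file
            *)    cat -- "$f" ;;        # plain .maf file
        esac
    done | awk '$1 == "a" { n++ } END { print n + 0 }'   # n + 0 prints 0 if no blocks
}

# Example call, using the file prefix from the message above:
# count_maf_blocks 25_eutherian_mammals_EPO.*
```

Streaming all files into one awk process gives a single total, and `n + 0` makes sure an input without any "a" lines is reported as 0 rather than as an empty line.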

Best regards,

Christian Gross