[ensembl-dev] discrepancies in number of alignmentblocks between website and files, 25 eutherian mammals EPO
Christian Groß - EWI
C.Gross at tudelft.nl
Mon Feb 12 14:55:49 GMT 2018
Dear Dev Team,
I am writing about a discrepancy between the number of alignment blocks stated on your website and the number of alignment blocks I find in the associated files.
On your website you list the details for the 25 eutherian mammals EPO alignment (https://www.ensembl.org/info/genome/compara/mlss.html?mlss=1102) and state that there is a total of 329,294 blocks.
I downloaded the entire alignment (838 files) in .maf format from ftp://ftp.ensembl.org/pub/release-91/maf/ensembl-compara/multiple_alignments/epo_25_eutherian/, unzipped the files, and used awk to count the alignment blocks, that is, the lines whose first field is "a":
for file in 25_eutherian_mammals_EPO.* ; do awk '$1=="a"{count++}END{print count+0}' "$file" >> alignment_block_counts.txt ; done
awk '{sum+=$1}END{print sum}' alignment_block_counts.txt
The total comes to 165,214 alignment blocks, which is roughly half of the number stated on the website. I am therefore a bit confused: what does the website number consist of?
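In case it helps to reproduce the count, the same tally can be sketched as a single awk pass that prints a per-file count and a grand total without the intermediate text file. The function name count_maf_blocks is only illustrative, and the sketch assumes, as above, that every alignment block starts with a line whose first whitespace-separated field is "a".

```shell
# Count MAF alignment blocks (lines whose first field is "a") in the given files.
# Prints one "file<TAB>count" line per non-empty file, then "total<TAB>count".
# count_maf_blocks is a hypothetical helper name, not part of any Ensembl tool.
count_maf_blocks() {
    awk '
        # First line of every file after the first: report the previous file.
        FNR == 1 && NR > 1 { printf "%s\t%d\n", prev, n; n = 0 }
        { prev = FILENAME }
        # A block header is a line whose first field is exactly "a".
        $1 == "a" { n++; total++ }
        END {
            if (NR > 0) printf "%s\t%d\n", prev, n
            printf "total\t%d\n", total
        }
    ' "$@"
}

# Example call, using the same glob as in the loop above:
#   count_maf_blocks 25_eutherian_mammals_EPO.*
```

The per-file lines make it easy to spot files that contribute no blocks at all, which the summed text file would hide.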
Best regards,
Christian Gross