[ensembl-dev] VCF order chrX, chrM
Will McLaren
wm2 at ebi.ac.uk
Wed Apr 30 10:42:34 BST 2014
This was easier to fix than I thought it would be. I've pushed a fix to the
ensembl-variation GitHub repo; it's available on the release/75 branch.
Will
On 30 April 2014 10:32, mag <mr6 at ebi.ac.uk> wrote:
> Hi Will,
>
> Chromosomes in Ensembl have a 'karyotype_rank' attribute that gives the
> expected chromosome ordering (1-22, X, Y, MT)
>
> I don't know how applicable it is to VEP, but it might be something to
> bear in mind.
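
To illustrate what ordering by karyotype rank means in practice, here is a minimal Python sketch. It does not query Ensembl: the rank table is hard-coded for human (1-22, X, Y, MT) as a stand-in for the 'karyotype_rank' attribute values, and the names KARYOTYPE_RANK and karyotype_key are made up for this example. It also assumes UCSC-style names, where "chrM" corresponds to "MT".

```python
# Hypothetical stand-in for Ensembl's per-chromosome karyotype_rank values (human only).
KARYOTYPE_ORDER = [str(n) for n in range(1, 23)] + ["X", "Y", "MT"]
KARYOTYPE_RANK = {name: rank for rank, name in enumerate(KARYOTYPE_ORDER, start=1)}


def karyotype_key(chrom):
    """Sort key: known chromosomes by rank, anything else afterwards by name."""
    # Strip an optional UCSC-style "chr" prefix.
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    # Assumption: "M" (UCSC) and "MT" (Ensembl) both mean the mitochondrial genome.
    if name == "M":
        name = "MT"
    return (KARYOTYPE_RANK.get(name, len(KARYOTYPE_RANK) + 1), name)


chroms = ["chrM", "chrX", "chr10", "chr2", "chrY", "chr1"]
print(sorted(chroms, key=karyotype_key))
# ['chr1', 'chr2', 'chr10', 'chrX', 'chrY', 'chrM']
```

Unlike a plain alphanumeric sort, this puts the mitochondrial genome after X and Y.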
>
>
> Cheers,
> mag
>
>
> On 30/04/2014 09:18, Will McLaren wrote:
>
> Hi Guillermo,
>
> Currently the VEP internally sorts each buffer of 5000 variants that it
> reads in before writing the output. The sort is done alphanumerically, so
> it will order e.g. 1-22,M,X,Y.
>
> It looks like the buffer partially overlaps your input groups, such
> that, in your example, the first buffer read would be
>
> chrX variant1
> chrX variant2
>
> These are parsed, sorted and written out. Then the buffer reads in the
> next batch:
>
> chrX variant3
> chrX variant4
> chrM variant1
> chrM variant2
>
> which then get sorted to
>
> chrM variant1
> chrM variant2
> chrX variant3
> chrX variant4
>
> since M is before X alphabetically. So, I'm afraid this explains but
> doesn't fix your problem! You could ensure that your chrM variants appear
> before your chrX and chrY variants in the file, and this problem shouldn't
> appear.
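
The effect described above can be reproduced with a toy model. This is only a sketch in Python, not the VEP's actual code: the function name buffered_sort is invented, the buffer holds 4 records instead of 5000, and a plain tuple sort stands in for the VEP's alphanumeric sort (it is enough to show that M sorts before X). Two chr22 records are added so that the buffer boundary falls in the middle of the chrX block.

```python
from itertools import islice


def buffered_sort(records, buffer_size):
    """Read records in fixed-size chunks, sort each chunk on its own, and emit it."""
    it = iter(records)
    out = []
    while True:
        chunk = list(islice(it, buffer_size))
        if not chunk:
            break
        # Each chunk is sorted independently, so order is only local to a chunk.
        out.extend(sorted(chunk))
    return out


variants = [
    ("chr22", 1), ("chr22", 2), ("chrX", 1), ("chrX", 2),  # first buffer
    ("chrX", 3), ("chrX", 4), ("chrM", 1), ("chrM", 2),    # second buffer
]

for chrom, n in buffered_sort(variants, buffer_size=4):
    print(chrom, f"variant{n}")
```

With a buffer of 4, the second chunk is emitted as chrM 1, chrM 2, chrX 3, chrX 4, so the chrM records end up between chrX variant2 and chrX variant3, as in the report. If the chrM records come before chrX in the input, sorting each chunk leaves the order unchanged.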
>
> For the next VEP release I'll look into retaining the input sorting when
> using VCF as the output format as I think this would be preferable for most
> users.
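
One way to retain the input chromosome order while still sorting inside each buffer is to rank chromosomes by the order in which they are first seen. The Python sketch below shows the idea only; it is an assumption about a possible approach, not a description of how the VEP implements this, and the names sort_keeping_input_chrom_order and chrom_rank are made up for the example.

```python
def sort_keeping_input_chrom_order(chunk, chrom_rank):
    """Sort one buffer by (first-seen chromosome rank, position).

    chrom_rank is a dict shared across buffers; it is updated in place so that
    a chromosome keeps the rank it received when it first appeared.
    """
    for chrom, _pos in chunk:
        # A chromosome gets the next free rank only the first time it is seen.
        chrom_rank.setdefault(chrom, len(chrom_rank))
    return sorted(chunk, key=lambda rec: (chrom_rank[rec[0]], rec[1]))


rank = {}
first = sort_keeping_input_chrom_order([("chrX", 2), ("chrX", 1)], rank)
second = sort_keeping_input_chrom_order(
    [("chrX", 4), ("chrX", 3), ("chrM", 2), ("chrM", 1)], rank
)
print(first + second)
```

Because chrX is seen before chrM, it keeps the lower rank, so the second buffer comes out as chrX 3, chrX 4, chrM 1, chrM 2 and the chrX block stays contiguous.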
>
> Regards
>
> Will McLaren
> Ensembl Variation
>
>
> On 30 April 2014 07:47, Guillermo Marco Puche <
> guillermo.marco at sistemasgenomicos.com> wrote:
>
>> Dear developers,
>>
>> I'm experiencing strange behavior when annotating a fully sorted VCF
>> file.
>> My chr order is the following: chr1 to chr22, chrX, chrY, chrM.
>>
>> I've noticed that when I have variants in chrX followed by variants in
>> chrM, the VEP script annotates the full VCF file but changes the order of
>> some of the lines. See the example below:
>>
>> Imagine I have the following variants in my VCF:
>>
>> chrX variant1
>> chrX variant2
>> chrX variant3
>> chrX variant4
>> chrM variant1
>> chrM variant2
>>
>> After annotating the VCF, the order ends up like this:
>>
>> chrX variant1
>> chrX variant2
>> chrM variant1
>> chrM variant2
>> chrX variant3
>> chrX variant4
>>
>> This is just an illustrative example. I would like to fix this, because
>> it's a bit problematic to end up with an unsorted annotated VCF file. I've
>> not experienced this issue with other chrX and chrM. I have already tried
>> to debug this by disabling all the plugins, and the issue still reproduces.
>>
>> Thank you very much.
>>
>> Best regards,
>> Guillermo.
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>
>