[ensembl-dev] VEP75 input

Emily Perry emily at ebi.ac.uk
Mon Mar 2 15:50:14 GMT 2015


Hi Eva

You'll want to put it in as:
8 133984815 133984814 -/TT +

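For insertions, the VEP needs the two bases the insertion falls between, 
with the higher coordinate first, so that start = end + 1. If the gap 
between the two coordinates is larger than one base, it doesn't know 
where the insertion is. The allele string should contain only the 
inserted bases, with "-" as the reference allele.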
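
As a rough sketch of that rule, the conversion could look something like 
the Perl below. The to_ensembl helper is a hypothetical name, and it 
assumes that when ref and alt differ in length they share their first 
base (as in G>GTT); that base is dropped and the position moved on by 
one. Entries that don't follow that convention would need checking by 
hand.

```perl
use strict;
use warnings;

# Hypothetical helper: build one line of Ensembl default-format VEP input
# from a chromosome, a 1-based position, and ref/alt alleles.
# Assumption: indels are written with a shared leading base (G>GTT, GTT>G).
sub to_ensembl {
    my ($chr, $pos, $ref, $alt) = @_;

    # For indels, drop the shared leading base and shift the start by one
    # so that only the inserted or deleted sequence remains.
    if (length($ref) != length($alt)
        && substr($ref, 0, 1) eq substr($alt, 0, 1)) {
        $ref = substr($ref, 1);
        $alt = substr($alt, 1);
        $pos++;
    }

    # The end covers the remaining ref bases. For an insertion the ref is
    # now empty, which gives end = start - 1 (i.e. start = end + 1).
    my $end = $pos + length($ref) - 1;

    # An empty allele is written as "-".
    $ref = '-' if $ref eq '';
    $alt = '-' if $alt eq '';

    return "$chr $pos $end $ref/$alt +";
}

print to_ensembl(8, 133984814, 'G', 'GTT'), "\n";
# prints: 8 133984815 133984814 -/TT +
```

With the same assumption, a deletion such as GTT>G at position 100 would 
come out as "8 101 102 TT/- +", i.e. the coordinates of the deleted bases.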

All the best

Emily

On 02/03/2015 15:23, Eva Goncalves Serra wrote:
> Hi,
>
> I am trying to use VEP 75 (with cache) and had to reformat my input 
> (which was not in VCF or another compatible format) to the Ensembl 
> input format. I thought I had done this successfully, but I get an 
> error for a specific insertion:
>
> Original file entry:
> 8:133984814c.6056-29G>GTT
>
> Formatted to ensembl format:
> 8 133984816 133984814 G/GTT +
>
> Error I get:
> WARNING: start > end+1 : (START=133984816, END=133984814) on line 19.
>
> My code to reformat the input was this:
>
>       my @split = split(/\t/); # splitting file by tabs
>       my @al = split(':',$split[1]); # getting the chr:pos
>       my @al2 = split('>',$split[3]); # getting the ref>alt
>
>       if ((length $al2[0]==1) && (length $al2[1]==1)) {
>           print "$al[0] $al[1] $al[1] $al2[0]/$al2[1] +\n";
>       } elsif (length $al2[1] > length $al2[0]) {
>           my $sub = (length $al2[1])-(length $al2[0]);
>           my $new = $al[1]+$sub;
>           print "$al[0] $new $al[1] $al2[0]/$al2[1] +\n";
>       } elsif (length $al2[1] < length $al2[0]) {
>           my $sub2 = (length $al2[0])-(length $al2[1]);
>           my $new2 = $al[1]-$sub2;
>           print "$al[0] $new2 $al[1] $al2[0]/$al2[1] +\n";
>       }
>
> Am I missing something?
>
> Thanks a lot!
>
> Eva

-- 
Dr Emily Perry (Pritchard)
Ensembl Outreach Project Leader

European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge
CB10 1SD
UK
