[ensembl-dev] File format for Ensembl API

Will McLaren wm2 at ebi.ac.uk
Mon Oct 31 15:10:56 GMT 2011


Hi Salih,

I assume you are referring to the VEP, rather than the API?

This is a little tricky to decipher, as it is not clear how the
alleles are separated - I'm assuming each line can only represent a
single-base mutation, and the alleles are listed without a separator?
If you are using some software that produces this output, you might
like to check that it can't produce something better (VCF or pileup)
instead.

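Anyway, assuming the above, you could use a perl one-liner something
like the following, which reads the list on standard input and skips the
header and any wrapped line that does not start with chr:position:

perl -ne '@data = split; next unless @data > 1 && $data[0] =~ /^\w+:\d+$/;
  ($chr, $pos) = split /:/, $data[0]; $alleles = join "/", split //, $data[1];
  print "$chr\t$pos\t$pos\t$alleles\t1\n"'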
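
If you would rather keep this as a small script than a one-liner, the same
conversion could be sketched in Python roughly as below. This is only a
sketch under the same assumptions as above: the first column is
chr:position, the second column holds single-base alleles with no separator,
and the first allele listed is the reference. The names to_vep_line and
convert are made up for illustration, and lines that do not start with
chr:position (the header, or the wrapped tail of a long row) are simply
dropped.

```python
import re

# Assumed layout: column 1 is "chr:position", column 2 is the alleles written
# without a separator (e.g. "GC" is taken to mean G/C).
DATA_KEY = re.compile(r"^(\w+):(\d+)$")


def to_vep_line(line):
    """Return one tab-separated line (chr, start, end, alleles, strand),
    or None if the line is a header, a wrapped continuation, or blank."""
    fields = line.split()
    if len(fields) < 2:
        return None
    match = DATA_KEY.match(fields[0])
    if match is None:
        return None
    chrom, pos = match.groups()
    # Joining a string with "/" puts a slash between each character: "GC" -> "G/C".
    alleles = "/".join(fields[1])
    # Single-base change, so start == end; strand is written as 1 (forward).
    return "\t".join([chrom, pos, pos, alleles, "1"])


def convert(lines):
    """Convert an iterable of input lines, silently dropping non-data lines."""
    converted = (to_vep_line(line) for line in lines)
    return [out for out in converted if out is not None]
```

For example, print("\n".join(convert(open("snplist.txt")))) would write the
converted lines for a file called snplist.txt (a placeholder name).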

Will

On 29 October 2011 16:59, Salih Tuna <st5 at sanger.ac.uk> wrote:
> Hi,
> I have a snplist in the following format.
>
> chr:position alleles gene type rs# base_chg annot dist pop_nr_freq f_nr_case
> f_nr_con chisqstat_assoc lrt_stat f+ f- SLOD p_val
> chr1:143663633 GC NA NA NA NA NA NA 0.021 0.0172 0.0248 0.51 0.522 0.02347
> 0.0005306 3.88 0.262649660469        3
> chr1:143663637 TG NA NA NA NA NA NA 0.0097 0.0099 0.0096    0 0.002866
> 0.0106 0.0019 -0.47 1.0  0
>
> Is there a way I can convert this to a correct file format to use in Ensembl
> API? I tried the --format guess option but it did not like it.
>
> Best,
> Salih



