[ensembl-dev] Variants seq_region_name excluding patches / seq_region_start for VCF

Will McLaren wm2 at ebi.ac.uk
Thu Dec 18 10:12:17 GMT 2014


Hi G,

1. You can check if the slice the VariationFeature falls on is a reference
slice:

while (my $vf = shift @{$vfeatures}) {
  # skip features that fall on non-reference slices (e.g. patches)
  next unless $vf->slice->is_reference();
  print $vf->seq_region_name(), "\t", $vf->seq_region_start(), "\n";
}

2. It helps to understand how Ensembl and VCF differ in the way they store
coordinates for unbalanced substitutions (e.g. insertions or deletions).
See
http://www.ensembl.org/info/docs/tools/vep/vep_formats.html#vcf

VCF includes the base immediately before the variant in its representation
of these, so for indels you should typically be able to get the VCF
position by subtracting 1 from seq_region_start(). For SNPs,
seq_region_start() can be used as it is.
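
As a rough sketch of that rule (the helper name vcf_pos is made up for
illustration and is not part of the Ensembl API), assuming allele strings
in the usual Ensembl form such as A/G, -/AT or AT/-, you could decide
whether to subtract 1 by checking whether the alleles differ in length:

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl API method): given an Ensembl-style
# start coordinate and allele string, return the position to use in VCF.
sub vcf_pos {
    my ($start, $allele_string) = @_;

    # Record the distinct allele lengths, treating "-" as an empty allele.
    my %lengths;
    for my $allele (split m{/}, $allele_string) {
        (my $bases = $allele) =~ s/-//g;
        $lengths{ length $bases } = 1;
    }

    # Alleles of differing length mean an insertion, deletion or other
    # unbalanced substitution; VCF anchors these on the preceding base.
    return scalar(keys %lengths) > 1 ? $start - 1 : $start;
}

print vcf_pos(100, 'A/G'),  "\n";   # SNP:       prints 100
print vcf_pos(100, '-/AT'), "\n";   # insertion: prints 99
print vcf_pos(100, 'AT/-'), "\n";   # deletion:  prints 99
```

Inside the loop above this would be called as
vcf_pos($vf->seq_region_start(), $vf->allele_string()). Note that it only
gives the position; the REF and ALT alleles in the VCF line would also need
the preceding reference base prepended, as described on the page linked
above.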

HTH

Will McLaren
Ensembl Variation

On 17 December 2014 at 18:03, Genomeo Dev <genomeodev at gmail.com> wrote:
>
> Hi,
>
> *Using Ensembl 75 VM*
>
> I am trying to retrieve coordinates and sequence regions for some variants
> using variant IDs:
>
> my $var = $var_adaptor->fetch_by_name('rs7904594');
> my $vfeatures = $var->get_all_VariationFeatures();
> while (my $vf = shift @{$vfeatures}){
> print $vf->seq_region_name(), "\t", $vf->seq_region_start(), "\n";
> }
>
> *Questions:*
>
> 1. Is there a way I can choose to print main chromosomes only excluding
> patches?
> 2. I want to use the position information in a VCF. seq_region_start only
> works for SNPs. How can I make use of that for indels?
>
> Thanks!
>
> --
> G.
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>