[ensembl-dev] mapping of variations to variation features

Pontus Larsson Pontus.Larsson at ebi.ac.uk
Fri Feb 18 00:09:19 GMT 2011


Hi Andrea,

Q1:

We import the mappings of these variations from dbSNP. Normally dbSNP merges
variations that share flanking sequence (unless they are of different
classes) but that hasn't happened in this case. I'm not sure exactly why but
it could have something to do with one of them having a quite short flanking
sequence.

Q2:

If you create a new VariationFeature object, you should supply an associated
Variation object in the constructor. Since you don't seem to do that in your
code, you will just get an undef back. However, I think what you are trying
to do is to get VF objects for variations that we have in our database. Then
you shouldn't create new objects yourself but rather use the adaptor modules
which will create the objects from the database with all associated data
attached.
In your case, you should first create a slice that only covers the position
you are interested in and then get all VFs on this slice. Like so:

  # fetch_by_region takes the coordinate system name first, then the region name
  my $slice = $slice_adaptor->fetch_by_region('chromosome', '1', 63268, 63268);
  my @vfs = @{$variation_feature_adaptor->fetch_all_by_Slice($slice)};

If there are multiple VFs at the same location, the fetch method will return
all of them in a listref.
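
For reference, here is a slightly fuller sketch of the same approach, from
connecting to the database through to printing the synonyms of each variation.
The host name, the anonymous login and the 'human' species alias are
assumptions on my part (they are not taken from your code), so adjust them to
your own setup. Also bear in mind that what you find at 1:63268 depends on the
release and assembly of the database you connect to.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl MySQL server with anonymous access.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# Adaptors create objects from the database with their associated data attached.
my $slice_adaptor = $registry->get_adaptor('human', 'core', 'slice');
my $variation_feature_adaptor =
    $registry->get_adaptor('human', 'variation', 'variationfeature');

# A slice covering only the single position of interest.
my $slice = $slice_adaptor->fetch_by_region('chromosome', '1', 63268, 63268);

# One VariationFeature per variation mapped to this position.
my @vfs = @{ $variation_feature_adaptor->fetch_all_by_Slice($slice) };

foreach my $vf (@vfs) {
    # VFs fetched through the adaptor can return their Variation object.
    my $v        = $vf->variation();
    my @synonyms = @{ $v->get_all_synonyms() };
    printf "%s\t%s\t%s\n",
        $vf->variation_name(), $vf->allele_string(), join(',', @synonyms);
}
```

Because the loop goes over every VF on the slice, you get one Variation per
VF rather than having to pick a single one for the locus.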

Best regards
/Pontus


2011/2/17 Andrea Edwards <edwardsa at cs.man.ac.uk>

> Question 1
> =========
>
> The following 2 human variations in dbSNP have different flanking sequences
> and are represented by different variation entries in the variation table.
> The flanking sequence of the first SNP is longer than the second where the
> overlapping regions of the flanking sequences are the same
> -rs75478250 (vid = 18636645)
> -rs28664618 (vid = 9542974)
>
> Both of these variations have variation features affecting the locus
> 1:63268
>
> How is it possible that the 2 variations can map to the same genome
> location? I presume it is because both variations share a common length of
> flanking sequence and this common region maps to the same genomic DNA. What
> length of the flanking sequence of a variation must match the genomic
> sequence for the variation to be mapped to that genome location?
>
> Question 2
> ========
>
> I have created a variation feature object for lots of SNPs and I am
> unable to get the underlying variation object. I get an error that the
> variation is undefined
>
> my $vf = Bio::EnsEMBL::Variation::VariationFeature->new(
>    -start => $pos,
>    -end => $pos,
>    -slice => $slice, # the variation must be attached to a slice
>    -allele_string => $allele, # the first allele should be the reference allele
>    -strand => 1,
>    -map_weight => 1,
>    -adaptor => $vfa, # we must attach a variation feature adaptor
>    -variation_name => 'newSNP',
>    );
>    $v = $vf->variation();
>    @synonyms = @{$v->get_all_synonyms()}; # <==== $v is undefined
>
> I have tried this with the locus 1:63268 from above where 2 variations
> exist and I am unable to retrieve either of them. This raises these questions:
> a) why can't I get the variation object?
> b) what happens where there is more than one variation feature entry in the
> database for the locus on which the variation feature object is created?
> Which variation feature's variation object would you return as the method
> does not return an array? E.g. what variation would be returned for a
> variation feature created on locus 1:63268?
>
>
> thanks very much
>
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
>