[ensembl-dev] variation database usage

Will McLaren wm2 at ebi.ac.uk
Thu Feb 16 10:45:07 GMT 2012

Hi Hardip,

You may find it easier to use the VEP for this as it wraps up a lot of
the functionality you are interested in already. You could get it to
check for phenotypes by creating a bed file or similar of
phenotype-associated loci, tabix indexing it and using it as a custom
data source for the VEP (see

The VEP can also compare to existing variations, and their alleles,
using --check_existing and --check_alleles.
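
As a rough sketch of the custom-source route, assuming bgzip and tabix are on your PATH: the loci below are made-up placeholders rather than real GWAS data, the file names are hypothetical, and the layout of the --custom argument (file, short name, format, match type) is an assumption that should be checked against the VEP documentation for your version.

```shell
#!/bin/sh
# Sketch only: placeholder loci, hypothetical file names.
set -eu

# 1. Write a BED file of phenotype-associated loci
#    (columns: chromosome, 0-based start, end, name).
#    Chromosome names should match those used in your VCF.
printf '%s\t%s\t%s\t%s\n' \
    2 2000 2001 placeholder_trait_B \
    1 1000 1001 placeholder_trait_A > phenotypes.unsorted.bed

# 2. tabix needs the file sorted by chromosome, then by start position.
sort -k1,1 -k2,2n phenotypes.unsorted.bed > phenotypes.bed

# 3. Compress and index, if the tools are installed.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -c phenotypes.bed > phenotypes.bed.gz   # keeps the plain-text copy
    tabix -f -p bed phenotypes.bed.gz             # writes phenotypes.bed.gz.tbi
fi

# 4. The VEP would then be run along these lines (not executed here;
#    the --custom argument layout is an assumption, see above):
#
#    perl variant_effect_predictor.pl -i input.vcf \
#        --check_existing --check_alleles \
#        --custom phenotypes.bed.gz,Phenotype,bed,overlap
```

The sort step matters because tabix refuses to index a file that is not ordered by chromosome and position.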

If you do want to continue with the API, here's some code that should
get you started - I'm assuming you have your VF object created in
$new_vf, and that you are connected to the database already.


Will McLaren
Ensembl Variation

# attach a slice to the VF, it probably doesn't have one
# ($reg is assumed to be your already-loaded registry object)
my $sa = $reg->get_adaptor("human", "core", "slice");
my $slice = $sa->fetch_by_region("chromosome", $new_vf->{chr});
$new_vf->{slice} = $slice;

# get overlapping existing VFs from the variation database by fetching
# from the feature slice of the new VF
foreach my $existing_vf (@{$new_vf->feature_Slice->get_all_VariationFeatures}) {

  # compare alleles
  print "New alleles!\n" if $new_vf->allele_string ne $existing_vf->allele_string;

  # get phenotype annotations via the variation object
  foreach my $va (@{$existing_vf->variation->get_all_VariationAnnotations}) {
    print $existing_vf->variation_name, " is associated with phenotype ",
      $va->phenotype_description, "\n";
  }
}

On 16 February 2012 10:21, Hardip Patel <hardip.patel at anu.edu.au> wrote:
> Dear all
> I have VCF files generated for individual chromosomes from a human
> resequencing project. I was wondering if somebody could get me started with
> ways to use the variation API.
> I am mainly interested in knowing the following from my VCF files:
> Is the variation in the VCF found in dbSNP and, if yes, does it have the
> same genotype as the one in my VCF file?
> Is the variation implicated in the NHGRI_GWAS catalogue or not?
> I have tried reading the documentation on the variation API and I am not
> able to come up with a way to do this.
> I am able to use the parse_vcf subroutine to parse a VCF line and get a
> variation feature object. I am getting stuck after that, in that I am not
> sure how to use the VariationFeature to ask the above questions.
> Any help is greatly appreciated.
> Kind regards
> Hardip R. Patel, PhD
> Post-doctoral Research Fellow
> Genome Discovery Unit and RNA Biology Lab
> Genome Biology Department
> The John Curtin School of Medical Research
> College of Medicine, Biology and Environment
> The Australian National University
> Building 131, Garran Road, ANU Campus, Acton - 0200, ACT, Australia
> Email: hardip.patel at anu.edu.au, patelhardip at gmail.com
> Phone Number: (+61) 0449 180 715
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
