[ensembl-dev] Missense SNP's and frequencies

Will McLaren wm2 at ebi.ac.uk
Mon Mar 11 12:00:36 GMT 2013


Hi Jens,

You can use fetch_all_by_Slice_SO_terms to get only variations with a
particular consequence type, in this case the type you want to limit
to is "missense_variant":

my @vfs = @{$vfa->fetch_all_by_Slice_SO_terms($slice, ['missense_variant'])};

To retrieve the global minor allele and its frequency (this is from the
combined 1000 Genomes phase 1 data), you can use the methods
minor_allele and minor_allele_frequency:

print $vf->minor_allele, ':', $vf->minor_allele_frequency, "\n";
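Putting the two calls together, a minimal sketch might look like the
following. The sub name missense_maf_lines and the 'NA' placeholder are
my own choices for illustration, not part of the API; the sketch also
assumes that variations without 1000 Genomes data return undef from both
methods.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "name<TAB>minor allele<TAB>MAF" line
# for each missense variation feature on the slice.
sub missense_maf_lines {
    my ($vfa, $slice) = @_;
    my @lines;
    for my $vf (@{ $vfa->fetch_all_by_Slice_SO_terms($slice, ['missense_variant']) }) {
        my $allele = $vf->minor_allele;
        my $maf    = $vf->minor_allele_frequency;
        # Assumption: a variation with no global MAF returns undef,
        # so print 'NA' instead of triggering an undef warning.
        push @lines, join("\t",
            $vf->variation_name,
            defined $allele ? $allele : 'NA',
            defined $maf    ? $maf    : 'NA');
    }
    return @lines;
}
```

Inside the gene loop of your script, where $vfa and $slice are already
set, this could replace the existing print loop:

print "$_\n" for missense_maf_lines($vfa, $slice);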

You can get frequencies from many other populations by retrieving the
allele objects associated with the variation ($vf->get_all_Alleles).
See the documentation and tutorial pages for more details:

http://www.ensembl.org/info/docs/api/variation/index.html
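As a rough sketch of the per-population route, assuming each allele
object exposes allele, frequency and population (with a name on the
population), something like this lists every frequency recorded for one
variation feature. The sub name population_frequency_lines is
hypothetical, and skipping alleles that lack a population or frequency
is my own choice rather than anything the API requires.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one
# "name<TAB>population<TAB>allele<TAB>frequency" line per allele record
# that carries both a population and a frequency.
sub population_frequency_lines {
    my ($vf) = @_;
    my @lines;
    for my $allele (@{ $vf->get_all_Alleles }) {
        my $pop  = $allele->population;
        my $freq = $allele->frequency;
        # Assumption: records without population or frequency data
        # are not useful here, so skip them.
        next unless defined $pop && defined $freq;
        push @lines, join("\t",
            $vf->variation_name, $pop->name, $allele->allele, $freq);
    }
    return @lines;
}
```

Called once per $vf in the loop over @vfs:

print "$_\n" for population_frequency_lines($vf);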

Regards

Will McLaren
Ensembl Variation

On 11 March 2013 11:52, Jens Christian Nielsen <jcfnielsen at gmail.com> wrote:
> For a list of GenBank accession numbers I want to extract all missense
> variations and their frequencies. Right now my script extracts all SNPs
> from the slice ($slice), but how can I restrict it to print only the SNPs
> that lead to a change in the protein sequence? Also, how can I make it
> return the frequencies of the SNPs?
>
> use Bio::EnsEMBL::Registry;
> my $reg = 'Bio::EnsEMBL::Registry';
> $reg->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user =>
> 'anonymous');
> my $gene_name = shift;
> my $ga = $reg->get_adaptor('Human', 'Core', 'Gene');
> my $sa = $reg->get_adaptor('Human', 'Core', 'Slice');
> my $vfa = $reg->get_adaptor('Human', 'Variation', 'VariationFeature');
>
> my $genes = $ga->fetch_all_by_external_name($gene_name);
> while (my $gene = shift @{$genes}) {
>   my $chr   = $gene->seq_region_name;
>   my $start = $gene->seq_region_start;
>   my $end   = $gene->seq_region_end;
>   my $region = sprintf "%s:%d-%d", $chr, $start, $end;
>   print join("\t", ($gene->stable_id, $region, $gene->length,
> $gene->external_name, $gene->description) ), "\n";
>   my $slice = $sa->fetch_by_region('chromosome', $chr, $start, $end);
>   my @vfs = @{$vfa->fetch_all_by_Slice($slice)};
>   for my $vf (@vfs) {
>     print
>       $vf->variation_name, ' has alleles ', $vf->allele_string,
>       ' located at ', $slice->seq_region_name, ':',
>       $vf->seq_region_start, '-', $vf->seq_region_end, "\n";
>   }
> }
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>



