[ensembl-dev] Ld scores from ensembl

Will McLaren wm2 at ebi.ac.uk
Thu Jun 18 11:06:30 BST 2015


Hi Nathalie,

It seems we have an issue with the HapMap genotypes in our release/80
database.

For 1000 Genomes phase 3 data, LD is available, though it takes a couple
of extra steps to retrieve the genotypes and LD data.

1) Install the tabix utility and Perl module

cd [somewhere/to/install/software]
git clone git@github.com:samtools/tabix.git
cd tabix
make
cd perl
perl Makefile.PL PREFIX=/somewhere/in/your/PERL5LIB
make && make install
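
To confirm that Perl can see the module from step 1 before moving on, a
quick check along these lines may help. This is only a sketch: it assumes
the module installs under the name Tabix (not stated above), and
check_perl_module is a hypothetical helper name.

```shell
# Report whether Perl can load a module by name.
# Assumption: the tabix Perl bindings install a module called "Tabix".
check_perl_module() {
    if perl -M"$1" -e 1 2>/dev/null; then
        echo "$1: found"
    else
        echo "$1: not found (check PERL5LIB)"
    fi
}

check_perl_module Tabix
```

If it reports "not found", the directory given as PREFIX is probably not
yet on your PERL5LIB.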

2) Install the ensembl-io module

cd [path/where/ensembl/is]
git clone git@github.com:Ensembl/ensembl-io.git
[add ensembl-io/modules to your PERL5LIB]
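
In a bash-like shell, the bracketed step might look like the following. A
minimal sketch: ENSEMBL_ROOT and its default value are hypothetical
placeholders for wherever you cloned ensembl-io, so substitute your own
path.

```shell
# ENSEMBL_ROOT is a hypothetical placeholder; point it at the directory
# that contains your ensembl-io clone.
ENSEMBL_ROOT="${ENSEMBL_ROOT:-$HOME/src/ensembl}"
IO_MODULES="$ENSEMBL_ROOT/ensembl-io/modules"

# Prepend the modules directory, keeping any existing PERL5LIB entries.
export PERL5LIB="$IO_MODULES${PERL5LIB:+:$PERL5LIB}"

echo "$PERL5LIB"
```

Putting the export in your shell startup file makes it persist across
sessions.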

You can then add the following line to your code before you retrieve the
LDFeatureContainer:

$ldFeatureContainerAdaptor->db->use_vcf(1);

Apologies, this is currently missing from our formal documentation. We will
get this fixed soon.

Regards

Will McLaren
Ensembl Variation


On 17 June 2015 at 15:53, Nathalie Conte <nconte at ebi.ac.uk> wrote:

> Hi,
> I am trying to get LD scores for a slice, using the tutorial script
> from Ensembl:
> http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#ld
>
> I retrieve data with release 79, but not with 80.
> Any idea why?
> Thanks,
> Nathalie
>
> #!/usr/local/bin/perl
> use strict;
> use warnings;
> use Bio::EnsEMBL::Registry;
>
> my $registry = 'Bio::EnsEMBL::Registry';
>
> $registry->load_registry_from_db(
>     -host=>"ensembldb.ensembl.org", -user=>"anonymous",
>     -port=>'5306', 'db_version' => 80,);
> my $chr = 6;  #defining the region in chromosome 6
> my $start = 25_834_000;
> my $end = 25_854_000;
>
>
> #we only want LD in this population
> my $population_name = 'CSHL-HAPMAP:HapMap-CEU';
> # I also tried with this population
> #my $population_name ='1000GENOMES:phase_3:CEU';
>
> #get adaptor for Slice object
> my $slice_adaptor = $registry->get_adaptor('human', 'core', 'slice');
> #get slice of the region
> my $slice = $slice_adaptor->fetch_by_region('chromosome',$chr,$start,$end);
>
> #get adaptor for Population object
> my $population_adaptor =
>     $registry->get_adaptor('human', 'variation', 'population');
> #get population object from database
> my $population = $population_adaptor->fetch_by_name($population_name);
>
> #get adaptor for LDFeatureContainer object
> my $ldFeatureContainerAdaptor =
>     $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer');
> #retrieve all LD values in the region
> my $ldFeatureContainer =
>     $ldFeatureContainerAdaptor->fetch_by_Slice($slice,$population);
>
>
> foreach my $r_square (@{$ldFeatureContainer->get_all_r_square_values}){
>   #only print high LD, where high is defined as r2 > 0.8
>   if ($r_square->{r2} > 0.8){
>     print "High LD between variations ",
>         $r_square->{variation1}->variation_name, "-",
>         $r_square->{variation2}->variation_name, "\n";
>   }
> }