[ensembl-dev] LD/population questions

Catherine Leroy cleroy at ebi.ac.uk
Mon Feb 23 09:55:45 GMT 2015


Don’t worry, I’m not expecting Ensembl to have all the data from every paper. :-)
I’ll look at the populations in 1000 Genomes.
Hopefully they will be close enough, which would be very convenient for us, as I’m not sure we could get those LD r2 values any other way.
Thank you for your answer. 
Cheers,
Catherine



On 20 Feb 2015, at 17:06, Emily Perry <emily at ebi.ac.uk> wrote:

> Hi Catherine
> 
> When data goes into the GWAS catalogue, it is defined just by the region where the GWAS study was carried out, in this case Europe. In Ensembl, we have data from specific experiments, working with particular populations. We calculate the LD from the experiments we have in our databases. We can't claim to have all the variation and LD for everybody in Europe, only for the group studied, so we define our population names as the experiments. You can then look at the experiments themselves to see whether they are representative for your purpose or not.
> 
> Here's a summary of the samples used in 1000 Genomes.
> http://www.1000genomes.org/about#ProjectSamples
> 
> As you can see, the European (EUR) super-population is made up of British (GBR), Finnish (FIN), Spanish (IBS) and Italian (TSI) subpopulations, plus Utah residents with Northern and Western European ancestry (CEU). If you think this is representative of what you're doing, then the 1000 Genomes EUR population is the one to use to get your LD. The name for this population in the database is 1000GENOMES:phase_1_EUR, which tells us it comes from 1000 Genomes, it's the data from phase 1, and it's the European population (watch out for phase 3, with more variants, coming soon). We don't currently have a more representative population for Europe as a whole.
> 
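
To make the naming convention described above concrete, here is a minimal sketch that splits a 1000 Genomes-style name into its three parts. The sub name parse_1000g_population_name and the SOURCE:phase_N_CODE pattern are assumptions made for illustration, inferred only from the single name quoted above. Names following other conventions (such as CSHL-HAPMAP:HapMap-CEU) are simply not matched.

```perl
use strict;
use warnings;

# Hypothetical helper: split a name such as 1000GENOMES:phase_1_EUR into
# source, phase and population code. The pattern is an assumption based on
# the one example in this thread, not an official Ensembl format definition.
sub parse_1000g_population_name {
    my ($name) = @_;
    if ($name =~ /^([^:]+):phase_(\d+)_([A-Z]+)$/) {
        return { source => $1, phase => $2, code => $3 };
    }
    return;    # undef in scalar context: name does not follow the pattern
}

my $parts = parse_1000g_population_name('1000GENOMES:phase_1_EUR');
print "source=$parts->{source} phase=$parts->{phase} code=$parts->{code}\n";
```

This only inspects the string; whether a population with that name exists still has to be checked against the database.
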
> All the best
> 
> Emily
> 
> On 20/02/2015 16:50, Catherine Leroy wrote:
>> I get my SNP and ethnicity from the GWAS Catalog database. The specific example I gave earlier is PubMed ID 23128233. So I won’t have much more than what I have put in my script.
>> So there’s no such thing as r2 data for a ‘European population’?
>> 
>>  
>> On 20 Feb 2015, at 14:36, Emily Perry <emily at ebi.ac.uk> wrote:
>> 
>>> Hi Catherine
>>> 
>>> You need to use the names that the populations are called in the database, so to get 1000 genomes Europeans you would use 1000GENOMES:phase_1_EUR.
>>> 
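
Since the GWAS Catalog label and the database name differ, one way to bridge them is a small hand-maintained lookup. The sketch below is only an illustration: the hash %population_for_region and the sub population_name_for_region are hypothetical, and the single entry reflects the suggestion in this thread rather than any official mapping.

```perl
use strict;
use warnings;

# Hypothetical lookup table: GWAS Catalog region label -> Ensembl population name.
# The only entry comes from the advice in this thread; extend it by hand
# after checking that each population is representative for your purpose.
my %population_for_region = (
    'Europe' => '1000GENOMES:phase_1_EUR',
);

sub population_name_for_region {
    my ($region) = @_;
    return $population_for_region{$region};    # undef when there is no mapping
}

for my $region ('Europe', 'Asia') {
    my $name    = population_name_for_region($region);
    my $message = defined($name) ? $name : 'no mapping, choose a population by hand';
    print "$region -> $message\n";
}
```

Returning undef for unmapped labels keeps the choice of population an explicit decision instead of a silent default.
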
>>> All the best
>>> 
>>> Emily
>>> 
>>> On 20/02/2015 13:32, Catherine Leroy wrote:
>>>> Hello, 
>>>> 
>>>> I am trying to get my head around LD. As you might see from my questions, it is all quite new to me.
>>>> 
>>>> I have the following SNP: rs11010067
>>>> and I am trying to get the r2 value using the script I’ve copied and pasted below.
>>>> 
>>>> If I enter as population : Europe 
>>>> then I don’t get anything back. 
>>>> 
>>>> If I enter as population : CSHL-HAPMAP:HapMap-CEU
>>>> Then I get back a list of variation1, variation2 and r2. 
>>>> 
>>>> I don’t understand why I don’t get anything with Europe. 
>>>> Could somebody explain that to me or point me to some documentation about LD I could read. 
>>>> 
>>>> I want to use r2 to determine whether a SNP I have is in the same LD block as the gene I suppose it is linked to, using a threshold for r2 (which I haven’t determined yet; I’ve only just started working on that). The problem is that the population given for this SNP in my data (GWAS Catalog) is Europe, which doesn’t return anything.
>>>> 
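
For background on the r2 value itself: for two biallelic sites it is commonly defined as r2 = D^2 / (pA(1-pA) pB(1-pB)), where D = pAB - pA*pB, pA and pB are allele frequencies, and pAB is the frequency of the haplotype carrying both alleles. The sketch below works through that arithmetic. The sub name r_squared and the frequencies are made up for illustration, and this is not how Ensembl computes the values it stores.

```perl
use strict;
use warnings;

# r2 between two biallelic sites, from the frequency of the A-B haplotype
# ($p_ab) and the allele frequencies $p_a and $p_b.
sub r_squared {
    my ($p_ab, $p_a, $p_b) = @_;
    my $d     = $p_ab - $p_a * $p_b;                      # disequilibrium coefficient D
    my $denom = $p_a * (1 - $p_a) * $p_b * (1 - $p_b);
    return unless $denom > 0;    # not defined if either site is monomorphic
    return ($d**2) / $denom;
}

# Made-up frequencies for illustration only.
printf "r2 = %.3f\n", r_squared(0.4, 0.5, 0.5);    # prints "r2 = 0.360"
```

With pA = pB = 0.5, a haplotype frequency of 0.5 would give r2 = 1 (complete LD) and 0.25 would give r2 = 0 (independence).
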
>>>> Thanks for your help,
>>>> Catherine
>>>> 
>>>> 
>>>> 
>>>> 
>>>> use strict;
>>>> use warnings;
>>>> use Bio::EnsEMBL::Registry;
>>>> 
>>>> 
>>>> my $registry = 'Bio::EnsEMBL::Registry';
>>>> 
>>>> $registry->load_registry_from_db(
>>>> -host   => 'ensembldb.ensembl.org',
>>>> -user   => 'anonymous',
>>>> );
>>>> 
>>>> my $rs_id = "rs11010067";
>>>> 
>>>> my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
>>>> my $variation = $variation_adaptor->fetch_by_name($rs_id);
>>>> 
>>>> my $population_adaptor = $registry->get_adaptor('human', 'variation', 'population'); #get adaptor for Population object
>>>> my $population_name = "CSHL-HAPMAP:HapMap-CEU"; # "Europe" is not a population name in the database
>>>> my $population = $population_adaptor->fetch_by_name($population_name);
>>>> die "No population named '$population_name' in the database\n" unless defined $population;
>>>> 
>>>> foreach my $variation_feature (@{$variation->get_all_VariationFeatures()}) {
>>>>     print $variation_feature->seq_region_name(),':', $variation_feature->seq_region_start(), '-', $variation_feature->seq_region_end(),"\n";
>>>> 
>>>>     my $ldFeatureContainerAdaptor = $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer'); #get adaptor for LDFeatureContainer object
>>>>     my $ldFeatureContainer = $ldFeatureContainerAdaptor->fetch_by_VariationFeature($variation_feature, $population);
>>>> 
>>>>     my $r_square_values = $ldFeatureContainer->get_all_r_square_values();
>>>>     foreach my $r_square_value (@{$r_square_values}){
>>>> 
>>>>         my $variation_feature_1 = $r_square_value->{variation1};
>>>>         my $variation1_name = $variation_feature_1->name();
>>>>         my $variation_feature_2 = $r_square_value->{variation2};
>>>>         my $variation2_name = $variation_feature_2->name();
>>>> 
>>>>         my $r2 = $r_square_value->{r2};
>>>> 
>>>>         # print the region name of each variation feature, not its name a second time
>>>>         print "variation1 = ", $variation1_name, ", variation1_region_name = ", $variation_feature_1->seq_region_name(), "\n";
>>>>         print "variation2 = ", $variation2_name, ", variation2_region_name = ", $variation_feature_2->seq_region_name(), "\n";
>>>>         print "r2 = ", $r2, "\n\n";
>>>> 
>>>>     }
>>>> }
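
The threshold step mentioned earlier in this message could be sketched as a filter over records shaped like the ones the loop above reads (hash references with variation1, variation2 and r2 keys). In this sketch the records are stand-ins with made-up names and values, plain strings replace the variation feature objects, and both the sub name pairs_above_threshold and the 0.8 cut-off are arbitrary assumptions.

```perl
use strict;
use warnings;

# Keep only the pairs whose r2 is at or above the threshold.
sub pairs_above_threshold {
    my ($pairs, $threshold) = @_;
    return [ grep { $_->{r2} >= $threshold } @{$pairs} ];
}

# Stand-in records with made-up names and values; in the script above,
# variation1 and variation2 would be objects rather than strings.
my @pairs = (
    { variation1 => 'rsA', variation2 => 'rsB', r2 => 0.95 },
    { variation1 => 'rsA', variation2 => 'rsC', r2 => 0.42 },
);

my $threshold = 0.8;    # arbitrary illustrative cut-off
for my $pair (@{ pairs_above_threshold(\@pairs, $threshold) }) {
    print "$pair->{variation1} $pair->{variation2} r2=$pair->{r2}\n";
}
```

Keeping the threshold as a parameter makes it easy to try several cut-offs before settling on one.
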
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>> 
>>> -- 
>>> Dr Emily Perry (Pritchard)
>>> Ensembl Outreach Officer
>>> 
>>> European Bioinformatics Institute (EMBL-EBI)
>>> European Molecular Biology Laboratory
>>> Wellcome Trust Genome Campus
>>> Hinxton
>>> Cambridge
>>> CB10 1SD
>>> UK 
