[ensembl-dev] LD/population questions

Emily Perry emily at ebi.ac.uk
Fri Feb 20 17:06:33 GMT 2015


Hi Catherine

When data goes into the GWAS catalogue, it is defined just by the region 
where the GWAS study was carried out, in this case Europe. In Ensembl, 
we have data from specific experiments, working with particular 
populations. We calculate the LD from the experiments we have in our 
databases. We can't claim to have all the variation and LD for everybody 
in Europe, only for the group studied, so we define our population names 
as the experiments. You can then look at the experiments themselves to 
see whether they are representative for your purpose or not.

Here's a summary of the samples used in 1000 Genomes:
http://www.1000genomes.org/about#ProjectSamples

As you can see, the European (EUR) super-population is made up of the 
British (GBR), Finnish (FIN), Spanish (IBS), Italian (TSI) and Utah 
(CEU, US residents with Northern and Western European ancestry) 
subpopulations. If you think this is representative of what you're doing, 
then the 1000 Genomes EUR population is the one you want to use to get 
your LD. The name for this population in the database is 
1000GENOMES:phase_1_EUR, which tells us that it comes from 1000 Genomes, 
that it is the data from phase 1 and that it is the European population 
(watch out for phase 3, with more variants, coming soon). We don't 
currently have a more representative population for Europe as a whole.
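
Once r2 values have been retrieved for the chosen population, the 
threshold filtering mentioned in the original question could be sketched 
roughly as follows. This is only an illustration: filter_by_r2 is a 
hypothetical helper, not part of the Ensembl API, the 0.8 cut-off is 
arbitrary, and the records are placeholders that merely mimic the hash 
keys (variation1, variation2, r2) used in the script quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): keep only the pairwise
# LD records whose r2 is at or above the chosen threshold.
sub filter_by_r2 {
    my ($records, $threshold) = @_;
    return [ grep { defined $_->{r2} && $_->{r2} >= $threshold } @{$records} ];
}

# Placeholder records, NOT real LD values. They mimic the hash keys returned
# by get_all_r_square_values(); in the real API, variation1 and variation2
# are VariationFeature objects rather than plain strings.
my @records = (
    { variation1 => 'rs_query', variation2 => 'rs_example_a', r2 => 0.95 },
    { variation1 => 'rs_query', variation2 => 'rs_example_b', r2 => 0.40 },
    { variation1 => 'rs_query', variation2 => 'rs_example_c', r2 => 0.80 },
);

my $threshold = 0.8;    # arbitrary cut-off, chosen only for illustration
my $kept = filter_by_r2(\@records, $threshold);

# Print the pairs that pass the threshold.
foreach my $record (@{$kept}) {
    printf "%s - %s: r2 = %.2f\n",
        $record->{variation1}, $record->{variation2}, $record->{r2};
}
```

With real data, the list returned by get_all_r_square_values() could be 
passed to the helper directly, since it only reads the r2 key.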

All the best

Emily

On 20/02/2015 16:50, Catherine Leroy wrote:
> I get my SNP and ethnicity from the GWAS catalog database. The 
> specific example I gave earlier is PubMed: 23128233. So I won’t have 
> much more than what I have put in my script.
> So there’s no such thing as r2 data for ‘European population’?
>
>
> On 20 Feb 2015, at 14:36, Emily Perry <emily at ebi.ac.uk> wrote:
>
>> Hi Catherine
>>
>> You need to use the names that the populations are called in the 
>> database, so to get 1000 genomes Europeans you would use 
>> 1000GENOMES:phase_1_EUR.
>>
>> All the best
>>
>> Emily
>>
>> On 20/02/2015 13:32, Catherine Leroy wrote:
>>> Hello,
>>>
>>> I am trying to get my head around LD. As you might see from my 
>>> questions, it is all quite new to me.
>>>
>>> I have the following SNP: rs11010067
>>> and I am trying to get the r2 value using the script I’ve copied 
>>> and pasted below.
>>>
>>> If I enter as population : Europe
>>> then I don’t get anything back.
>>>
>>> If I enter as population : CSHL-HAPMAP:HapMap-CEU
>>> Then I get back a list of variation1, variation2 and r2.
>>>
>>> I don’t understand why I don’t get anything with Europe.
>>> Could somebody explain that to me or point me to some documentation 
>>> about LD I could read?
>>>
>>> I want to use r2 to determine whether a SNP I have is in the same 
>>> LD block as the gene I suppose it is linked to, using a 
>>> threshold for r2 (which I haven’t determined yet; I’ve just started 
>>> working on that). The problem is that the population given for 
>>> this SNP in my data (GWAS catalog) is Europe, which 
>>> doesn’t return anything.
>>>
>>> Thanks for your help,
>>> Catherine
>>>
>>>
>>>
>>>
>>> use strict;
>>> use warnings;
>>> use Bio::EnsEMBL::Registry;
>>>
>>>
>>> my $registry = 'Bio::EnsEMBL::Registry';
>>>
>>> $registry->load_registry_from_db(
>>> -host   => 'ensembldb.ensembl.org',
>>> -user   => 'anonymous',
>>> );
>>>
>>> my $rs_id = "rs11010067";
>>>
>>> my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
>>> my $variation = $variation_adaptor->fetch_by_name($rs_id);
>>>
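>>> my $population_adaptor = $registry->get_adaptor('human', 'variation', 'population');#get adaptor for Population object
>>> # The name must match a population name exactly as stored in the database.
>>> my $population = $population_adaptor->fetch_by_name("CSHL-HAPMAP:HapMap-CEU");
>>> #my $population = $population_adaptor->fetch_by_name("Europe");
>>> die "Population not found in the database\n" unless defined $population;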
>>>
>>> foreach my $variation_feature (@{$variation->get_all_VariationFeatures()}) {
>>>      print $variation_feature->seq_region_name(),':', $variation_feature->seq_region_start(), '-', $variation_feature->seq_region_end(),"\n";
>>>
>>>      my $ldFeatureContainerAdaptor = $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer');#get adaptor for LDFeatureContainer object
>>>      my $ldFeatureContainer = $ldFeatureContainerAdaptor->fetch_by_VariationFeature($variation_feature, $population);
>>>
>>>      my $r_square_values = $ldFeatureContainer->get_all_r_square_values();
>>>      foreach my $r_square_value (@{$r_square_values}){
>>>
>>>          my $variation_feature_1 = $r_square_value->{variation1};
>>>          my $variation1_name = $variation_feature_1->name();
>>>          #$variation1
>>>          my $variation_feature_2 = $r_square_value->{variation2};
>>>          my $variation2_name = $variation_feature_2->name();
>>>
>>>          my $r2 = $r_square_value->{r2};
>>>
>>>          print "variation1 = " , $variation1_name, " variation1_region_name = ", $variation_feature_1->seq_region_name(), " \n";
>>>          print "variation2 = " , $variation2_name, " variation2_region_name = ", $variation_feature_2->seq_region_name(), " \n";
>>>          print "r2 = " , $r2 , "\n\n";
>>>
>>>      }
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>
>> -- 
>> Dr Emily Perry (Pritchard)
>> Ensembl Outreach Officer
>>
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge
>> CB10 1SD
>> UK

-- 
Dr Emily Perry (Pritchard)
Ensembl Outreach Officer

European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge
CB10 1SD
UK
