[ensembl-dev] list of SNPs with a phenotype, sql help

Laurent Gil lgil at ebi.ac.uk
Thu Oct 9 12:58:45 BST 2014


Hi Nathalie,

I won't recommand to use the "fetch_all" method, as it can be very slow 
and you will pick up Gene, QTL and structural variations as well.

If you want to retrieve all the variants with phenotype association, you 
should use the VariationSet 
<http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1VariationSet.html> 
object, as we have a dedicated variation set for the 
phenotype-associated variants ("All phenotype-associated variants"), e.g.:

my $vs_adaptor = $registry->->get_adaptor('human','variation','variationset');

my $vs = $vs_adaptor->fetch_by_name('All phenotype-associated variants');

my $iterator = $vs->get_Variation_Iterator();
while ($iterator->has_next()) {
   # Print the name of the variation
   my $var = $iterator->next();
   print "Variant: ".$var->name()."\n";
}

Here is the list of variation set available: 
http://www.ensembl.org/info/genome/variation/data_description.html#variation_sets
The example in the Ensembl Variation API tutorial will help you to 
retrieve the variant names (as described in the example above) and their 
associated phenotypes:
http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#variation_sets 
(i.e. the 3rd block of code of the section "Variation sets").
<http://www.ensembl.org/info/genome/variation/data_description.html#variation_sets>
For your last question, the variant ID is stored into the method 
"object_id 
<http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html#a888f8f444e6dab333724be8a5bcd5676>" 
in the PhenotypeFeature object. However you can get the Variation object 
from PhenotypeFeature, using the method "variation 
<http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PhenotypeFeature.html#a0fc55eb66ad71bbd5f647b225310dca1>".

Best regards,

Laurent

On 09/10/2014 12:13, Nathalie Conte wrote:
> hi Laurent
> Thanks for this -
> I want all SNP associated with a phenotype,not one in particular,  I 
> used diabetes as an example .
> is fetch_all the one to use?
> public List 
> Bio::EnsEMBL::Variation::DBSQL::PhenotypeFeatureAdaptor::fetch_all
>
> Another question, how do you get the SNPs IDs associated with those 
> phenotypes?
> Thanks
> Nathalie
>
> On 9 Oct 2014, at 11:44, Laurent Gil <lgil at ebi.ac.uk 
> <mailto:lgil at ebi.ac.uk>> wrote:
>
>> Hi Nathalie,
>>
>> The database schema is not easy to decipher. We built the following 
>> documentation to help to understand the tables/columns:
>> http://www.ensembl.org/info/docs/api/variation/variation_schema.html
>> At the moment, the variation_attrib table is only used by our 
>> colleagues in Ensembl Genomes, so you don't need to use this table to 
>> link the variations to the phenotype.
>>
>> However, I warmly recommand to use the Variation API: there is a 
>> method in the PhenotypeFeatureAdaptor object to retrieve all the data 
>> associated with a phenotype: " 
>> fetch_all_by_phenotype_description_source_name 
>> <http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1DBSQL_1_1PhenotypeFeatureAdaptor.html#a5692dfaa03010ea9eec3e4a279421496>" 
>> (the source name parameter is optional). And from the 
>> PhenotypeFeature objects, you can retrieve the variations.
>>
>> Best regards,
>> Laurent
>> On 09/10/2014 11:27, Nathalie Conte wrote:
>>> HI,
>>> I am trying to build a sql query in order to select all SNPs (germline and somatic, ie rs1223) associated with a phenotype.(ie diabetes) using sql command.
>>> I started building a sql query connecting different tables in homo_sapiens_variation_77_38 database in order to connect the variation table (variation_id, rs1223)and the phenotype table where the description of the phenotype is (ie diabetes).
>>> Looking at the schemahttp://www.ensembl.org/info/docs/api/variation/variation-database-schema.pdf  , I am linking variation table through variation, variation_attrib, phenotype_feature_attrib, feanotype_feature, and phenotype.
>>> It seems that variation_attri table is empty, so I cannot link the data
>>> mysql -hensembldb.ensembl.org  <http://hensembldb.ensembl.org>  -uanonymous -e"SELECT * FROM  variation_attrib   LIMIT 10;" homo_sapiens_variation_77_38 command will return no  result.
>>> Could you please help me forward suggestion another path to link variation.variation_id to phenotype.description, or suggest directly a sql command?
>>> Thanks in advance
>>> Nathalie
>>>
>>>
>>> _______________________________________________
>>> Dev mailing listDev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog:http://www.ensembl.info/
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>> Posting guidelines and subscribe/unsubscribe info: 
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20141009/a0709e12/attachment.html>


More information about the Dev mailing list