[ensembl-dev] Automate the SNP variant result from "population genetics"

deepak kumar deepak.k.choubey at gmail.com
Sat Sep 23 14:17:40 BST 2017


Thanks much Anja!

I think Ensembl is a very useful platform for such queries. I am curious
about the following query; could you please let me know how I can do this
using the Ensembl platform:

a) I want to extract the 5' and 3' UTRs from the mRNAs of BRCA1 and BRCA2,
for instance the 5' and 3' UTRs for the RefSeq ID "NM_007299.3" of BRCA1.


b) I also want to find the positions of the SNPs (rsIDs) in the 5' and 3'
UTRs, with output like this (for the RefSeq ID "NM_007299.3" of BRCA1):

refseq-geneID    mutant-allele    position-of-mutation
NM_007299.3      c to t           400032

Thanks much! Please let me know if something is not clear.


On Thu, Sep 21, 2017 at 1:07 PM, Anja Thormann <anja at ebi.ac.uk> wrote:

> Hi DK,
>
> Let me give you some background on where we get our data for allele
> frequency and genotype frequency annotations:
>
> We provide allele frequencies and genotype frequencies (where available)
> from a set of reference populations provided by projects like 1000 Genomes
> Project, ESP, gnomAD (supersedes ExAC).
>
> Only 1000 Genomes provides sample genotypes from which we can compute
> population genotype frequencies.
>
> ESP provides population genotype frequencies, but we don't get a
> breakdown of genotypes by sample in the population.
>
> gnomAD only provides allele counts in a population.
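
The difference between the two kinds of frequency can be illustrated with a small Python sketch. It assumes per-sample genotypes written as "A|G" strings (the kind of data that, per the message, only 1000 Genomes provides); the sample list is hypothetical and not taken from any project.

```python
from collections import Counter


def genotype_frequencies(genotypes):
    """Return {genotype: frequency} for per-sample genotypes such as 'A|G'."""
    # Treat 'G|A' and 'A|G' as the same unordered genotype.
    normalised = ["|".join(sorted(g.split("|"))) for g in genotypes]
    counts = Counter(normalised)
    total = len(normalised)
    return {genotype: n / total for genotype, n in counts.items()}


def allele_frequencies(genotypes):
    """Return {allele: frequency}, counting both alleles of every sample."""
    alleles = [allele for g in genotypes for allele in g.split("|")]
    counts = Counter(alleles)
    total = len(alleles)
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical genotypes for five samples (illustration only, not real data).
samples = ["A|A", "A|G", "G|A", "G|G", "G|G"]

print(genotype_frequencies(samples))
print(allele_frequencies(samples))
```

Allele frequencies can always be derived from sample genotypes, but not the other way round, which is why a source that only reports allele counts cannot give genotype frequencies.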
>
> Here is a variant which has annotations from all of the above projects:
> http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=1:230709548-230710548;v=rs699;vdb=variation;vf=664
>
> For 1000GENOMES:phase_3:AFR:
>   allele frequencies: A: 0.097 G: 0.903
>   genotype frequencies: A|A: 0.126 A|G: 0.338 G|G: 0.536
> For gnomADe:AFR:
>   allele frequencies: A: 0.152 G: 0.848
>   no genotype frequencies
>
> Our variation endpoint does not return gnomAD frequencies at the moment.
> We will include the frequencies for the next release.
>
> For now I would recommend that you use our VEP endpoint
> https://rest.ensembl.org/documentation/info/vep_id_get
> Examples:
> https://rest.ensembl.org/vep/human/id/rs769971095?content-type=application/json
> https://rest.ensembl.org/vep/human/id/rs699?content-type=application/json
>
> The VEP makes use of cache files which store allele frequencies for the
> 1000 Genomes Project super populations (AFR, AMR, EAS, EUR, SAS) and the
> gnomAD exome data.
>
> Please find a list of our populations, their short names and descriptions
> here:
> http://www.ensembl.org/info/genome/variation/data_description.html#populations
>
> The VEP provides annotations from:
> gnomADe:ALL - All gnomAD exomes individuals
> gnomADe:AFR - African/African American
> gnomADe:AMR - Admixed American
> gnomADe:ASJ - Ashkenazi Jewish
> gnomADe:EAS - East Asian
> gnomADe:FIN - Finnish
> gnomADe:NFE - Non-Finnish European
> gnomADe:OTH - Other
> gnomADe:SAS - South Asian
> 1000GENOMES:phase_3:AFR African
> 1000GENOMES:phase_3:AMR American
> 1000GENOMES:phase_3:EAS East Asian
> 1000GENOMES:phase_3:EUR European
> 1000GENOMES:phase_3:SAS South Asian
>
> The populations use the following names in the VEP endpoints:
>   - for example for gnomADe:NFE: gnomad_nfe_maf and gnomad_nfe_allele
>   - for example for 1000GENOMES:phase_3:AFR: afr_maf and afr_allele
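
As a rough Python sketch of how those field names could be turned back into full population names: the code below assumes every population follows the two naming patterns given above (`gnomad_<code>_maf` / `<code>_maf`, each with a matching `_allele` key). That generalisation, the helper names and the example record are all assumptions for illustration; only the population names themselves are copied from the list in the message.

```python
# Full population names, copied from the list in the message above.
POPULATION_NAMES = {
    "gnomADe:AFR": "African/African American",
    "gnomADe:AMR": "Admixed American",
    "gnomADe:ASJ": "Ashkenazi Jewish",
    "gnomADe:EAS": "East Asian",
    "gnomADe:FIN": "Finnish",
    "gnomADe:NFE": "Non-Finnish European",
    "gnomADe:OTH": "Other",
    "gnomADe:SAS": "South Asian",
    "1000GENOMES:phase_3:AFR": "African",
    "1000GENOMES:phase_3:AMR": "American",
    "1000GENOMES:phase_3:EAS": "East Asian",
    "1000GENOMES:phase_3:EUR": "European",
    "1000GENOMES:phase_3:SAS": "South Asian",
}


def population_for_prefix(prefix):
    """Map a field prefix such as 'gnomad_nfe' or 'afr' to a population label."""
    if prefix.startswith("gnomad_"):
        return "gnomADe:" + prefix[len("gnomad_"):].upper()
    return "1000GENOMES:phase_3:" + prefix.upper()


def population_frequencies(record):
    """Collect {population: (allele, frequency)} from one variant record (a dict)."""
    result = {}
    for key, value in record.items():
        if not key.endswith("_maf"):
            continue
        prefix = key[: -len("_maf")]
        population = population_for_prefix(prefix)
        if population not in POPULATION_NAMES:
            continue  # skip keys that do not follow the assumed naming
        result[population] = (record.get(prefix + "_allele"), float(value))
    return result


def populations_carrying_allele(record):
    """Sorted labels of populations in which the reported frequency is above zero."""
    freqs = population_frequencies(record)
    return sorted(pop for pop, (_, freq) in freqs.items() if freq > 0)


# Hypothetical record using the assumed key names (values are made up).
example_record = {
    "id": "rsEXAMPLE",
    "gnomad_nfe_maf": 0.01, "gnomad_nfe_allele": "T",
    "gnomad_fin_maf": 0, "gnomad_fin_allele": "T",
    "afr_maf": 0.2, "afr_allele": "T",
}

for population in populations_carrying_allele(example_record):
    print(population, "=", POPULATION_NAMES[population])
```

A variant would then count as present in more than one population when `populations_carrying_allele` returns more than one label.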
>
> We have a POST VEP endpoint which allows you to send a list of variant IDs
> for annotation:
> https://rest.ensembl.org/documentation/info/vep_id_post
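
A minimal Python sketch of such a batch request, using only the standard library. It assumes the POST endpoint lives at the same path as the GET examples above (without the trailing ID) and takes a JSON body with an "ids" list; please check the linked documentation for the exact request format and any limit on the number of IDs. The function names are made up for this example.

```python
import json
import urllib.request

# Assumed URL: same path as the GET examples above, without the variant ID.
VEP_ID_URL = "https://rest.ensembl.org/vep/human/id"


def build_vep_post(rs_ids):
    """Build (but do not send) a POST request asking VEP to annotate several IDs."""
    body = json.dumps({"ids": list(rs_ids)}).encode("utf-8")
    return urllib.request.Request(
        VEP_ID_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fetch_vep(rs_ids):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_vep_post(rs_ids)) as response:
        return json.load(response)


# Building the request does not touch the network; fetch_vep() would.
request = build_vep_post(["rs769971095", "rs559632360"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the payload before running it over a long list of rsIDs.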
>
> I hope that helps you with your use case.
>
> Anja
>
>
> On 20 Sep 2017, at 22:05, deepak kumar <deepak.k.choubey at gmail.com> wrote:
>
> Hi Anja,
>
> Thank you so much for the reply. It certainly pointed me in the right
> direction. However, could you please help me with a few questions on the
> same topic:
>
> To start off with, I find the REST API a very clean approach to getting
> variant information.
>
> a) My aim is to find out whether a SNP (say rs769971095) is shared between
> populations, or in other words, whether this variant can be found in more
> than one population. From the links you provided I see that I can find an
> answer, but I am confused between "population allele frequency" and
> "population genotype frequency". For my purpose, should the data for this
> rsID be taken from "population allele frequency" or "population genotype
> frequency"?
>
> b) The population names given in the "example output" of the REST API are
> in short form, like 'AMR', 'SAS' etc. Could you please let me know how I
> can retrieve the full population name for a given rsID?
>
> Thanks much!
> DK
>
> On Tue, Sep 19, 2017 at 7:22 PM, Anja Thormann <anja at ebi.ac.uk> wrote:
>
>> Hi DK,
>>
>> You have a few options for getting allele frequencies for a variant.
>>
>> You can use
>>     - our Perl API (to get you started):
>>       http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#alleles
>>     - our REST API (to get you started):
>>       https://rest.ensembl.org/documentation/info/variation_id
>>     - the VEP:
>>       https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html
>>       It will allow you to annotate your input variants with frequency
>>       data if available.
>>
>> Please feel free to contact us again if you have any questions regarding
>> the above approaches.
>>
>> Kind regards,
>> Anja
>>
>>
>> On 19 Sep 2017, at 16:06, deepak kumar <deepak.k.choubey at gmail.com>
>> wrote:
>>
>> Dear ALL,
>>
>> I have been looking for a way to find out which population(s) an nsSNP
>> (with an rsID such as rs769971095) belongs to and, if possible, which
>> gender. I came to know about the Ensembl "population genetics" page for
>> variants.
>>
>> I found the respective population genetics info for 2 rsIDs; rs559632360
>> & rs769971095
>>
>> For "rs769971095" the super-population it shows is: ALL, AFR, AMR, ASJ,
>> EAS, FIN, NFE, OTH, SAS.
>>
>> For "rs559632360" the super-population it shows is: ALL, AFR, AMR, EAS,
>> SAS, EUR.
>>
>>
>> For rs559632360 rsID, it also shows population genetics from "1000
>> Genomes Project Phase 3 & gnomAD exomes" along with "subpopulation"
>> information, whereas, for rs769971095 it shows only "gnomAD exomes"
>> population genetics.
>>
>> http://grch37.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=3:12625875-12626875;v=rs769971095;vdb=variation;vf=135759093
>>
>> http://grch37.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=3:12632759-12633759;v=rs559632360;vdb=variation;vf=92299087#population_freq_SAS
>>
>> Does this mean that for "rs769971095" there is no "1000 genomes project
>> phase 3" data available?
>>
>> I am interested to know whether these two rsIDs belong to one population.
>> Can it be said that these rsIDs share the same population? If yes, which
>> population do they share? It would be great if I could know how to make a
>> reasonable interpretation of this.
>>
>> Also, I need to do this for many rsIDs. Could you please let me know how
>> this process can be automated, so that I can generate results like this:
>>
>>
>> rsID          Super-population with allele frequencies         Sub-population
>> rs769971095   ALL, AFR, AMR, ASJ, EAS, FIN, NFE, OTH, SAS      .......etc
>> rs559632360   ALL, AFR, AMR, EAS, SAS, EUR                     ......etc
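
A small Python sketch of the kind of table asked for here, using the super-population labels quoted earlier in this message as input. The function names are made up for this example. Note the caveat that a label such as AFR can come from different projects (1000 Genomes or gnomAD), so a shared label is only a starting point for interpretation, not proof that the same cohort carries both variants.

```python
# Super-population labels as listed in the message for each rsID.
populations_by_rsid = {
    "rs769971095": ["ALL", "AFR", "AMR", "ASJ", "EAS", "FIN", "NFE", "OTH", "SAS"],
    "rs559632360": ["ALL", "AFR", "AMR", "EAS", "SAS", "EUR"],
}


def format_table(populations):
    """Return a tab-separated table with one row per rsID."""
    lines = ["rsID\tSuper-populations"]
    for rsid, labels in populations.items():
        lines.append(f"{rsid}\t{', '.join(labels)}")
    return "\n".join(lines)


def shared_labels(populations):
    """Return the sorted labels that appear for every rsID."""
    label_sets = [set(labels) for labels in populations.values()]
    return sorted(set.intersection(*label_sets)) if label_sets else []


print(format_table(populations_by_rsid))
print("Shared labels:", ", ".join(shared_labels(populations_by_rsid)))
```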
>>
>>
>>
>> Thanks much! DK
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/