From ms30 at ebi.ac.uk Mon Mar 1 14:12:04 2021
From: ms30 at ebi.ac.uk (Michal Szpak)
Date: Mon, 01 Mar 2021 14:12:04 +0000
Subject: [ensembl-dev] Free Ensembl Browser and REST API virtual workshops in March
Message-ID: <7ba6e647b41a2f2e72f1917855a1009c@ebi.ac.uk>

Hello,

We've got another round of the free virtual Ensembl workshops covering the genome browser and the REST API. The Browser workshop will be held between Tuesday 16th March – Thursday 18th March 2021 (9am-1pm GMT) and the REST API workshop will be held between Tuesday 23rd March – Thursday 25th March 2021 (9am-11:30am GMT).

Both courses include interactive exercise sessions, providing opportunities to ask questions and discuss them with both the instructors and the rest of the participants.

For more information, please visit our blog post:
https://www.ensembl.info/2021/02/26/free-ensembl-browser-and-rest-api-virtual-workshops-in-march/

You can find the registration form here:
https://forms.gle/FtbYuEdAsqp9NH4q6

Best wishes,
Michal

--
Michał Szpak, Ph.D.
Ensembl Outreach Officer
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Genome Campus
Hinxton, Cambridge, CB10 1SD, UK

From ms30 at ebi.ac.uk Tue Mar 9 12:08:21 2021
From: ms30 at ebi.ac.uk (Michal Szpak)
Date: Tue, 09 Mar 2021 12:08:21 +0000
Subject: [ensembl-dev] Jobs at Ensembl
Message-ID: <63A800B6-2320-426F-9020-0761295D3820@ebi.ac.uk>

Hello dev-ers,

We have four jobs open at the moment, with the Genome Annotator application closing tonight:

Job: Genome Annotator
We're seeking a genome annotator to investigate cell-type and single-cell specific isoform expression in human and model organisms. We're looking for a BSc, MSc or equivalent experience in Molecular or Cell Biology, Genetics, Genomics or related fields. Closes 9th March.
https://www.ensembl.info/2021/02/09/job-genome-annotator-2/

Job: Ensembl Metazoan Bioinformatician
We're seeking a bioinformatician to develop software and scale up processes enabling analysis of thousands of genomes. We're looking for an MSc/PhD in computational biology, bioinformatics or equivalent work experience, proficiency in a dynamic programming language and relational databases, and experience of working with large-scale genomic data. Closes 11th March.
https://www.ensembl.info/2021/02/09/job-ensembl-metazoan-bioinformatician/

Job: Production Software Developer
We're seeking a Python software developer to help implement the next generation of genome-scale Ensembl processing automation. We're looking for a degree in computer science, bioinformatics, biological sciences or equivalent experience and a minimum of two years of professional programming experience. Closes 31st March.
https://www.ensembl.info/2021/03/08/job-production-software-developer/

Job: Ensembl Plants Project Leader
We're seeking a Project Leader to help develop and manage the Ensembl Plants resource. We're looking for a BSc or higher degree in computer science, bioinformatics, biological sciences or equivalent professional qualifications, and experience in plant genomics, delivering production-quality software and computational expertise. Closes 13th April.
https://www.ensembl.info/2021/03/08/job-ensembl-plants-project-leader/

Please follow the instructions in the ad to apply.

All the best,
Michal

--
Michał Szpak, Ph.D.
Ensembl Outreach Officer
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Genome Campus
Hinxton, Cambridge, CB10 1SD, UK

From julie.sullivan at gmail.com Wed Mar 10 09:08:32 2021
From: julie.sullivan at gmail.com (Julie Sullivan)
Date: Wed, 10 Mar 2021 09:08:32 +0000
Subject: [ensembl-dev] translation question GTG --> Methionine?
Message-ID:

https://www.ensembl.org/Homo_sapiens/Transcript/Sequence_cDNA?db=core;g=ENSG00000288649;r=20:33667144-33668235;t=ENST00000678634

The first codon is GTG. I would not have expected that to be Methionine.

I looked in the text files, and there are 123 of these transcripts where the start codon is NOT ATG but the aa is M, in Homo sapiens.

{'error': 0,
 'methionine': 91434,
 'GTG': 22,
 'ATA': 10,
 'CTG': 67,
 'ACG': 8,
 'TTG': 9,
 'ATT': 5,
 'AAC': 1,
 'AAG': 1}

Why is that?

Specifically I would like a rule I can use, as my HGVSp strings are different from VEP for this reason.

Thanks!
Julie

From emily at ebi.ac.uk Wed Mar 10 10:08:24 2021
From: emily at ebi.ac.uk (Emily Perry)
Date: Wed, 10 Mar 2021 10:08:24 +0000
Subject: [ensembl-dev] translation question GTG --> Methionine?
In-Reply-To: <20210310091026.739B0118034_488D02B@hh-mx4.ebi.ac.uk>
References: <20210310091026.739B0118034_488D02B@hh-mx4.ebi.ac.uk>
Message-ID: <5FF48C1B-58B2-4301-B010-B3A4186A22ED@ebi.ac.uk>

Hi Julie

We have some information about non-ATG start codons in our blog post from release 102:
https://www.ensembl.info/2020/11/30/ensembl-102-has-been-released/

Quite simply, there is not a rule. This is a situation of exceptional biology which we are only able to annotate correctly because of our expert manual gene annotators analysing the data in detail.

All the best

Emily

> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
> Ensembl Blog: http://www.ensembl.info/

--
Dr Emily Perry (Pritchard)
Ensembl Outreach Project Leader
(she/her)

European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Genome Campus
Hinxton
Cambridge
CB10 1SD
UK

From julie.sullivan at gmail.com Wed Mar 10 14:37:43 2021
From: julie.sullivan at gmail.com (Julie Sullivan)
Date: Wed, 10 Mar 2021 14:37:43 +0000
Subject: [ensembl-dev] translation question GTG --> Methionine?
In-Reply-To: <5FF48C1B-58B2-4301-B010-B3A4186A22ED@ebi.ac.uk>
References: <20210310091026.739B0118034_488D02B@hh-mx4.ebi.ac.uk> <5FF48C1B-58B2-4301-B010-B3A4186A22ED@ebi.ac.uk>
Message-ID:

Thank you! That answers my question!

I would really like to be able to access that tag (non-ATG start) programmatically. Are there plans for putting it with the other transcript flags in the GTF file?

From emily at ebi.ac.uk Wed Mar 10 15:58:00 2021
From: emily at ebi.ac.uk (Emily Perry)
Date: Wed, 10 Mar 2021 15:58:00 +0000
Subject: [ensembl-dev] translation question GTG --> Methionine?
In-Reply-To: <20210310143922.DD451118037_48DA1AB@hh-mx4.ebi.ac.uk>
References: <20210310091026.739B0118034_488D02B@hh-mx4.ebi.ac.uk> <5FF48C1B-58B2-4301-B010-B3A4186A22ED@ebi.ac.uk> <20210310143922.DD451118037_48DA1AB@hh-mx4.ebi.ac.uk>
Message-ID: <9729AFDF-083F-4B9D-9FF9-2910F4843184@ebi.ac.uk>

Hi Julie

The information is stored in the transcript_attrib table under attrib_type_id 380:
http://www.ensembl.org/info/docs/api/core/core_schema.html#transcript_attrib

You can fetch it from the Perl API using $transcript->get_all_Attributes:
http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Transcript.html#a59f9ff2079a28ba80bc8b62a5e636327

All the best

Emily

--
Dr Emily Perry (Pritchard)
Ensembl Outreach Project Leader
(she/her)

European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Genome Campus
Hinxton
Cambridge
CB10 1SD
UK

From hpages.on.github at gmail.com Wed Mar 10 23:06:59 2021
From: hpages.on.github at gmail.com (Hervé Pagès)
Date: Wed, 10 Mar 2021 15:06:59 -0800
Subject: [ensembl-dev] translation question GTG --> Methionine?
In-Reply-To: <9729AFDF-083F-4B9D-9FF9-2910F4843184@ebi.ac.uk>
References: <20210310091026.739B0118034_488D02B@hh-mx4.ebi.ac.uk> <5FF48C1B-58B2-4301-B010-B3A4186A22ED@ebi.ac.uk> <20210310143922.DD451118037_48DA1AB@hh-mx4.ebi.ac.uk> <9729AFDF-083F-4B9D-9FF9-2910F4843184@ebi.ac.uk>
Message-ID: <87c5bac1-05b7-990a-a760-ece76dfd113d@gmail.com>

Hi Julie, Emily,

It's worth noting that the TTG and CTG start codons are totally expected and officially part of the Standard Code:
https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1

Also the GTG start codon is expected for transcripts found on the Mitochondrial chromosome and is officially part of the Vertebrate Mitochondrial Code:
https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG2

So what breaks the rule in the case of transcript ENST00000678634.1 is that the GTG start codon is found on chromosome 20.

H.

--
Hervé Pagès
Bioconductor Core Team
hpages.on.github at gmail.com

From ms30 at ebi.ac.uk Mon Mar 15 16:15:13 2021
From: ms30 at ebi.ac.uk (Michal Szpak)
Date: Mon, 15 Mar 2021 16:15:13 +0000
Subject: [ensembl-dev] Declarations of Intentions for Ensembl 104 and Ensembl Genomes 51
Message-ID: <589d012fcc7e9869b0d91fcd6d7b271c@ebi.ac.uk>

Dear All,

Ensembl release 104 and Ensembl Genomes release 51 are scheduled for mid April 2021.

Please find the summary of the declarations of intentions in our blog:
https://www.ensembl.info/2021/03/15/whats-coming-in-ensembl-release-104-ensembl-genomes-51/

As with all releases, please note that these are intentions and are not guaranteed to make it into the releases.

Best wishes,
Michal
On behalf of the Ensembl team

--
Michał Szpak, Ph.D.
Ensembl Outreach Officer
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Genome Campus
Hinxton, Cambridge, CB10 1SD, UK

From anja at ebi.ac.uk Mon Mar 22 11:37:04 2021
From: anja at ebi.ac.uk (Anja Thormann)
Date: Mon, 22 Mar 2021 11:37:04 +0000
Subject: [ensembl-dev] FTP + variation rs id synonym mappings
In-Reply-To: <20210218180950.5BAFF62CEC2_2EAD6EB@hh-mx3.ebi.ac.uk>
References: <20210212053031.CF15562AD6E_261277B@hh-mx3.ebi.ac.uk> <20210218180950.5BAFF62CEC2_2EAD6EB@hh-mx3.ebi.ac.uk>
Message-ID: <08F52DB5-11DD-4660-A744-F74C6551AE33@ebi.ac.uk>

Hi Danny,

you will get the most detailed information on the merge history of an rs id from dbSNP.
I recommend that you take a look at dbSNP's API:
https://api.ncbi.nlm.nih.gov/variation/v0/

Or flat files from:
https://ftp.ncbi.nih.gov/snp/latest_release/

This file contains the merge information:
https://ftp.ncbi.nih.gov/snp/latest_release/JSON/refsnp-merged.json.bz2

And here is an example of using the API:
Getting information for rs10001600 (https://www.ncbi.nlm.nih.gov/snp/rs10001600):
https://api.ncbi.nlm.nih.gov/variation/v0/beta/refsnp/10001600
where merged_snapshot_data stores the id history.

We are not extracting the full merge history for each rs id into Ensembl, so our data would not give a complete picture. We therefore decided against adding this information to our data dumps.

Best wishes,
Anja

> On 18 Feb 2021, at 18:08, Andrew Parton wrote:
>
> Hi Danny,
>
> Currently, we do not have a file that contains all of these mappings. However, VEP will allow you to annotate your VCFs with the variation synonym data that we have, by providing known synonyms for colocated variants: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_var_synonyms
>
> Additionally, it may be possible for us to generate these synonyms in a single file as part of our next release; however, VEP should be a quicker solution for you.
>
> Kind Regards,
> Andrew
>
>> On 12 Feb 2021, at 05:29, danny.kunz at gmx.de wrote:
>>
>> Hi all,
>>
>> Quick question:
>>
>> Our pipeline has to deal with VCFs from older assembly releases from the GRCh37 branch.
>>
>> We tried utilizing the FTP variation VCF files, but realized that only about 40% of the patient VCF ids matched within the FTP variation data.
>>
>> Obviously the old rs ids (synonyms) from the older assemblies are not contained in those newer releases.
>>
>> Is there any file on the FTP which contains those synonym mappings?
>>
>> Calling the REST API does not cause a problem with the old rs ids as it translates them to the newer ones, but if we want to reduce the REST communication overhead, it would be helpful to be able to achieve the same with the FTP data, right?
>>
>> Thanks,
>> Danny

From danny.kunz at gmx.de Thu Mar 25 14:26:19 2021
From: danny.kunz at gmx.de (danny.kunz at gmx.de)
Date: Thu, 25 Mar 2021 15:26:19 +0100
Subject: [ensembl-dev] FTP + variation rs id synonym mappings
In-Reply-To: <08F52DB5-11DD-4660-A744-F74C6551AE33@ebi.ac.uk>
References: <20210212053031.CF15562AD6E_261277B@hh-mx3.ebi.ac.uk> <20210218180950.5BAFF62CEC2_2EAD6EB@hh-mx3.ebi.ac.uk> <08F52DB5-11DD-4660-A744-F74C6551AE33@ebi.ac.uk>
Message-ID: <06ea01d72182$d38bd260$7aa37720$@gmx.de>

Hi Anja,

thank you very much for that information! The FTP data set from dbSNP was exactly what I was searching for. The data size is pretty large, but that should not be a problem for our pipeline.

Thanks for pointing me to it!

Best regards,
Danny

From vishalm434 at gmail.com Fri Mar 26 18:48:32 2021
From: vishalm434 at gmail.com (Vishal Mishra)
Date: Sat, 27 Mar 2021 00:18:32 +0530
Subject: [ensembl-dev] Interested in Contributing Ensembl Compara
In-Reply-To:
References:
Message-ID:

Hi,

I am Vishal Mishra, an undergraduate student in the CS department at IIIT Lucknow. I found this project very interesting. I have cloned the API and run the tutorial given on the official website. I am looking to work on the test suite for the Ensembl database update that happens twice a year. I am familiar with software development and testing in Python and eager to learn Perl for this project. Can someone point out how to proceed further?

From rohitlucknow14 at gmail.com Sat Mar 27 06:59:53 2021
From: rohitlucknow14 at gmail.com (Rohit Verma)
Date: Sat, 27 Mar 2021 12:29:53 +0530
Subject: [ensembl-dev] Gsoc 2021
Message-ID:

Hi everyone,

I am Rohit Verma, an undergraduate student in the CS branch at IIIT Lucknow, India. I found this project very interesting and have a keen interest in contributing to it. I am looking forward to working on the idea "Compliance Test Suite for Ensembl Web APIs". I have a good knowledge of Python and its testing framework pytest. Can any of the mentors or members guide me on how to proceed further?
From ayates at ebi.ac.uk Mon Mar 29 10:10:48 2021
From: ayates at ebi.ac.uk (Andy Yates)
Date: Mon, 29 Mar 2021 10:10:48 +0100
Subject: [ensembl-dev] Gsoc 2021
In-Reply-To: <20210327070118.BF5CC62B929_5ED83EB@hh-mx3.ebi.ac.uk>
References: <20210327070118.BF5CC62B929_5ED83EB@hh-mx3.ebi.ac.uk>
Message-ID:

Dear Rohit,

Many thanks for the email. I have set up a ticket for you in our Helpdesk system, where we'll be able to send you more information concerning the project.

Andy

------------
Have you filled in EMBL-EBI's impact survey? We'd really appreciate your input.
https://www.surveymonkey.co.uk/r/EMBL-EBI_Impact_PE

Andrew Yates - Genomics Technology Infrastructure Team Leader, Deputy Head Ensembl Project
The European Bioinformatics Institute (EMBL-EBI)
Wellcome Genome Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
Skype: andy.yates.ebi
http://www.ebi.ac.uk/
http://www.ensembl.org/
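
Note on the "translation question GTG --> Methionine?" thread above: the point about the NCBI translation tables can be sketched in Python. This is only an illustration, not a rule for predicting curated start codons (as stated in the thread, there is no such rule, and the annotated flag lives in the transcript attributes). The function names classify_start_codon and tally_start_codons are hypothetical, and the caller is assumed to know whether a transcript is mitochondrial. The start codon sets are those listed for NCBI translation tables 1 and 2.

```python
from collections import Counter

# Initiation codons listed for NCBI translation table 1 (Standard Code)
# and table 2 (Vertebrate Mitochondrial Code).
STANDARD_STARTS = {"ATG", "CTG", "TTG"}
VERTEBRATE_MITO_STARTS = {"ATG", "ATA", "ATT", "ATC", "GTG"}


def classify_start_codon(cds, mitochondrial=False):
    """Classify the first codon of a coding sequence (hypothetical helper).

    Returns "canonical" for ATG, "alternative_in_table" for a non-ATG codon
    that the relevant translation table lists as a start, and "not_in_table"
    otherwise (for example GTG on a nuclear chromosome).
    """
    codon = cds[:3].upper()
    if codon == "ATG":
        return "canonical"
    table = VERTEBRATE_MITO_STARTS if mitochondrial else STANDARD_STARTS
    return "alternative_in_table" if codon in table else "not_in_table"


def tally_start_codons(transcripts):
    """Count (start codon, class) pairs.

    `transcripts` is an iterable of (cds_sequence, is_mitochondrial) pairs.
    """
    counts = Counter()
    for cds, is_mito in transcripts:
        counts[(cds[:3].upper(), classify_start_codon(cds, is_mito))] += 1
    return counts
```

With this sketch, a nuclear transcript starting with GTG is classified as "not_in_table", while the same codon on a mitochondrial transcript is "alternative_in_table". That matches the observation in the thread that what is unusual about ENST00000678634 is a GTG start on chromosome 20.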