[ensembl-dev] FTP + variation rs id synonym mappings
anja at ebi.ac.uk
Mon Mar 22 11:37:04 GMT 2021
You will get the most detailed information on the merge history of an rs id from dbSNP.
I recommend that you take a look at dbSNP's API:
Or flat files from:
This file contains the merge information: https://ftp.ncbi.nih.gov/snp/latest_release/JSON/refsnp-merged.json.bz2
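As a rough sketch of how the flat file could be turned into a local lookup table: the code below assumes that the decompressed file holds one JSON object per line, and that each object carries a "refsnp_id" plus a "merged_snapshot_data" object with a "merged_into" list. Those field names are my reading of the dbSNP JSON layout and should be checked against a few real lines first. The function names build_merge_map and load_merge_map are made up for this example.

```python
import bz2
import json
from typing import Dict, Iterable, List


def build_merge_map(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each retired rs id to the rs id(s) it was merged into.

    Assumes one JSON object per line with "refsnp_id" and
    "merged_snapshot_data" -> "merged_into" (assumed field names).
    """
    merge_map: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        old_id = record.get("refsnp_id")
        snapshot = record.get("merged_snapshot_data") or {}
        targets = snapshot.get("merged_into") or []
        if old_id and targets:
            merge_map["rs" + str(old_id)] = ["rs" + str(t) for t in targets]
    return merge_map


def load_merge_map(path: str) -> Dict[str, List[str]]:
    """Stream the bz2-compressed file without unpacking it to disk first."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return build_merge_map(handle)
```

With a local copy of the file, load_merge_map("refsnp-merged.json.bz2") would give a dictionary that can translate old ids found in a VCF without any REST calls. A merge target may itself have been merged later, so a lookup may need to be repeated until the id no longer appears as a key.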
And here is an example of using the API:
Getting information for rs10001600 (https://www.ncbi.nlm.nih.gov/snp/rs10001600):
https://api.ncbi.nlm.nih.gov/variation/v0/beta/refsnp/10001600 where merged_snapshot_data stores the id history.
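A minimal sketch of the same lookup from Python, using only the standard library. The URL pattern is the one given above; the "merged_into" key inside merged_snapshot_data is an assumption about the response layout, and fetch_refsnp and merged_into are names invented for this example.

```python
import json
import urllib.request
from typing import List

# URL pattern taken from the example above; the numeric part of the rs id goes last.
API_URL = "https://api.ncbi.nlm.nih.gov/variation/v0/beta/refsnp/{}"


def fetch_refsnp(rs_number: int, timeout: float = 30.0) -> dict:
    """Download the JSON record for one rs id (needs network access)."""
    with urllib.request.urlopen(API_URL.format(rs_number), timeout=timeout) as response:
        return json.load(response)


def merged_into(record: dict) -> List[str]:
    """Return the rs ids this record was merged into, or [] if there are none.

    Assumes the targets live under merged_snapshot_data -> merged_into.
    """
    snapshot = record.get("merged_snapshot_data") or {}
    return ["rs" + str(target) for target in snapshot.get("merged_into", [])]
```

For example, merged_into(fetch_refsnp(10001600)) would list the current id(s) for rs10001600 if the record has been merged. For more than a handful of ids the flat file is likely the better route, since it avoids one request per variant.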
We do not extract the full merge history for each rs id into Ensembl, so our data would not give a complete picture. For that reason we decided against adding this information to our data dumps.
> On 18 Feb 2021, at 18:08, Andrew Parton <aparton at ebi.ac.uk> wrote:
> Hi Danny,
> Currently, we do not have a file that contains all of these mappings. However, VEP will allow you to annotate your VCFs with the variation synonym data that we have, by providing known synonyms for colocated variants: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_var_synonyms
> Additionally, it may be possible for us to generate these synonyms in a single file as part of our next release; however, VEP should be a quicker solution for you.
> Kind Regards,
>> On 12 Feb 2021, at 05:29, danny.kunz at gmx.de wrote:
>> Hi all,
>> Quick question:
>> Our pipeline has to deal with VCFs from older assembly releases from the GRCh37 branch.
>> We tried utilizing the FTP variation VCF files, but realized that only about 40% of the ids in the patient VCFs matched the FTP variation data.
>> Obviously the old rs ids (synonyms) from the older assemblies are not contained in those newer releases.
>> Is there any file on the FTP which contains those synonym mappings?
>> Calling the REST api does not cause a problem with the old rs ids as it translates them to the newer ones, but if we want to reduce the REST communication overhead, it would be helpful to be able to achieve the same with the FTP data, right?
>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>> Posting guidelines and subscribe/unsubscribe info: https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org <https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org>
>> Ensembl Blog: http://www.ensembl.info/ <http://www.ensembl.info/>