[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)

Pontus Larsson Pontus.Larsson at ebi.ac.uk
Tue Feb 8 11:21:41 GMT 2011


In the variation_synonym table, we do store rsIds that have been merged 
(e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can 
get the corresponding variations by e.g. searching on the website or 
querying the API. However, we currently do not store data for rsIds or 
ssIds that have been removed from the database.
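
For scripts that need to tell these cases apart, here is a minimal sketch 
in Python of the lookup described above. It is an illustration only: it 
uses an in-memory SQLite database with heavily simplified `variation` and 
`variation_synonym` tables (the column set is an assumption, not the full 
Ensembl schema), and the helper names `build_demo_db` and `resolve_rsid` 
are hypothetical. The only data loaded is the merged pair mentioned above.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with simplified, assumed table layouts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variation (
            variation_id INTEGER PRIMARY KEY,
            name         TEXT UNIQUE
        );
        CREATE TABLE variation_synonym (
            variation_synonym_id INTEGER PRIMARY KEY,
            variation_id         INTEGER REFERENCES variation(variation_id),
            name                 TEXT
        );
        -- rs67044803 has been merged into rs67041202 (example from this thread)
        INSERT INTO variation VALUES (1, 'rs67041202');
        INSERT INTO variation_synonym VALUES (1, 1, 'rs67044803');
        """
    )
    return conn


def resolve_rsid(conn, rsid):
    """Return (status, current_name) for an rsId.

    status is 'current' if the id is a variation name, 'merged' if it is
    only known as a synonym, and 'not_stored' if neither table has it
    (which is what a removed id would look like).
    """
    row = conn.execute(
        "SELECT name FROM variation WHERE name = ?", (rsid,)
    ).fetchone()
    if row:
        return ("current", row[0])

    row = conn.execute(
        """
        SELECT v.name
        FROM variation_synonym vs
        JOIN variation v ON v.variation_id = vs.variation_id
        WHERE vs.name = ?
        """,
        (rsid,),
    ).fetchone()
    if row:
        return ("merged", row[0])

    return ("not_stored", None)


if __name__ == "__main__":
    conn = build_demo_db()
    # 'rs_removed_example' is a made-up placeholder, not a real rsId.
    for rsid in ("rs67041202", "rs67044803", "rs_removed_example"):
        print(rsid, resolve_rsid(conn, rsid))
```

A 'not_stored' result only says that the id is absent from these two 
tables; it cannot say why the id was removed, since that information is 
not stored.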

Cheers
/Pontus


On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>
>> Hi Pablo,
>>
>> The validation status terms in the file on the FTP server were taken 
>> directly from the dbSNP database, and the exact phrasing is not always 
>> the same as the phrasing we use. I've updated the file to use the same 
>> terms that you would find in Ensembl instead. I hope that clears up the 
>> confusion.
>>
>> In Ensembl 60, the variation database was built from dbSNP release 
>> 131 while the current database is built on dbSNP 132. Generally we 
>> don't have access to archived versions of the dbSNP database, so I 
>> can't really tell you why particular validation statuses have 
>> changed. For rs56, I was able to access the dbSNP data for release 
>> 130 and one of the subsnps clustered into this rs was indeed 
>> submitted by 1000Genomes (ss113307624). When you search for this 
>> subsnp on the dbSNP website, it appears to have failed quality 
>> controls 
>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624), 
>> so that would be your explanation. For the other rsIds, I'd recommend 
>> that you contact the dbSNP helpdesk directly to find out the details.
>>
>> Hope this helps
>> /Pontus
>
> Thanks a lot Pontus.
>
> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that 
> has been removed, would I get only undef? If you don't have it now, 
> do you plan to have this info in the future? The logic for having 
> them would be the same as for the ensembl_qc failed SNPs, which are 
> not removed from the database but are filtered out by default in 
> queries unless explicitly requested.
>
> If I have an SS or RS from an old clinical study, I would like 
> to be able to handle these situations in my scripts (log why this SNP 
> or SS is no longer available instead of returning 'not_found'). If 
> Ensembl does not store this but dbSNP still has it, does anyone know 
> how to retrieve these cases with NCBI biotools/eUtils or a similar tool 
> (I have not used them for several years)?
>
>
>   -Pablo
>
>
>>
>>
>
>
>
> -----
>
>   Pablo Marin-Garcia
>
>
>

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pontus Larsson, Ph.D.
Ensembl Variation

EMBL-EBI
Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD
UK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




