[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)

Fiona Cunningham fiona at ebi.ac.uk
Tue Feb 8 11:24:44 GMT 2011


Hi Pablo,

We do not store deleted SS IDs from dbSNP. If you need this
information, you can look the ID up at NCBI using a URL of this form:
http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624
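
For anyone scripting this lookup, the URL can be assembled from the ss ID. A minimal Python sketch, assuming the endpoint and the `subsnp_id` parameter shown above stay as they are; the function name `subsnp_lookup_url` and the ID-format check are illustrative assumptions, not part of any NCBI or Ensembl API:

```python
from urllib.parse import urlencode

# Endpoint copied from the URL above.
SNP_RETRIEVE_URL = "http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi"


def subsnp_lookup_url(ss_id: str) -> str:
    """Return the dbSNP lookup URL for a submitted-SNP ID such as 'ss113307624'.

    Hypothetical helper: it only builds the URL and performs no network access.
    """
    ss_id = ss_id.strip()
    # Assumption: an ss ID is the prefix 'ss' followed by digits only.
    if not (ss_id.startswith("ss") and ss_id[2:].isdigit()):
        raise ValueError(f"not a valid ss ID: {ss_id!r}")
    return f"{SNP_RETRIEVE_URL}?{urlencode({'subsnp_id': ss_id})}"


print(subsnp_lookup_url("ss113307624"))
```

This only builds the URL; fetching the page and interpreting what dbSNP reports for a withdrawn ID is left to the caller.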

Regards,

Fiona

------------------------------------------------------
Fiona Cunningham
Ensembl Variation Project Leader, EBI
www.ensembl.org
www.lrg-sequence.org
t: 01223 494612 || e: fiona at ebi.ac.uk



On 8 February 2011 11:21, Pontus Larsson <Pontus.Larsson at ebi.ac.uk> wrote:
> In the variation_synonym table, we do store rsIds that have been merged
> (e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can get
> the corresponding variations by e.g. searching on the website or querying
> the API. However, we currently do not store data for rsIds or ssIds that
> have been removed from the database.
>
> Cheers
> /Pontus
>
>
> On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
>>
>> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>>
>>> Hi Pablo,
>>>
>>> The validation status terms in the file on the ftp server were taken
>>> directly from the dbSNP database and the exact phrasing is not always the
>>> same as the one we use. I've updated the file to instead use the same terms that
>>> you would find in Ensembl. I hope that clears up the confusion.
>>>
>>> In Ensembl 60, the variation database was built from dbSNP release 131
>>> while the current database is built on dbSNP 132. Generally we don't have
>>> access to archived versions of the dbSNP database, so I can't really tell
>>> you why particular validation statuses have changed. For rs56, I was able to
>>> access the dbSNP data for release 130 and one of the subsnps clustered into
>>> this rs was indeed submitted by 1000Genomes (ss113307624). When you search
>>> for this subsnp on the dbSNP website, it appears to have failed quality
>>> controls
>>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624),
>>> so that would be your explanation. For the other rsIds, I'd recommend that
>>> you contact the dbSNP helpdesk directly to find out the details.
>>>
>>> Hope this helps
>>> /Pontus
>>
>> Thanks a lot Pontus.
>>
>> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that has
>> been removed, would I get only undef? If you don't have it now, do you
>> plan to have this info in the future? The logic for keeping them would be
>> the same as for the SNPs that failed Ensembl QC: they are not removed from
>> the database but are filtered out of queries by default unless explicitly
>> requested.
>>
>> If I have an SS or RS from an old clinical study, I would like to be
>> able to handle these situations in my scripts (log why this SNP or SS is no
>> longer available instead of returning 'not_found'). If Ensembl does not store
>> this but dbSNP still has it, does anyone know how to retrieve these cases
>> with NCBI biotools/eUtils or a similar tool (I have not used them for several
>> years)?
>>
>>
>>  -Pablo
>>
>>
>> -----
>>
>>  Pablo Marin-Garcia
>>
>>
>>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Pontus Larsson, Ph.D.
> Ensembl Variation
>
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton, Cambridge, CB10 1SD
> UK
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
>



