[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)

Pablo Marin-Garcia pg4 at sanger.ac.uk
Tue Feb 8 13:46:01 GMT 2011


On Tue, 8 Feb 2011, Fiona Cunningham wrote:

> hi Pablo,
>
> We do not store deleted SS IDs from dbSNP. If you need this
> information you can search the NCBI using a URL with this format:
> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624
> regards,
>
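
For scripting, the URL format above can be generated from an SS ID. This is only a minimal sketch in Python: the helper name `dbsnp_subsnp_url` and the ID check are my own assumptions, and only the base URL and the `subsnp_id` parameter come from the message above.

```python
from urllib.parse import urlencode

# Base URL taken from the example in the quoted message.
DBSNP_RETRIEVE = "http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi"


def dbsnp_subsnp_url(ss_id: str) -> str:
    """Build the dbSNP lookup URL for a submitted-SNP ID such as 'ss113307624'.

    Hypothetical helper: it only formats the URL and does not contact NCBI.
    """
    ss_id = ss_id.strip()
    # Assumed ID shape: the prefix 'ss' followed by digits only.
    if not (ss_id.startswith("ss") and ss_id[2:].isdigit()):
        raise ValueError(f"not a subsnp ID: {ss_id!r}")
    return f"{DBSNP_RETRIEVE}?{urlencode({'subsnp_id': ss_id})}"


if __name__ == "__main__":
    print(dbsnp_subsnp_url("ss113307624"))
```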

Thanks Fiona,

One more question about this.

If I have a chip that is listed in variation.source (like affy_500k), then given
an rs ID on the chip, can I retrieve (with the API or SQL) the SS ID that the
chip provider submitted to dbSNP?

   -Pablo


> Fiona
>
> ------------------------------------------------------
> Fiona Cunningham
> Ensembl Variation Project Leader, EBI
> www.ensembl.org
> www.lrg-sequence.org
> t: 01223 494612 || e: fiona at ebi.ac.uk
>
>
>
> On 8 February 2011 11:21, Pontus Larsson <Pontus.Larsson at ebi.ac.uk> wrote:
>> In the variation_synonym table, we do store rsIds that have been merged
>> (e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can get
>> the corresponding variations by e.g. searching on the website or querying
>> the API. However, we currently do not store data for rsIds or ssIds that
>> have been removed from the database.
>>
>> Cheers
>> /Pontus
>>
>>
>> On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
>>>
>>> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>>>
>>>> Hi Pablo,
>>>>
>>>> The validation status terms in the file on the ftp server were taken
>>>> directly from the dbSNP database, and the exact phrasing is not always the
>>>> same as the one we use. I've updated the file to use the same terms that
>>>> you would find in Ensembl. I hope that clears up the confusion.
>>>>
>>>> In Ensembl 60, the variation database was built from dbSNP release 131
>>>> while the current database is built on dbSNP 132. Generally we don't have
>>>> access to archived versions of the dbSNP database, so I can't really tell
>>>> you why particular validation statuses have changed. For rs56, I was able to
>>>> access the dbSNP data for release 130 and one of the subsnps clustered into
>>>> this rs was indeed submitted by 1000Genomes (ss113307624). When you search
>>>> for this subsnp on the dbSNP website, it appears to have failed quality
>>>> controls
>>>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624),
>>>> so that would be your explanation. For the other rsIds, I'd recommend that
>>>> you contact the dbSNP helpdesk directly to find out the details.
>>>>
>>>> Hope this helps
>>>> /Pontus
>>>
>>> Thanks a lot Pontus.
>>>
>>> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that has
>>> been removed, would I get only undef? If you don't have it now, do you
>>> plan to have this info in the future? The logic for keeping them would be
>>> the same as for the ensembl_qc failed SNPs, which are not removed from
>>> the database but are filtered out of queries by default unless explicitly
>>> requested.
>>>
>>> If I have an SS or RS from an old clinical study, I would like to be
>>> able to handle these situations in my scripts (log why this SNP or SS is no
>>> longer available instead of returning 'not_found'). If Ensembl does not store
>>> this but dbSNP still has it, does anyone know how to retrieve these cases
>>> with NCBI biotools/eUtils or a similar tool (I have not used them for several
>>> years)?
>>>
>>>
>>>  -Pablo
>>>
>>>
>>>>
>>>>
>>>
>>>
>>>
>>> -----
>>>
>>>  Pablo Marin-Garcia
>>>
>>>
>>>
>>
>> --
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> Pontus Larsson, Ph.D.
>> Ensembl Variation
>>
>> EMBL-EBI
>> Wellcome Trust Genome Campus
>> Hinxton, Cambridge, CB10 1SD
>> UK
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>>
>> _______________________________________________
>> Dev mailing list
>> Dev at ensembl.org
>> http://lists.ensembl.org/mailman/listinfo/dev
>>
>
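
To make the scripting point from the quoted discussion concrete: a script could report why an rs ID does not resolve instead of returning a bare 'not_found'. The sketch below is only an illustration in Python. The function `resolve_rs`, the status strings, and the two in-memory lookups are hypothetical stand-ins for whatever the API or SQL would return; the merged example (rs67044803 into rs67041202) is the one Pontus gave.

```python
import logging

log = logging.getLogger("snp_lookup")


def resolve_rs(rs_id, current_ids, merged_into):
    """Classify an rs ID and return (status, resolved_id).

    current_ids : set of rs IDs present as current variations (assumed input)
    merged_into : dict mapping a merged rs ID to the rs ID it was merged into,
                  i.e. the kind of data held in variation_synonym (assumed input)
    """
    if rs_id in current_ids:
        return ("current", rs_id)
    if rs_id in merged_into:
        target = merged_into[rs_id]
        log.info("%s was merged into %s", rs_id, target)
        return ("merged", target)
    # Removed IDs are not stored, so all we can say is that the ID is absent;
    # it may have been deleted from dbSNP and would need checking at NCBI.
    log.warning("%s not found; possibly removed from dbSNP", rs_id)
    return ("not_in_ensembl", None)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    current = {"rs67041202"}
    merged = {"rs67044803": "rs67041202"}
    for rs in ("rs67041202", "rs67044803", "rs0"):
        print(rs, resolve_rs(rs, current, merged))
```

Keeping the status separate from the resolved ID lets the caller decide whether a merged ID should be followed silently or reported.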


-----

   Pablo Marin-Garcia

