[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)

Pontus Larsson Pontus.Larsson at ebi.ac.uk
Tue Feb 8 14:08:47 GMT 2011


Hi Pablo,

The answer is no, we don't store subsnps to that level of detail. You
may be able to retrieve the subsnp_id for a particular submitter (e.g.
AFFY) via the allele and subsnp_handle tables, but since you would have
to join to the variation (and variation_synonym and source) tables using
the variation_id, there is no guarantee that the subsnp_id would
correspond to the particular source you're looking at.

Cheers
/Pontus
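
For readers who want to try the join described above, here is a minimal Python sketch against a toy in-memory SQLite database. The table names (variation, allele, subsnp_handle) come from the message; the column names, the toy rows and the helper names `build_toy_db` and `subsnps_for_handle` are assumptions made for illustration and should be checked against the schema of the Ensembl release you actually use. As noted above, a matching submitter handle does not guarantee that the subsnp belongs to the specific source (e.g. a particular chip) you have in mind.

```python
import sqlite3


def build_toy_db():
    """Create a tiny stand-in for the variation schema (columns are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variation (variation_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE allele (variation_id INTEGER, subsnp_id INTEGER, allele TEXT);
        CREATE TABLE subsnp_handle (subsnp_id INTEGER PRIMARY KEY, handle TEXT);

        -- Made-up rows: one variation with two subsnps from two submitters.
        INSERT INTO variation VALUES (1, 'rs_toy');
        INSERT INTO allele VALUES (1, 1001, 'A');
        INSERT INTO allele VALUES (1, 1001, 'G');
        INSERT INTO allele VALUES (1, 1002, 'A');
        INSERT INTO subsnp_handle VALUES (1001, 'AFFY');
        INSERT INTO subsnp_handle VALUES (1002, 'OTHER_LAB');
        """
    )
    return conn


def subsnps_for_handle(conn, rs_name, handle):
    """Return the distinct subsnp_ids of a variation submitted under a handle."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.subsnp_id
        FROM variation v
        JOIN allele a ON a.variation_id = v.variation_id
        JOIN subsnp_handle h ON h.subsnp_id = a.subsnp_id
        WHERE v.name = ? AND h.handle = ?
        ORDER BY a.subsnp_id
        """,
        (rs_name, handle),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(subsnps_for_handle(build_toy_db(), "rs_toy", "AFFY"))
```

The DISTINCT is there because the toy allele table holds one row per allele, so a subsnp with two alleles would otherwise be listed twice.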


On 08/02/2011 13:46, Pablo Marin-Garcia wrote:
> On Tue, 8 Feb 2011, Fiona Cunningham wrote:
>
>> hi Pablo,
>>
>> We do not store deleted SS IDs from dbSNP. If you need this
>> information you can search the NCBI using a URL with this format:
>> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624 
>>
>> regards,
>>
>
> Thanks Fiona,
>
> one more question about this.
>
> If I have a chip that is in variation.source (like affy_500k), given 
> an rs in the chip, can I retrieve (with the API or sql) the SS that 
> they supplied to dbSNP?
>
>   -Pablo
>
>
>> Fiona
>>
>> ------------------------------------------------------
>> Fiona Cunningham
>> Ensembl Variation Project Leader, EBI
>> www.ensembl.org
>> www.lrg-sequence.org
>> t: 01223 494612 || e: fiona at ebi.ac.uk
>>
>>
>>
>> On 8 February 2011 11:21, Pontus Larsson <Pontus.Larsson at ebi.ac.uk> wrote:
>>> In the variation_synonym table, we do store rsIds that have been merged
>>> (e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can get
>>> the corresponding variations by e.g. searching on the website or querying
>>> the API. However, we currently do not store data for rsIds or ssIds that
>>> have been removed from the database.
>>>
>>> Cheers
>>> /Pontus
>>>
>>>
>>> On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
>>>>
>>>> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>>>>
>>>>> Hi Pablo,
>>>>>
>>>>> The validation status terms in the file on the ftp server were taken
>>>>> directly from the dbSNP database and the exact phrasing is not always the
>>>>> same as what we use. I've updated the file to instead use the same terms
>>>>> that you would find in Ensembl. I hope that clears up the confusion.
>>>>>
>>>>> In Ensembl 60, the variation database was built from dbSNP release 131
>>>>> while the current database is built on dbSNP 132. Generally we don't have
>>>>> access to archived versions of the dbSNP database, so I can't really tell
>>>>> you why particular validation statuses have changed. For rs56, I was able
>>>>> to access the dbSNP data for release 130 and one of the subsnps clustered
>>>>> into this rs was indeed submitted by 1000Genomes (ss113307624). When you
>>>>> search for this subsnp on the dbSNP website, it appears to have failed
>>>>> quality controls
>>>>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624),
>>>>> so that would be your explanation. For the other rsIds, I'd recommend that
>>>>> you contact the dbSNP helpdesk directly to find out the details.
>>>>>
>>>>> Hope this helps
>>>>> /Pontus
>>>>
>>>> Thanks a lot Pontus.
>>>>
>>>> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that has
>>>> been removed, would I obtain only undef? If you don't have it now, do you
>>>> plan to have this info in the future? The logic for having them would be
>>>> the same as that used for keeping the ensembl_qc failed SNPs in the
>>>> database but filtering them out by default in queries unless explicitly
>>>> requested.
>>>>
>>>> If I have an SS or RS from an old clinical study, I would like to be
>>>> able to handle these situations in my scripts (log why this SNP or SS is
>>>> no longer available instead of returning 'not_found'). If Ensembl does not
>>>> store this, but dbSNP still has it, does anyone know how to retrieve these
>>>> cases with NCBI biotools/eUtils or a similar tool (I have not used it for
>>>> several years)?
>>>>
>>>>
>>>>  -Pablo
>>>>
>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> -----
>>>>
>>>>  Pablo Marin-Garcia
>>>>
>>>>
>>>>
>>>
>>> -- 
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Pontus Larsson, Ph.D.
>>> Ensembl Variation
>>>
>>> EMBL-EBI
>>> Wellcome Trust Genome Campus
>>> Hinxton, Cambridge, CB10 1SD
>>> UK
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>
>>
>
>
> -----
>
>   Pablo Marin-Garcia
>
>
>

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pontus Larsson, Ph.D.
Ensembl Variation

EMBL-EBI
Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD
UK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
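
For scripts that want to log a dbSNP lookup link for an ssId that is no longer in Ensembl, the URL format quoted earlier in this thread can be assembled programmatically. This is only a sketch: the base address and the subsnp_id parameter are taken from the thread as written in 2011, the helper name `subsnp_lookup_url` is hypothetical, and the thread does not confirm whether NCBI still serves this address. The function builds a string and makes no network request.

```python
from urllib.parse import urlencode

# Base address as quoted in the thread (2011); may no longer resolve.
NCBI_SNP_RETRIEVE = "http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi"


def subsnp_lookup_url(ss_id: str) -> str:
    """Build the dbSNP lookup URL for a submitted-SNP id such as 'ss113307624'."""
    ss_id = ss_id.strip()
    # Accept only the 'ss<digits>' form used in the thread.
    if not (ss_id.startswith("ss") and ss_id[2:].isdigit()):
        raise ValueError(f"expected an id like 'ss113307624', got {ss_id!r}")
    return f"{NCBI_SNP_RETRIEVE}?{urlencode({'subsnp_id': ss_id})}"


if __name__ == "__main__":
    print(subsnp_lookup_url("ss113307624"))
```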




