[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)
Pontus Larsson
Pontus.Larsson at ebi.ac.uk
Tue Feb 8 14:08:47 GMT 2011
Hi Pablo,
The answer is no, we don't store subsnps to that level of detail. You
may be able to retrieve the subsnp_id for a particular submitter (e.g.
AFFY) via the allele and subsnp_handle tables, but since you would have
to join to the variation (and variation_synonym and source) tables using
the variation_id, there is no guarantee that the subsnp_id would
correspond to the particular source you're looking at.
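
A minimal sketch of that join, run against a toy in-memory SQLite
database rather than a real Ensembl variation database. The table names
are the ones mentioned above; the column names (variation.name,
allele.subsnp_id, subsnp_handle.handle), the toy rows and the helper
names are assumptions for illustration only and should be checked
against the schema of the release you are using:

```python
import sqlite3

# Join variation -> allele -> subsnp_handle on the assumed key columns and
# keep only the subsnps submitted under one handle.
QUERY = """
SELECT DISTINCT a.subsnp_id
FROM variation v
JOIN allele a ON a.variation_id = v.variation_id
JOIN subsnp_handle sh ON sh.subsnp_id = a.subsnp_id
WHERE v.name = ? AND sh.handle = ?
ORDER BY a.subsnp_id
"""


def subsnps_for_handle(conn, rs_name, handle):
    """Return the subsnp_ids linked to rs_name that carry the given handle."""
    return [row[0] for row in conn.execute(QUERY, (rs_name, handle))]


def build_toy_db():
    """Create a tiny stand-in database; every row here is made up."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE variation (variation_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE allele (allele_id INTEGER PRIMARY KEY,
                             variation_id INTEGER,
                             subsnp_id INTEGER,
                             allele TEXT);
        CREATE TABLE subsnp_handle (subsnp_id INTEGER PRIMARY KEY, handle TEXT);
        INSERT INTO variation VALUES (1, 'rs_toy');
        INSERT INTO allele VALUES (1, 1, 101, 'A'), (2, 1, 101, 'G'),
                                  (3, 1, 102, 'A'), (4, 1, 103, 'G');
        INSERT INTO subsnp_handle VALUES (101, 'AFFY'), (102, 'OTHER'),
                                         (103, 'AFFY');
    """)
    return conn


toy = build_toy_db()
affy_ss = subsnps_for_handle(toy, "rs_toy", "AFFY")
print(affy_ss)
```

In the toy data two subsnps come back for the same handle, and nothing in
the join says which of them, if either, belongs to a given chip source
such as affy_500k. That is the ambiguity described above.
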
Cheers
/Pontus
On 08/02/2011 13:46, Pablo Marin-Garcia wrote:
> On Tue, 8 Feb 2011, Fiona Cunningham wrote:
>
>> hi Pablo,
>>
>> We do not store deleted SS IDs from dbSNP. If you need this
>> information you can search the NCBI using a URL with this format:
>> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624
>>
>> regards,
>>
>
> Thanks Fiona,
>
> one more question about this.
>
> If I have a chip that is in variation.source (like affy_500k), given
> an rs in the chip, can I retrieve (with the API or sql) the SS that
> they supplied to dbSNP?
>
> -Pablo
>
>
>> Fiona
>>
>> ------------------------------------------------------
>> Fiona Cunningham
>> Ensembl Variation Project Leader, EBI
>> www.ensembl.org
>> www.lrg-sequence.org
>> t: 01223 494612 || e: fiona at ebi.ac.uk
>>
>>
>>
>> On 8 February 2011 11:21, Pontus Larsson <Pontus.Larsson at ebi.ac.uk> wrote:
>>> In the variation_synonym table, we do store rsIds that have been merged
>>> (e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can
>>> get the corresponding variations by e.g. searching on the website or
>>> querying the API. However, we currently do not store data for rsIds or
>>> ssIds that have been removed from the database.
>>>
>>> Cheers
>>> /Pontus
>>>
>>>
>>> On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
>>>>
>>>> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>>>>
>>>>> Hi Pablo,
>>>>>
>>>>> The validation status terms in the file on the ftp server were taken
>>>>> directly from the dbSNP database and the exact phrasing is not always
>>>>> the same as what we use. I've updated the file to instead use the same
>>>>> terms that you would find in Ensembl. I hope that clears up the
>>>>> confusion.
>>>>>
>>>>> In Ensembl 60, the variation database was built from dbSNP release 131
>>>>> while the current database is built on dbSNP 132. Generally we don't
>>>>> have access to archived versions of the dbSNP database, so I can't
>>>>> really tell you why particular validation statuses have changed. For
>>>>> rs56, I was able to access the dbSNP data for release 130 and one of
>>>>> the subsnps clustered into this rs was indeed submitted by 1000Genomes
>>>>> (ss113307624). When you search for this subsnp on the dbSNP website, it
>>>>> appears to have failed quality controls
>>>>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624),
>>>>> so that would be your explanation. For the other rsIds, I'd recommend
>>>>> that you contact the dbSNP helpdesk directly to find out the details.
>>>>>
>>>>> Hope this helps
>>>>> /Pontus
>>>>
>>>> Thanks a lot Pontus.
>>>>
>>>> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that
>>>> has been removed, would I obtain only undef? If you don't have it now,
>>>> do you plan to have this info in the future? The logic for having them
>>>> would be the same as that used for keeping the ensembl_qc failed SNPs
>>>> in the database but filtering them out by default in queries unless
>>>> explicitly requested.
>>>>
>>>> If I have an SS or RS from an old clinical study, I would like to be
>>>> able to handle these situations in my scripts (log why this SNP or SS
>>>> is no longer available instead of returning 'not_found'). If Ensembl
>>>> does not store this, but dbSNP still has it, does anyone know how to
>>>> retrieve these cases with NCBI biotools/eUtils or a similar tool (I
>>>> have not used it for several years)?
>>>>
>>>>
>>>> -Pablo
>>>>
>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> -----
>>>>
>>>> Pablo Marin-Garcia
>>>>
>>>>
>>>>
>>>
>>> --
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Pontus Larsson, Ph.D.
>>> Ensembl Variation
>>>
>>> EMBL-EBI
>>> Wellcome Trust Genome Campus
>>> Hinxton, Cambridge, CB10 1SD
>>> UK
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>
>>
>
>
> -----
>
> Pablo Marin-Garcia
>
>
>
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pontus Larsson, Ph.D.
Ensembl Variation
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD
UK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~