[ensembl-dev] about dbSNP deleted snps (was Re: Transcript variation alleles -)
Pablo Marin-Garcia
pg4 at sanger.ac.uk
Tue Feb 8 14:14:03 GMT 2011
On Tue, 8 Feb 2011, Pontus Larsson wrote:
> Hi Pablo,
>
> The answer is no, we don't store subsnps to that level of detail. You may be
> able to retrieve the subsnp_id for a particular submitter (e.g. AFFY) via the
> allele and subsnp_handle tables, but since you would have to join to the
> variation (and variation_synonym and source) tables using the variation_id,
> there is no guarantee that the subsnp_id would correspond to the particular
> source you're looking at.
Thanks Pontus,
It seems that I will need to find and learn the dbSNP SQL schema ... but
probably in another life ;-)
Cheers
-Pablo
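
For anyone who wants to try the join Pontus describes (allele and subsnp_handle, linked through variation_id), here is a minimal sketch against a toy in-memory SQLite database. The table names come from his message; the column names, the rows and the helper name subsnps_for are assumptions invented for the demo, so check them against the real variation schema before relying on any of it.

```python
import sqlite3

# Toy stand-in for a few Ensembl variation tables. Table names follow the
# thread; the column names and every row below are assumptions for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variation (variation_id INTEGER, name TEXT);
CREATE TABLE allele (variation_id INTEGER, subsnp_id INTEGER, allele TEXT);
CREATE TABLE subsnp_handle (subsnp_id INTEGER, handle TEXT);
INSERT INTO variation VALUES (1, 'rs_demo');
INSERT INTO allele VALUES (1, 101, 'A'), (1, 101, 'G'), (1, 102, 'A');
INSERT INTO subsnp_handle VALUES (101, 'AFFY'), (102, 'OTHER_LAB');
""")


def subsnps_for(rs_name, handle):
    """Return subsnp_ids clustered under rs_name that were submitted by handle."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.subsnp_id
        FROM variation v
        JOIN allele a ON a.variation_id = v.variation_id
        JOIN subsnp_handle h ON h.subsnp_id = a.subsnp_id
        WHERE v.name = ? AND h.handle = ?
        ORDER BY a.subsnp_id
        """,
        (rs_name, handle),
    ).fetchall()
    return [row[0] for row in rows]


print(subsnps_for("rs_demo", "AFFY"))
```

Filtering on the handle only tells you who submitted a subsnp. It does not tie that subsnp to a particular chip or source, which is the caveat Pontus raises.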
>
> Cheers
> /Pontus
>
>
> On 08/02/2011 13:46, Pablo Marin-Garcia wrote:
>> On Tue, 8 Feb 2011, Fiona Cunningham wrote:
>>
>>> hi Pablo,
>>>
>>> We do not store deleted SS IDs from dbSNP. If you need this
>>> information you can search the NCBI using a URL with this format:
>>> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624
>>> regards,
>>>
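
The URL format Fiona gives is easy to generate from a script. A small sketch follows; the base URL is copied from her message as it stood in 2011, and the function name subsnp_url and the accepted input forms are my own assumptions rather than anything NCBI documents.

```python
import re

# Base URL copied from the link quoted in the thread.
BASE = "http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi"


def subsnp_url(ss_id):
    """Build the dbSNP lookup URL for an ss ID given as 'ss123', '123' or 123."""
    text = str(ss_id).strip()
    match = re.fullmatch(r"(?:ss)?(\d+)", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a subsnp ID: {ss_id!r}")
    return f"{BASE}?subsnp_id=ss{match.group(1)}"


print(subsnp_url("ss113307624"))
```

Rejecting anything that is not an ss ID (an rs ID, for example) keeps a typo from silently producing a query for the wrong record.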
>>
>> Thanks Fiona,
>>
>> one more question about this.
>>
>> If I have a chip that is in variation.source (like affy_500k), given an rs
>> on the chip, can I retrieve (with the API or SQL) the SS that they submitted
>> to dbSNP?
>>
>> -Pablo
>>
>>
>>> Fiona
>>>
>>> ------------------------------------------------------
>>> Fiona Cunningham
>>> Ensembl Variation Project Leader, EBI
>>> www.ensembl.org
>>> www.lrg-sequence.org
>>> t: 01223 494612 || e: fiona at ebi.ac.uk
>>>
>>>
>>>
>>> On 8 February 2011 11:21, Pontus Larsson <Pontus.Larsson at ebi.ac.uk> wrote:
>>>> In the variation_synonym table, we do store rsIds that have been merged
>>>> (e.g. 'rs67044803' that has been merged into 'rs67041202'), and you can get
>>>> the corresponding variations by e.g. searching on the website or querying
>>>> the API. However, we currently do not store data for rsIds or ssIds that
>>>> have been removed from the database.
>>>>
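
What Pontus describes leaves three possible outcomes for a lookup: the ID is current, it was merged into another rs, or it is simply not stored. A script could report which one applies instead of a bare 'not_found'. The sketch below uses plain Python dictionaries as hypothetical stand-ins for the variation and variation_synonym tables; the two rs IDs are the merge example from his message, and the names resolve, CURRENT and MERGED_INTO are made up for illustration.

```python
# Hypothetical in-memory stand-ins for the variation and variation_synonym
# tables; the two rs IDs are the merge example given in the thread.
CURRENT = {"rs67041202"}
MERGED_INTO = {"rs67044803": "rs67041202"}


def resolve(identifier):
    """Return (status, current_name) for an rs or ss identifier."""
    if identifier in CURRENT:
        return ("current", identifier)
    if identifier in MERGED_INTO:
        return ("merged", MERGED_INTO[identifier])
    # Removed IDs are not stored, so their absence cannot be explained here.
    return ("not_stored", None)


for name in ("rs67041202", "rs67044803", "ss113307624"):
    print(name, resolve(name))
```

A 'not_stored' result only says the ID is absent from this data. Finding out whether dbSNP withdrew it still means asking dbSNP.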
>>>> Cheers
>>>> /Pontus
>>>>
>>>>
>>>> On 08/02/2011 10:48, Pablo Marin-Garcia wrote:
>>>>>
>>>>> On Tue, 8 Feb 2011, Pontus Larsson wrote:
>>>>>
>>>>>> Hi Pablo,
>>>>>>
>>>>>> The validation status terms in the file on the ftp server were taken
>>>>>> directly from the dbSNP database, and the exact phrasing is not always
>>>>>> the same as what we use. I've updated the file to use the same terms
>>>>>> that you would find in Ensembl. I hope that clears up the confusion.
>>>>>>
>>>>>> In Ensembl 60, the variation database was built from dbSNP release 131
>>>>>> while the current database is built on dbSNP 132. Generally we don't have
>>>>>> access to archived versions of the dbSNP database, so I can't really tell
>>>>>> you why particular validation statuses have changed. For rs56, I was able
>>>>>> to access the dbSNP data for release 130 and one of the subsnps clustered
>>>>>> into this rs was indeed submitted by 1000Genomes (ss113307624). When you
>>>>>> search for this subsnp on the dbSNP website, it appears to have failed
>>>>>> quality controls
>>>>>> (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_retrieve.cgi?subsnp_id=ss113307624),
>>>>>> so that would be your explanation. For the other rsIds, I'd recommend that
>>>>>> you contact the dbSNP helpdesk directly to find out the details.
>>>>>>
>>>>>> Hope this helps
>>>>>> /Pontus
>>>>>
>>>>> Thanks a lot Pontus.
>>>>>
>>>>> Is Ensembl storing this data? If I ask Ensembl for an SS or SNP that has
>>>>> been removed, would I obtain only undef? If you don't have it now, do you
>>>>> plan to have this info in the future? The logic for having them would be
>>>>> the same as for the ensembl_qc failed SNPs, which are not removed from the
>>>>> database but are filtered out of queries by default unless explicitly
>>>>> requested.
>>>>>
>>>>> If I have an SS or RS from an old clinical study, I would like to be able
>>>>> to handle these situations in my scripts (log why this SNP or SS is no
>>>>> longer available instead of returning 'not_found'). If Ensembl does not
>>>>> store this but dbSNP still has it, does anyone know how to retrieve these
>>>>> cases with NCBI biotools/eUtils or a similar tool (I have not used them for
>>>>> several years)?
>>>>>
>>>>>
>>>>> -Pablo
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----
>>>>>
>>>>> Pablo Marin-Garcia
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> Pontus Larsson, Ph.D.
>>>> Ensembl Variation
>>>>
>>>> EMBL-EBI
>>>> Wellcome Trust Genome Campus
>>>> Hinxton, Cambridge, CB10 1SD
>>>> UK
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list
>>>> Dev at ensembl.org
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>
>>>
>>
>>
>> -----
>>
>> Pablo Marin-Garcia
>>
>>
>>
-----
Pablo Marin-Garcia