[ensembl-dev] unmapped/un-displayable SNP from dbsnp

Kim Brugger kim.brugger at easih.ac.uk
Fri Oct 8 16:37:53 BST 2010


On 08/10/10 16:23, Graham Ritchie wrote:
> Hi Kim,
>
> Hmm, this does seem to be an odd case. If you look at the dbSNP entry on this page:
>
> http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=+rs1053738
>
> it does appear to have 4 alleles. On the page you link, the "RefSNP Alleles" are listed as only A/G but, as Bert pointed out, the HGVS names are inconsistent with this.
>    
Actually one mRNA states G>{A,C,T} at one position, which is quite 
spectacular and clearly a bug.
> This SNP only had 2 alleles in dbSNP 130, and can be seen in ensembl version 58 here:
>
> http://may2010.archive.ensembl.org/Homo_sapiens/Variation/Summary?v=rs1053738;vdb=variation
>
> It is possible that dbSNP have since (partially) corrected the webpage, but when we did the last import (from dbSNP 131) it was reported as having 4 alleles.
>    
> Hopefully this will be resolved in the next release of dbSNP which will then filter through to ensembl (probably in release 62). We'll certainly take it up with them.
>    
So that will be a year or more from now? As this is now a major issue 
for my data analysis I will investigate further. I have a gut 
feeling that this is more than an isolated case.
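
As a starting point for checking a larger set, the allele-count rule described further down the thread (fail anything with more than 3 alleles) could be sketched roughly as below. This is only an illustration: the function names, the slash-separated allele string format and the MAX_ALLELES constant are my assumptions, not the actual Ensembl import code.

```python
# Hypothetical sketch of the "more than 3 alleles" check; not Ensembl code.

MAX_ALLELES = 3  # threshold as quoted in the thread (assumed constant name)


def count_alleles(allele_string):
    """Count distinct alleles in a slash-separated string such as 'A/G'."""
    alleles = {a.strip().upper() for a in allele_string.split("/") if a.strip()}
    return len(alleles)


def would_fail(allele_string, max_alleles=MAX_ALLELES):
    """Return True if the variation has more alleles than the threshold."""
    return count_alleles(allele_string) > max_alleles


if __name__ == "__main__":
    # Report the allele count and the pass/fail decision for a few examples.
    for allele_string in ("A/G", "A/C/G", "A/C/G/T"):
        status = "fail" if would_fail(allele_string) else "keep"
        print(allele_string, count_alleles(allele_string), status)
```

Under this rule a triallelic SNP is kept and only a variation reported with all four alleles is failed, which matches what Graham describes.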

Cheers,

Kim

> Cheers,
>
> Graham
>
>
> On 8 Oct 2010, at 15:54, Kim Brugger wrote:
>
>    
>> Hi
>>
>> If you look at the dbSNP page for this SNP, it lists only two alleles (A/G), so it looks like the counting of alleles is faulty. Furthermore, the SNP is represented in the 1000 Genomes data and in other datasets I deem trustworthy.
>>
>> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=1053738
>>
>> Thanks for explaining how/why this filtering is done.
>>
>> Cheers,
>>
>> Kim
>>
>> On 08/10/10 14:14, Graham Ritchie wrote:
>>      
>>> Hi Kim,
>>>
>>> This SNP has *more than* 3 alleles, and we have taken the decision to fail all such SNPs. We debated this decision internally recently, and Paul concluded as follows:
>>>
>>> "These are still far, far more likely to be errors than real.  While some probably exist, true SNPs with all four alleles require very complex selection pressures to remain in the population and so this number is simply never likely to grow to "many SNPs."  In fact, the word quadallelic does not return any results in Pubmed.
>>>
>>> This does not mean that it will never happen, only that it is very, very rare.  Note that we don't fail triallelic SNPs, which are also rare and enriched for error."
>>>
>>> Hope this makes sense. If you have examples of SNPs that don't appear for other reasons then please let us know. We do track all SNPs we fail, and the reason for doing so, in the failed_variation table of the variation database.
>>>
>>> Cheers,
>>>
>>> Graham
>>>
>>>
>>> On 8 Oct 2010, at 13:54, Kim Brugger wrote:
>>>
>>>
>>>        
>>>> Hi
>>>>
>>>> I am looking for the rs1053738 SNP. When I search for it on the Ensembl website it is found and exists with 2 synonyms, but if I try to display it I am told it was not mapped as the variation has 3 alleles.
>>>>
>>>> The SNP should be located at 3:124951820-124951821. I have a large set of SNPs that I cannot find either on the Ensembl website or using the API.
>>>>
>>>> Cheers,
>>>>
>>>> Kim
>>>>
>>>> -- 
>>>> ==========================================================
>>>> Kim Brugger
>>>> EASIH, University of Cambridge
>>>> www.easih.ac.uk
>>>> ==========================================================
>>>>
>>>>
>>>>
>>>>          
>>>
>>>        
>>
>> -- 
>> ==========================================================
>> Kim Brugger
>> EASIH, University of Cambridge
>> www.easih.ac.uk
>> ==========================================================
>>      
>    


-- 
==========================================================
Kim Brugger
EASIH, University of Cambridge
www.easih.ac.uk
==========================================================




