[ensembl-dev] affy_hg_u133_plus_2 to ensg mappings

Nathan Johnson njohnson at ebi.ac.uk
Mon Jun 17 21:39:32 BST 2013


Hi Oliver

This isn't being considered as a transcript xref because it is on the wrong strand. This is an easy mistake to make, as many of the array technologies differ in how they process the RNA sample and hence in which strand is actually hybridised when it eventually meets the array.
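
To make the strand point concrete, here is a minimal Python sketch. It is not the actual Ensembl mapping code, only an illustration under stated assumptions: the class and function names are hypothetical, the exon-overlap test is deliberately simplified, and the only rule taken from the documentation is that more than 50% of the probes in a probe set should hit a given transcript. The extra condition shown is that a probe on the opposite strand does not count as a hit at all.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ProbeHit:
    """One genomic alignment of a probe (hypothetical structure)."""
    probe_id: str
    start: int
    end: int
    strand: int  # +1 or -1


@dataclass(frozen=True)
class Transcript:
    """A transcript reduced to its strand and exon coordinates (hypothetical)."""
    stable_id: str
    strand: int  # +1 or -1
    exons: Tuple[Tuple[int, int], ...]  # (start, end) pairs, inclusive


def probe_hits_transcript(hit: ProbeHit, transcript: Transcript) -> bool:
    """Count a probe only if it is on the transcript's strand and overlaps an exon."""
    if hit.strand != transcript.strand:
        return False  # wrong strand: ignored however good the overlap is
    return any(hit.start <= exon_end and hit.end >= exon_start
               for exon_start, exon_end in transcript.exons)


def probeset_maps(hits: Iterable[ProbeHit], transcript: Transcript,
                  probeset_size: int, min_fraction: float = 0.5) -> bool:
    """Apply the 'more than 50% of probes hit the transcript' rule."""
    matched = {h.probe_id for h in hits if probe_hits_transcript(h, transcript)}
    return len(matched) / probeset_size > min_fraction
```

With made-up numbers, six of ten probes overlapping the exons of a forward-strand transcript would pass the 50% rule if the alignments were on the forward strand, but none of them would count if the same alignments were on the reverse strand, so no xref would be created.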

There is a diagram of the IVT processing on this page:

http://www.affymetrix.com/estore/browse/products.jsp?categoryIdClicked=&productId=131415#1_1

Having said that, this particular set of alignments does look like it was designed for the exons of that gene, albeit with some exon boundary overlap. However, IVT arrays normally target 3' ends and UTRs specifically, which makes this particular probeset even more odd.

Sorry I can't be of more help.

Nathan



On 17 Jun 2013, at 15:58, Oliver Burren <oliver.burren at cimr.cam.ac.uk> wrote:

> Hi,
> 
> I'm trying to retrieve all probeset ID mappings to Ensembl genes for [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL570) using ensmart 71. However, I noticed a large drop-out with respect to the GEO annotation file, so I did some digging...
> 
> 
> If I look in BioMart for something like this:
> 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
> 			
> 	<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
> 		<Filter name = "affy_hg_u133_plus_2" value = "205332_at"/>
> 		<Attribute name = "ensembl_gene_id" />
> 		<Attribute name = "ensembl_transcript_id" />
> 	</Dataset>
> </Query>
> 
> I get no results. However, if I search the website for 205332_at and turn on the track for AFFY:HG-U133_Plus_2, it shows that the probeset (6 features) maps to the gene. The help on this page http://www.ensembl.org/info/docs/microarray_probe_set_mapping.html says 'it is normally required that more than 50% of the probes in a probe set hit a given transcript sequence'. Is this the reason why this probeset isn't being tagged to this gene (although this appears to be 60%)?
> 
> Any light that you could shed would be appreciated. Thanks,
> 
> Olly Burren
> 
> 
