[ensembl-dev] Almost no annotation with v61 EFG Array Mapping

Nathan Johnson njohnson at ebi.ac.uk
Thu Mar 10 13:48:45 GMT 2011


Gavin

To understand what the unmapped objects are, you need to look at them in the database: see the unmapped_object and unmapped_reason tables for details.
Alternatively, look in the probe2transcript output, as described in the array_mapping.txt documentation.
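
As a rough illustration of the database check, the sketch below tallies unmapped objects by reason. The table names are the ones mentioned above; the column names (unmapped_reason_id, summary_description, identifier) are assumptions and should be checked against your funcgen schema. The helper demo_connection and its rows are hypothetical, there only so the sketch runs without a database server.

```python
import sqlite3

# Tally unmapped objects per reason, most frequent first.
# Column names are assumed, not confirmed against the funcgen schema.
REASON_TALLY_SQL = """
SELECT ur.summary_description AS reason, COUNT(*) AS n
FROM unmapped_object uo
JOIN unmapped_reason ur ON ur.unmapped_reason_id = uo.unmapped_reason_id
GROUP BY ur.unmapped_reason_id, ur.summary_description
ORDER BY n DESC
"""


def tally_unmapped_reasons(conn):
    """Return [(reason, count), ...] using any DB-API style connection."""
    cur = conn.cursor()
    cur.execute(REASON_TALLY_SQL)
    return cur.fetchall()


def demo_connection():
    """Hypothetical toy database standing in for a real funcgen database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE unmapped_reason (
            unmapped_reason_id INTEGER PRIMARY KEY,
            summary_description TEXT
        );
        CREATE TABLE unmapped_object (
            unmapped_object_id INTEGER PRIMARY KEY,
            identifier TEXT,
            unmapped_reason_id INTEGER
        );
        INSERT INTO unmapped_reason VALUES
            (1, 'hypothetical reason A'),
            (2, 'hypothetical reason B');
        INSERT INTO unmapped_object (identifier, unmapped_reason_id) VALUES
            ('ps1', 1), ('ps2', 1), ('ps3', 2);
        """
    )
    return conn


if __name__ == "__main__":
    for reason, n in tally_unmapped_reasons(demo_connection()):
        print(f"{n:>8}  {reason}")
```

If nearly all rows fall under a single reason, that reason is probably the place to start looking.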

Nath




On 10 Mar 2011, at 13:34, Oliver, Gavin wrote:

> Any ideas on this, Nathan? Do so many unmapped objects suggest a problem with the reference data, i.e. my funcgen database?
>  
>  
>  
> From: Oliver, Gavin 
> Sent: 09 March 2011 08:26
> To: 'Nathan Johnson'
> Cc: dev
> Subject: RE: [ensembl-dev] Almost no annotation with v61 EFG Array Mapping
>  
> Hi Nathan,
>  
> Most seem to be unmapped (see the stats below).
>  
> Monitor looks fine.
>  
> Any ideas?  Undoubtedly it will be something simple I’m overlooking!
>  
>  
> ::      Logging probesets that don't map to any transcripts - Tue Mar  8 17:25:20 2011
> ::      Updating 0 promiscuous probesets - Tue Mar  8 17:38:45 2011
> ::      UnmappedObjects loaded:                 598994
> ::      ADXBRCv2a520413 distinct ProbeSet xrefs mapped(total xrefs):    1/60856(1)
> ::      ADXOCv1a520630 distinct ProbeSet xrefs mapped(total xrefs):     4/120373(7)
> ::      ADXECv1a520743 distinct ProbeSet xrefs mapped(total xrefs):     4/111012(7)
> ::      ADXLCv1a520538 distinct ProbeSet xrefs mapped(total xrefs):     4/60416(10)
> ::      ADXCRCG2a520319 distinct ProbeSet xrefs mapped(total xrefs):    2/61528(2)
> ::      ADXPCv1a520642 distinct ProbeSet xrefs mapped(total xrefs):     2/121563(2)
> ::      HG-U133_Plus_2 distinct ProbeSet xrefs mapped(total xrefs):     1/54675(4)
> ::      Mapped 9/167074 transcripts  - Tue Mar  8 17:38:57 2011
>  
> From: Nathan Johnson [mailto:njohnson at ebi.ac.uk] 
> Sent: 03 March 2011 11:09
> To: Oliver, Gavin
> Cc: dev
> Subject: Re: [ensembl-dev] Almost no annotation with v61 EFG Array Mapping
>  
> Hi Gavin
>  
> Have you checked the unmapped objects? These should give you a clue.
>  
> Also, you might want to check the pipeline output, i.e. 'monitor' and the ProbeAlign reports. This should show you exactly what features you are trying to xref.
>  
> Nathan Johnson
> Scientific Programmer
> European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> Email: njohnson at ebi.ac.uk
> TelNo: (+44)1223 492629
>  
>  
>  
> On 2 Mar 2011, at 12:31, Oliver, Gavin wrote:
>  
> 
> Hi,
>  
> I have rerun my array annotations with v61. Everything seemed fine until I looked at the log and saw that only about 2 probesets per array had actually mapped to a transcript.
> Can you think of something obvious that might have gone wrong here? The log output is below.
>  
> Best,
>  
> Gavin
>  
>  
> (Nathan, I’m posting here in case you didn’t receive my mail.)
>  
>  
> ::      ::      Running on probe2transcript.pl on: ni-cr-svc-slv.phar-services.com   ::      :: - Mon Feb 21 09:01:02 2011
>  
> ::      Params are:     --species homo_sapiens --transcript_dbname homo_sapiens_core_61_37f --transcript_host ni-cr-svc-bs3 --transcript_port 3306 --transcript_user root --transcript_pass pass --xref_host ni-cr-svc-bs3 --xref_dbname homo_sapiens_funcgen_61_37f --xref_user root --xref_pass pass --unannotated_5_utr 0 --unannotated_3_utr 0 -annotated_5_prime_extend 0 -annotated_3_prime_extend 0 --threshold 0.000005 --mismatches 2 -vendor AFFY -format AFFY_UTR -arrays ADXOCv1a520630 ADXBRCv2a520413 ADXECv1a520743 ADXLCv1a520538 ADXCRCG2a520319 HG-U133_Plus_2 ADXPCv1a520642 -import_edb
> ::      No probe DB params specified, defaulting to xref params
>  
>  
> ::      ::      Checking existing Xrefs ::      ::
>  
> ::      You have specified -utr_multiplier and a -3|5_prime_extend, -3|5_prime_extend will override where appropriate
> ::      Setting 3 unannotated UTR length to 0
> ::      Setting 5 unannotated UTR length to 0
> ::      Identified 167074 transcripts for probe mapping
> ::      Allowed mismatches = 2
> ::      Caching arrays per ProbeSet - Mon Feb 21 09:01:21 2011
> ::      Performing overlap analysis. % Complete:
> ::      0 ::    1 ::    2 ::    3 ::    4 ::    5 ::    6 ::    7 ::    8 ::    9 ::    10 ::   11 ::   12 ::   13 ::   14 ::   15 ::   16 ::   17 ::   18 ::   19 ::   20 ::   21 ::   22 ::   23 ::   24 ::   25 ::   26 ::   27 ::   28 ::   29 ::   30 ::   31 ::   32 ::   33 ::   34 ::   35 ::   36 ::   37 ::   38 ::   39 ::   40 ::   41 ::   42 ::   43 ::   44 ::   45 ::   46 ::   47 ::   48 ::   49 ::   50 ::   51 ::   52 ::   53 ::   54 ::   55 ::   56 ::   57 ::   58 ::   59 ::   60 ::   61 ::   62 ::   63 ::   64 ::   65 ::   66 ::   67 ::   68 ::   69 ::   70 ::   71 ::   72 ::   73 ::   74 ::   75 ::   76 ::   77 ::   78 ::   79 ::   80 ::   81 ::   82 ::   83 ::   84 ::   85 ::   86 ::   87 ::   88 ::   89 ::   90 ::   91 ::   92 ::   93 ::   94 ::   95 ::   96 ::   97 ::   98 ::   99 ::
> ::      Failed to extend 1 transcripts
> ::      Seen 0 5 prime UTRs with and average length of 0
> ::      Seen 0 3 prime UTRs with and average length of 0
>  
>  
> ::      ::      Writing ADXOCv1a520630 ADXBRCv2a520413 ADXECv1a520743 ADXLCv1a520538 ADXCRCG2a520319 HG-U133_Plus_2 ADXPCv1a520642 Xrefs        ::      :: - Mon Feb 21 18:26:37 2011
>  
> ::      Logging probesets that don't map to any transcripts - Mon Feb 21 18:26:37 2011
> ::      Updating 0 promiscuous probesets - Mon Feb 21 18:40:28 2011
> ::      UnmappedObjects loaded:                 598994
> ::      ADXBRCv2a520413 distinct ProbeSet xrefs mapped(total xrefs):    1/60856(1)
> ::      ADXOCv1a520630 distinct ProbeSet xrefs mapped(total xrefs):     4/120373(7)
> ::      ADXECv1a520743 distinct ProbeSet xrefs mapped(total xrefs):     4/111012(7)
> ::      ADXLCv1a520538 distinct ProbeSet xrefs mapped(total xrefs):     4/60416(10)
> ::      ADXCRCG2a520319 distinct ProbeSet xrefs mapped(total xrefs):    2/61528(2)
> ::      HG-U133_Plus_2 distinct ProbeSet xrefs mapped(total xrefs):     1/54675(4)
> ::      ADXPCv1a520642 distinct ProbeSet xrefs mapped(total xrefs):     2/121563(2)
> ::      Mapped 9/167074 transcripts  - Mon Feb 21 18:40:39 2011
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
>  
>  
>  
>  
>  
