[ensembl-dev] Missing IDS in ENSEMBL database

Thibaut Hourlier th3 at sanger.ac.uk
Fri May 25 13:34:09 BST 2012


Hi Duarte,
I apologize for the misspelling of the method.
It should work with the transcript adaptor. Otherwise, as you said, using the gene adaptor and looping on the transcripts should work.
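
A minimal sketch of the gene-adaptor route, for illustration only. It assumes the public Ensembl MySQL server (ensembldb.ensembl.org, anonymous user) and uses the HGNC symbol NDUFB6 from this thread; the connection settings and the printed columns are assumptions, not taken from the original script:

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumed connection settings: the public Ensembl database server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $gene_adaptor = $registry->get_adaptor('Human', 'Core', 'Gene');

# Look the gene up by its HGNC symbol, then walk its transcripts and exons.
foreach my $gene ( @{ $gene_adaptor->fetch_all_by_external_name('NDUFB6') } ) {
    foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
        foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
            print join("\t",
                $transcript->stable_id,
                $exon->seq_region_name,
                $exon->start,
                $exon->end,
                $exon->strand), "\n";
        }
    }
}
```

This prints the exons of every transcript of the gene, so the output may need filtering if only one specific transcript is wanted.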

Regards
Thibaut

On 25 May 2012, at 12:40, Duarte Molha wrote:

> Ok... 
> 
> But that was the method I was already using in my code.
> 
> So what Thibaut was suggesting is that I use the method call on a gene adaptor to retrieve the gene and then use that gene to retrieve its transcripts?
> 
> Best regards
> 
> Duarte
> 
> 
> -----Original Message-----
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Carlos Garcia Giron
> Sent: 25 May 2012 12:36
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] Missing IDS in ENSEMBL database
> 
> Dear Duarte,
> 
> The method is called "fetch_all_by_external_name" and it can be found for:
> 
> Bio::EnsEMBL::DBSQL::GeneAdaptor::fetch_all_by_external_name()
> Bio::EnsEMBL::DBSQL::TranscriptAdaptor::fetch_all_by_external_name()
> Bio::EnsEMBL::DBSQL::TranslationAdaptor::fetch_all_by_external_name()
> 
> I hope it helps.
> 
> Kind regards,
> Carlos
> 
> Duarte Molha wrote:
>> 
>> Dear Thibaut Hourlier
>> 
>> I was searching the doxygen documentation for that method call you 
>> indicated but do not seem to be able to find it.
>> 
>> Is it a method call for a transcript adaptor?
>> 
>> Best regards
>> 
>> Duarte
>> 
>> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Duarte Molha
>> Sent: 25 May 2012 12:02
>> To: Ensembl developers list
>> Subject: Re: [ensembl-dev] Missing IDS in ENSEMBL database
>> 
>> Thanks Thibaut Hourlier
>> 
>> I just get very confused with all these IDS with the same format 
>> meaning different things!
>> 
>> Best regards
>> 
>> Duarte
>> 
>> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Thibaut Hourlier
>> Sent: 25 May 2012 11:20
>> To: Ensembl developers list
>> Subject: Re: [ensembl-dev] Missing IDS in ENSEMBL database
>> 
>> Dear Duarte,
>> 
>> I went through the first four of your IDs:
>> 
>> On 25 May 2012, at 10:14, Duarte Molha wrote:
>> 
>> Dear Developers
>> 
>> I created a simple script to output the exons of specific transcripts 
>> with NM ids.
>> 
>> It works fine for all but a small list of IDs. The large majority of 
>> the failed IDs have been suppressed from NCBI because they were found 
>> to be a "nonsense-mediated mRNA decay (NMD) candidate", so I do not 
>> mind eliminating those records from my query.
>> 
>> However, some of the ones that fail are in the NCBI database and for 
>> some reason Ensembl is not able to query them:
>> 
>> NM_001040409.1
>> 
>> It is an NMD transcript.
>> 
>> NM_001167607.1
>> 
>> It is an exon supporting feature and not a transcript supporting 
>> feature. I think this is the reason you don't get it with your script.
>> 
>> http://www.ensembl.org/Homo_sapiens/Transcript/SupportingEvidence?db=core;g=ENSG00000196743;r=5:150591711-150650001;t=ENST00000523466
>> 
>> NM_001199987.1
>> 
>> If you look in the Gene database at NCBI you will see that there are 2 
>> other sequences for NDUFB6, which are the transcript supporting 
>> features for the 2 transcripts in Ensembl for the gene.
>> 
>> http://www.ncbi.nlm.nih.gov/gene/?term=NM_001199987.1
>> 
>> http://www.ensembl.org/Homo_sapiens/Transcript/SupportingEvidence?db=core;g=ENSG00000165264;r=9:32552997-32573160;t=ENST00000379847
>> 
>> NM_001204090.1
>> 
>> Same problem as above: we did not use this sequence.
>> 
>> What you can do is to use the HGNC identifier of these failing IDs in 
>> the fetch_all_by_external_id method, i.e. NM_001199987.1 -> NDUFB6
>> 
>> Regards
>> 
>> Thibaut
>> 
>> NM_001242881.1
>> 
>> NM_014249.2
>> 
>> NM_015584.3
>> 
>> NM_024728.2
>> 
>> Can you tell me how to retrieve these from the database?
>> 
>> Here is the portion of my script I use to retrieve the data:
>> 
>> foreach my $query_transcript (@transcripts_of_interest) {
>>     chomp $query_transcript;
>>     my $transcript;
>> 
>>     if ($query_transcript =~ /ENST/i) {
>>         # Ensembl stable IDs can be fetched directly
>>         $transcript = $transcript_adaptor->fetch_by_stable_id($query_transcript);
>>     }
>>     else {
>>         # Other IDs (e.g. NM ids) are looked up as external names;
>>         # only the first match is kept
>>         ($transcript) = @{ $transcript_adaptor->fetch_all_by_external_name($query_transcript) };
>>     }
>> 
>>     unless ($transcript) {
>>         $progress->message("Query: $query_transcript failed");
>>         next;
>>     }
>> 
>>     # Print one line per exon of the transcript
>>     foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
>>         my $estring = feature2string($exon);
>>         print "$query_transcript:\t$estring\n";
>>     }
>> 
>>     $next_update = $progress->update() if (++$j > $next_update);
>> }
>> 
>> Best regards
>> 
>> Duarte Molha
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>> 
> 
> 




