[ensembl-dev] getting exons from database directly

Andrea Edwards edwardsa at cs.man.ac.uk
Fri May 6 16:22:12 BST 2011


I tried two ways:

===============================================

my $gene_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Gene' );
my $genes = $gene_adaptor->fetch_all();

my $exon_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Exon' );
$total_genes=0;
$exon_count = 0;
foreach $gene(@{$genes}) {
     $total_genes++;

     foreach $exon ($gene->get_all_Exons()) {
       $exon_count++;
     }
} #end for each gene


=============================================

This way gave even fewer (23k), but I'm being stricter here about the 
chromosomes:

@slices = @{ $slice_adaptor->fetch_all('chromosome', undef, 0, 1) };

$total_genes=0;
$exon_count = 0;
foreach $slice (@slices) {
     unless ($slice->seq_region_name() =~ /Un/) {
         print $slice->seq_region_name."\n";
         my $genes = $gene_adaptor->fetch_all_by_Slice($slice);


         foreach my $gene(@{$genes}) {
             $total_genes++;

             foreach my $exon ($gene->get_all_Exons()) {
                   $exon_count++;
                   print "$exon_count\n";
             }




         } #end for each gene
     }
}

==============================================

But neither gives anything like the SQL results.

Why does the SQL give so many more? Which should I use?

thank you
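
One thing worth checking in both loops above: in the Ensembl Perl API, get_all_Exons() returns a reference to an array of exons rather than a list, so a foreach over the bare return value would run once per gene, not once per exon. That could be why the API totals look more like gene counts, though it is a guess and not a confirmed diagnosis. The sketch below reproduces the effect with a stand-in sub, so no database is needed; get_all_exons_stub and the exon names are hypothetical and not part of the Ensembl API.

```perl
use strict;
use warnings;

# Stand-in for $gene->get_all_Exons(): like that method, it returns a
# REFERENCE to an array. The sub name and exon names are made up.
sub get_all_exons_stub {
    return [ 'exon_a', 'exon_b', 'exon_c' ];
}

# Looping over the return value directly: the list holds a single element
# (the array reference), so the body runs once per call.
my $count_without_deref = 0;
foreach my $exon ( get_all_exons_stub() ) {
    $count_without_deref++;
}

# Dereferencing first: the body runs once per exon.
my $count_with_deref = 0;
foreach my $exon ( @{ get_all_exons_stub() } ) {
    $count_with_deref++;
}

print "without dereference: $count_without_deref\n";
print "with dereference:    $count_with_deref\n";
```

If that is the cause here, writing the inner loops as foreach my $exon (@{ $gene->get_all_Exons() }) should count individual exons.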


On 06/05/11 15:50, Bert Overduin wrote:
> Hi Andrea,
>
> I suspect that your BioMart results are truncated because the query is 
> too large.
>
> However, that doesn't explain your API results ... What does your API 
> code look like?
>
> Cheers,
> Bert
>
> On Fri, May 6, 2011 at 3:45 PM, Andrea Edwards
> <edwardsa at cs.man.ac.uk> wrote:
>
>     Hello
>
>     I'm sorry for the basic question, but I was looking at the Ensembl
>     core schema and trying to retrieve just the exons on chromosomes,
>     and couldn't work out why I am getting such different figures from
>     those I get with BioMart and the Perl API.
>
>     For example, for cow there are 25670 exons in genes with BioMart
>     and the API, but ~210k exons with this SQL. This query is just
>     looking for exons on chromosomes 1-30 and X:
>
>     select count(distinct stable_id) from exon e inner join
>     exon_stable_id es using(exon_id) inner join seq_region sr
>     using(seq_region_id) where sr.coord_system_id = 2 and sr.name
>     REGEXP '^[1-9]|^X' and e.is_current=1
>
>     I get 8k just on chromosome 1
>
>     I'm sure this is simple and perhaps it's because it's Friday
>     afternoon, but I'm just not seeing it!!
>
> -- 
> Bert Overduin, Ph.D.
> Vertebrate Genomics Team
>
> EMBL - European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge CB10 1SD
> United Kingdom
