[ensembl-dev] getting exons from database directly
Andrea Edwards
edwardsa at cs.man.ac.uk
Fri May 6 16:22:12 BST 2011
I tried two ways:
===============================================
my $gene_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Gene' );
my $genes = $gene_adaptor->fetch_all();
my $exon_adaptor = $registry->get_adaptor( 'bos_taurus', 'Core', 'Exon' );
$total_genes=0;
$exon_count = 0;
foreach $gene (@{$genes}) {
    $total_genes++;
    foreach $exon ($gene->get_all_Exons()) {
        $exon_count++;
    }
} #end for each gene
=============================================
This way gave even fewer (23k), but I'm being stricter here about the
chromosomes:
@slices = @{ $slice_adaptor->fetch_all('chromosome', undef, 0, 1) };
$total_genes=0;
$exon_count = 0;
foreach $slice (@slices) {
    unless ($slice->seq_region_name() =~ /Un/) {
        print $slice->seq_region_name."\n";
        my $genes = $gene_adaptor->fetch_all_by_Slice($slice);
        foreach my $gene (@{$genes}) {
            $total_genes++;
            foreach my $exon ($gene->get_all_Exons()) {
                $exon_count++;
                print "$exon_count\n";
            }
        } #end for each gene
    }
}
==============================================
But neither gives anything like the SQL results.
Why does the SQL give so many more? Which should I use?
Thank you
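
One possible explanation for the low API counts, offered as a sketch rather than a confirmed diagnosis: in the Ensembl Perl API, get_all_Exons() returns a reference to an array, not a list. Looping over the return value without dereferencing it runs the loop body once per gene, so the exon counter ends up equal to the gene counter. The snippet below shows the effect in plain Perl. The sub get_all_items is a hypothetical stand-in for a method that returns an array reference, so the snippet needs neither an Ensembl installation nor a database connection.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that returns an array reference,
# which is how Gene::get_all_Exons() behaves (assumption for illustration).
sub get_all_items {
    return [ 'exon_a', 'exon_b', 'exon_c' ];
}

# Without dereferencing: the loop sees a single scalar (the reference itself).
my $without_deref = 0;
foreach my $item ( get_all_items() ) {
    $without_deref++;
}

# With dereferencing: the loop sees each element of the array.
my $with_deref = 0;
foreach my $item ( @{ get_all_items() } ) {
    $with_deref++;
}

print "without dereference: $without_deref\n";
print "with dereference:    $with_deref\n";
```

Run as-is, this prints 1 for the first loop and 3 for the second. Applied to the loops above, the inner foreach would be written as foreach my $exon (@{ $gene->get_all_Exons() }), which may account for much of the gap between the API and SQL figures.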
On 06/05/11 15:50, Bert Overduin wrote:
> Hi Andrea,
>
> I suspect that your BioMart results are truncated because the query is
> too large.
>
> However, that doesn't explain your API results. What does your API
> code look like?
>
> Cheers,
> Bert
>
> On Fri, May 6, 2011 at 3:45 PM, Andrea Edwards
> <edwardsa at cs.man.ac.uk> wrote:
>
> Hello
>
> I'm sorry for the basic question, but I was looking at the Ensembl
> core schema and trying to retrieve just the exons on chromosomes,
> and couldn't work out why I am getting such different figures from
> those given by BioMart and the Perl API.
>
> For example, for cow there are 25670 exons in genes according to
> BioMart and the API, but ~210k exons with this SQL. The query only
> looks for exons on chromosomes 1-30 and X:
>
> select count(distinct stable_id) from exon e inner join
> exon_stable_id es using(exon_id) inner join seq_region sr
> using(seq_region_id) where sr.coord_system_id = 2 and sr.name
> REGEXP '^[1-9]|^X' and e.is_current=1
>
> I get 8k just on chromosome 1.
>
> I'm sure this is simple, and perhaps it's because it's Friday
> afternoon, but I'm just not seeing it!
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>
>
>
> --
> Bert Overduin, Ph.D.
> Vertebrate Genomics Team
>
> EMBL - European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge CB10 1SD
> United Kingdom
>
> http://www.ebi.ac.uk/~bert
>
> Ensembl browser: http://www.ensembl.org
>
> Mailing lists: http://www.ensembl.org/info/about/contact/mailing.html
>
> Blog: http://www.ensembl.info
>
> YouTube: http://www.youtube.com/user/EnsemblHelpdesk
> Facebook: http://www.facebook.com/Ensembl.org
> Twitter: http://twitter.com/Ensembl
>
>