[ensembl-dev] List of chromosomes from MySQL

Javier Herrero jherrero at ebi.ac.uk
Thu Jun 7 12:31:39 BST 2012


Hi Toni

One way to get the list of all the chromosomes (especially if you want 
to work with different genomes) is to query the compara database.

select dnafrag.name from genome_db join dnafrag using (genome_db_id) 
where coord_system_name = "chromosome" and genome_db.name = 
"homo_sapiens" and is_reference = 1;

Natural sorting is not straightforward in MySQL. One workaround in this 
case is to first get the chromosomes whose names start with a digit, 
sorted numerically, and then run a second query for the other chromosomes:

select dnafrag.name from genome_db join dnafrag using (genome_db_id) 
where coord_system_name = "chromosome" and genome_db.name = 
"homo_sapiens" and is_reference = 1 and dnafrag.name rlike "^[0-9]" 
order by (dnafrag.name + 0), dnafrag.name;

select dnafrag.name from genome_db join dnafrag using (genome_db_id) 
where coord_system_name = "chromosome" and genome_db.name = 
"homo_sapiens" and is_reference = 1 and dnafrag.name not rlike "^[0-9]" 
order by (dnafrag.name);

If you prefer to have the MT chromosome at the end, you could also sort 
by the length of the chromosome name, but that won't necessarily give 
you the expected result for all the genomes.
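
To illustrate that caveat, here is a small sketch in Python. The name 
lists are typed in by hand for the example, not fetched from the 
database, and the second list is a made-up case; by_length is just a 
helper name I chose here.

```python
def by_length(names):
    """Sort names by length first, then alphabetically within each length."""
    return sorted(names, key=lambda name: (len(name), name))

# Human-like non-numeric names: MT ends up last, as wanted.
print(by_length(["MT", "X", "Y"]))

# Hypothetical genome with another two-letter name: MT is no longer last.
print(by_length(["MT", "Un", "W", "Z"]))
```

In the second case "Un" has the same length as "MT" and sorts after it, 
so the length trick alone does not put MT at the end.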

As a general rule, you would be better off getting all the chromosome 
names with the SQL query and sorting them afterwards in your preferred 
scripting language.
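
As a rough sketch of that approach in Python, assuming the names have 
already been fetched with the first query above: the list below is 
typed in by hand for illustration, chromosome_sort_key is a helper name 
I made up, and treating "MT" as the only mitochondrial name is an 
assumption that will not hold for every genome.

```python
import re

def chromosome_sort_key(name):
    """Key that puts numbered chromosomes first, then the rest, then MT."""
    match = re.match(r"(\d+)(.*)", name)
    if match:
        # Leading number compared as an integer; any suffix ("A", "B") breaks ties.
        return (0, int(match.group(1)), match.group(2))
    if name == "MT":
        # Assumption: the mitochondrial chromosome is called "MT".
        return (2, 0, name)
    # Everything else (X, Y, W, Z, ...) in alphabetical order.
    return (1, 0, name)

# Illustrative input only; in practice this comes from the SQL query.
names = ["10", "MT", "X", "2B", "1", "Y", "2A", "11", "2"]
ordered = sorted(names, key=chromosome_sort_key)
print(ordered)
```

The key is a tuple, so the group (numbered, other, MT) is compared 
first and the numeric value only matters within the numbered group. You 
can adapt the rules per species without touching the SQL.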

Kind regards

Javier

On 07/06/12 12:04, Toni Hermoso Pulido wrote:
> Hello,
>
> sorry if a silly question. I was wondering whether there is any way to
> get a list of chromosomes for every species (with their canonical
> order, if not numbers, and naming as in karyotypes*) from ENSEMBL
> MySQL DB?
> I played a bit with coord_system and karyotype tables, but I don't
> have a clear idea, for instance, for omitting haplotypes and getting
> an actual list.
>
> Thanks in advance,
>
> * Like here: http://www.ensembl.org/Homo_sapiens/Location/Genome
>

-- 
Javier Herrero, PhD
Ensembl Coordinator and Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK




