[ensembl-dev] List of chromosomes from MySQL

Toni Hermoso Pulido toni.hermoso at crg.cat
Thu Jun 7 14:37:15 BST 2012


Thanks!
Yes, I see I can get it from here:
http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/public-plugins/ensembl/conf/ini-files/?root=ensembl

2012/6/7 Anne Parker <ap5 at sanger.ac.uk>:
>
> On 7 Jun 2012, at 14:02, Toni Hermoso Pulido wrote:
>
>> Thanks Javier,
>>
>> since there are certain chromosomes like this:
>> http://www.ensembl.org/Saccharomyces_cerevisiae/Location/Genome
>> and to a certain extent this:
>> http://www.ensembl.org/Drosophila_melanogaster/Location/Genome
>> with Roman numerals or extra letters, if no sorting index is kept
>> anywhere, I guess I should keep a derived manual order list somewhere
>> else.
>> But maybe, since the website generates the interactive karyotype
>> image... is that info in the code of the ENSEMBL website?
>
> That's correct - the .ini file for each species includes an arrayref of "drawable" chromosomes.
>
>>
>> 2012/6/7 Javier Herrero <jherrero at ebi.ac.uk>:
>>> Hi Toni
>>>
>>> One way to get the list of all the chromosomes (especially if you want to
>>> work with different genomes) is to query the compara database.
>>>
>>> select dnafrag.name from genome_db join dnafrag using (genome_db_id) where
>>> coord_system_name = "chromosome" and genome_db.name = "homo_sapiens" and
>>> is_reference = 1;
>>>
>>> Natural sorting in MySQL is not quite straightforward. One way around
>>> this is to get the chromosomes whose names start with a digit first and
>>> then run another query for the other chromosomes:
>>>
>>> select dnafrag.name from genome_db join dnafrag using (genome_db_id) where
>>> coord_system_name = "chromosome" and genome_db.name = "homo_sapiens" and
>>> is_reference = 1 and dnafrag.name rlike "^[0-9]" order by (dnafrag.name + 0);
>>>
>>> select dnafrag.name from genome_db join dnafrag using (genome_db_id) where
>>> coord_system_name = "chromosome" and genome_db.name = "homo_sapiens" and
>>> is_reference = 1 and dnafrag.name not rlike "^[0-9]" order by (dnafrag.name);
>>>
>>> If you prefer to have the MT chromosome at the end, you could also sort by
>>> the length of the chromosome name, but that won't necessarily give you the
>>> expected result for all the genomes.
>>>
>>> As a general rule, you would be better off getting all the chromosome
>>> names using the SQL query and sorting them afterwards in your preferred
>>> scripting language.
>>>
>>> Kind regards
>>>
>>> Javier
>>>
>>>
>>> On 07/06/12 12:04, Toni Hermoso Pulido wrote:
>>>>
>>>> Hello,
>>>>
>>>> sorry if this is a silly question. I was wondering whether there is any
>>>> way to get a list of chromosomes for every species (with their canonical
>>>> order, if not numbers, and naming as in karyotypes*) from the ENSEMBL
>>>> MySQL DB?
>>>> I played a bit with coord_system and karyotype tables, but I don't
>>>> have a clear idea, for instance, for omitting haplotypes and getting
>>>> an actual list.
>>>>
>>>> Thanks in advance,
>>>>
>>>> * Like here: http://www.ensembl.org/Homo_sapiens/Location/Genome
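
Javier's suggestion in the thread, fetching all the chromosome names with
one SQL query and sorting them in a scripting language, might look roughly
like this in Python. It is only a sketch: the function name
chromosome_sort_key and its "last" parameter are made up for illustration,
and the hard-coded list stands in for the query result. Roman-numeral names
(as in Saccharomyces_cerevisiae) are not handled and would still need a
manually kept order list.

```python
import re

def chromosome_sort_key(name, last=("MT",)):
    """Hypothetical sort key: numeric names first, then letters, then `last`."""
    if name in last:
        # Names listed in `last` (an assumption, e.g. MT) go to the very end.
        return (2, 0, name)
    match = re.match(r"(\d+)(.*)", name)
    if match:
        # Leading digits compare as a number; any suffix ("2L", "2R") breaks ties.
        return (0, int(match.group(1)), match.group(2))
    # Remaining names (X, Y, ...) follow in plain alphabetical order.
    return (1, 0, name)

# Stand-in for the names returned by the SQL query.
names = ["10", "MT", "2", "X", "1", "Y", "21", "3"]
ordered = sorted(names, key=chromosome_sort_key)
print(ordered)  # ['1', '2', '3', '10', '21', 'X', 'Y', 'MT']
```

The same key also orders names with extra letters, such as the
Drosophila_melanogaster arms 2L, 2R, 3L, 3R, by number first and letter
second.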



