[ensembl-dev] Centromere regions

Henrikki Almusa henrikki.almusa at helsinki.fi
Tue Feb 14 14:57:06 GMT 2012


On 2012-02-14 13:59, Monika Komorowska wrote:
> Hi Henrikki
>
> You can use the fetch_all_by_chr_name method in
> Bio::EnsEMBL::DBSQL::KaryotypeBandAdaptor to get all KaryotypeBand
> objects for a chromosome and iterate through the objects until you get 2
> objects with stain = 'acen'. Their coordinates will give you the
> location of a chromosome's centromere.
>
> More information on the above objects can be found in the Core API
> documentation:
>
> http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1KaryotypeBand.html
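
The iteration described above can be sketched in Python as follows. This is only an illustration of the logic, not the Ensembl Perl API: the `Band` class is a hypothetical stand-in for a KaryotypeBand object, and the band names and the flanking band are illustrative. The two 'acen' coordinate pairs are the chromosome X values from the query result quoted in this thread.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Band:
    """Hypothetical stand-in for a KaryotypeBand object (not the Ensembl class)."""
    name: str
    stain: str
    start: int
    end: int


def centromere_span(bands: Iterable[Band]) -> Optional[Tuple[int, int]]:
    """Return (start, end) covering all 'acen' bands, or None if there are none."""
    acen = [b for b in bands if b.stain == "acen"]
    if not acen:
        return None
    return min(b.start for b in acen), max(b.end for b in acen)


# Chromosome X 'acen' coordinates as quoted in this thread;
# band names and the flanking 'gneg' band are illustrative only.
bands_x = [
    Band("p11.21", "gneg", 54800001, 58100000),
    Band("p11.1", "acen", 58100001, 60600000),
    Band("q11.1", "acen", 60600001, 63000000),
]
print(centromere_span(bands_x))  # (58100001, 63000000)
```

Taking the minimum start and maximum end avoids relying on the order in which the bands are returned, and returning None covers chromosomes with no 'acen' bands.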

Great, this is exactly what I need. Thanks.

> This is an example MySQL query to get the centromere for chromosome X
>
> select seq_region_start, seq_region_end from karyotype k inner join
> seq_region sr on k.seq_region_id = sr.seq_region_id inner join
> coord_system cs on sr.coord_system_id = cs.coord_system_id where cs.name
> = 'chromosome' and cs.version = 'GRCh37' and sr.name = 'X' and stain =
> 'acen' order by seq_region_start;
>
> +------------------+----------------+
> | seq_region_start | seq_region_end |
> +------------------+----------------+
> |         58100001 |       60600000 |
> |         60600001 |       63000000 |
> +------------------+----------------+
> 2 rows in set (0.00 sec)
>
> The seq_region_start in the first row is the start coordinate of the
> centromere (58100001); the seq_region_end in the second row is the end
> coordinate (63000000).
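
A minimal sketch of the same query collapsed to one row per chromosome, with a row count so that the number of 'acen' bands is checked rather than assumed. It runs against an in-memory SQLite database standing in for a local MySQL core database; the tables are trimmed to the columns the query uses, the IDs are made up, and the only data loaded are the two chromosome X rows shown above. The `centromeres` helper is a hypothetical name, not part of any Ensembl API.

```python
import sqlite3

# In-memory stand-in for a core database (assumption: trimmed schema, made-up IDs).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coord_system (coord_system_id INTEGER, name TEXT, version TEXT);
    CREATE TABLE seq_region (seq_region_id INTEGER, name TEXT, coord_system_id INTEGER);
    CREATE TABLE karyotype (seq_region_id INTEGER, seq_region_start INTEGER,
                            seq_region_end INTEGER, stain TEXT);
    INSERT INTO coord_system VALUES (1, 'chromosome', 'GRCh37');
    INSERT INTO seq_region VALUES (10, 'X', 1);
    INSERT INTO karyotype VALUES (10, 58100001, 60600000, 'acen');
    INSERT INTO karyotype VALUES (10, 60600001, 63000000, 'acen');
""")

# One row per chromosome: overall start, overall end, number of 'acen' bands.
QUERY = """
    SELECT sr.name, MIN(k.seq_region_start), MAX(k.seq_region_end), COUNT(*)
    FROM karyotype k
    INNER JOIN seq_region sr ON k.seq_region_id = sr.seq_region_id
    INNER JOIN coord_system cs ON sr.coord_system_id = cs.coord_system_id
    WHERE cs.name = 'chromosome' AND cs.version = ? AND k.stain = 'acen'
    GROUP BY sr.name
"""


def centromeres(connection, assembly):
    """Map chromosome name -> (start, end, number_of_acen_rows)."""
    rows = connection.execute(QUERY, (assembly,))
    return {name: (start, end, n) for name, start, end, n in rows}


print(centromeres(conn, "GRCh37"))  # {'X': (58100001, 63000000, 2)}
```

A chromosome missing from the result has no 'acen' bands in the karyotype table, and a count other than 2 flags a case worth inspecting by hand.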

Just to be sure: can I assume that the query will return two rows for
each chromosome?

Thanks,

> Hope this helps
>
> Monika
>
> On 14 Feb 2012, at 11:27, Daniel Lawson wrote:
>
>> Dear Henrikki,
>>
>> Centromeres are among the hardest parts of a genome to sequence and
>> assemble as they tend to be highly repetitive. I do not have personal
>> knowledge of the availability of centromeres in the vertebrate
>> assemblies but my expectation is that they will be poorly represented.
>> Someone from the genebuild team or helpdesk may be able to provide
>> more information.
>>
>> regards
>> Dan
>>
>> On 14 February 2012 10:18, Henrikki Almusa
>> <henrikki.almusa at helsinki.fi> wrote:
>>
>>     Hi all,
>>
>>     I would like to retrieve centromere regions for Ensembl genomes, but
>>     I can't find anything about how they are marked in the database. I
>>     will use the Perl API to retrieve them from a local copy. How are
>>     these marked in the database?
>>
>>     Regards,
>>     --
>>     Henrikki Almusa
>>
>>     _________________________________________________
>>     Dev mailing list Dev at ensembl.org
>>     List admin (including subscribe/unsubscribe):
>>     http://lists.ensembl.org/mailman/listinfo/dev
>>     Ensembl Blog: http://www.ensembl.info/
>>
>>
>>
>>
>> --
>> Ensembl Genomes | VectorBase | i5K insect genome initiative
>
> Monika Komorowska
> EnsEMBL Software Developer
>
> European Bioinformatics Institute (EMBL-EBI)
> tel: +44(0) 1233 494 409
>


-- 
Henrikki Almusa



