[ensembl-dev] Centromere regions

Monika Komorowska monika at ebi.ac.uk
Tue Feb 14 11:59:45 GMT 2012


Hi Henrikki

You can use the fetch_all_by_chr_name method in Bio::EnsEMBL::DBSQL::KaryotypeBandAdaptor to get all KaryotypeBand objects for a chromosome, then iterate through them until you find the 2 objects with stain = 'acen'. Their coordinates will give you the location of the chromosome's centromere.

More information on the above objects can be found in the Core API documentation:

http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1KaryotypeBand.html



This is an example MySQL query to get the centromere for chromosome X:

select k.seq_region_start, k.seq_region_end
from karyotype k
inner join seq_region sr on k.seq_region_id = sr.seq_region_id
inner join coord_system cs on sr.coord_system_id = cs.coord_system_id
where cs.name = 'chromosome'
  and cs.version = 'GRCh37'
  and sr.name = 'X'
  and k.stain = 'acen'
order by k.seq_region_start;

+------------------+----------------+
| seq_region_start | seq_region_end |
+------------------+----------------+
|         58100001 |       60600000 | 
|         60600001 |       63000000 | 
+------------------+----------------+
2 rows in set (0.00 sec)

The seq_region_start in the first row is the start co-ordinate of the centromere (58100001), and the seq_region_end in the second row is the end co-ordinate (63000000).
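
For anyone post-processing the query output outside the Perl API, the step of combining the 'acen' rows into one centromere interval can be sketched in Python as below. This is only an illustration: the function name centromere_span and the (start, end, stain) tuple layout are assumptions made for the example, not part of the Ensembl API or schema. The only data used are the two chromosome X rows shown above.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: (seq_region_start, seq_region_end, stain)
Band = Tuple[int, int, str]


def centromere_span(bands: Iterable[Band]) -> Optional[Tuple[int, int]]:
    """Return (start, end) covering every 'acen' band, or None if there is none."""
    acen = [(start, end) for start, end, stain in bands if stain == "acen"]
    if not acen:
        # No centromere bands annotated for this chromosome
        return None
    # Smallest start and largest end across the acen bands
    return min(start for start, _ in acen), max(end for _, end in acen)


# The two acen rows returned by the query for chromosome X (GRCh37)
rows = [
    (58100001, 60600000, "acen"),
    (60600001, 63000000, "acen"),
]

print(centromere_span(rows))  # (58100001, 63000000)
```

Taking the minimum start and maximum end, rather than relying on row order, means the result does not depend on the rows being sorted.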


Hope this helps

Monika

On 14 Feb 2012, at 11:27, Daniel Lawson wrote:

> Dear Henrikki,
> 
> Centromeres are amongst the hardest parts of a genome to sequence and assemble, as they tend to be highly repetitive. I do not have personal knowledge of the availability of centromeres in the vertebrate assemblies, but my expectation is that they will be poorly represented. Someone from the genebuild team or helpdesk may be able to provide more information.
> 
> regards
> Dan
> 
> On 14 February 2012 10:18, Henrikki Almusa <henrikki.almusa at helsinki.fi> wrote:
> Hi all,
> 
> I would like to retrieve centromere regions for Ensembl genomes, but I can't find anything on how they are marked in the database. I will use the Perl API to retrieve them from a local copy. How are these marked in the database?
> 
> Regards,
> -- 
> Henrikki Almusa
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
> 
> 
> -- 
> Ensembl Genomes | VectorBase | i5K insect genome initiative

Monika Komorowska
EnsEMBL Software Developer

European Bioinformatics Institute (EMBL-EBI)
tel: +44(0) 1233 494 409
