[ensembl-dev] question about chain-net results in table 'genomic_align'
Javier Herrero
jherrero at ebi.ac.uk
Tue Jan 17 09:51:25 GMT 2012
Dear Zhang
Yes, the alignments can span an assembly gap. These are represented as
N's in the sequence, which is like a hard-masked sequence.
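
As a rough illustration of the overlap in your example, here is a minimal Python sketch. It assumes the component placements from the `assembly` rows you quoted and the alignment interval 486-567. The helper names `find_gaps` and `overlap` are hypothetical and not part of any Ensembl API. It only finds gaps between components, since the scaffold length is not given.

```python
# Component placements on scaffold_2621, copied from the quoted `assembly`
# rows as (asm_start, asm_end), 1-based and inclusive.
components = [(488, 717), (1, 419), (718, 1086)]


def find_gaps(components):
    """Return 1-based inclusive intervals not covered by any component.

    Hypothetical helper: only gaps before or between components are
    reported, because the total scaffold length is unknown here.
    """
    gaps = []
    prev_end = 0
    for start, end in sorted(components):
        if start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = max(prev_end, end)
    return gaps


def overlap(a, b):
    """Return the overlap of two inclusive intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


gaps = find_gaps(components)
alignment = (486, 567)  # dnafrag_start, dnafrag_end from the quoted query
hits = [overlap(alignment, g) for g in gaps if overlap(alignment, g)]

print(gaps)  # [(420, 487)]
print(hits)  # [(486, 487)]
```

With these inputs, the alignment block extends two bases (486-487) into the gap, which matches what you describe.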
Could you please explain when and how you get the exception about not
finding the sequence-level pieces?
Kind regards
Javier
On 04/12/11 13:11, Zhang Di wrote:
> Hi,
>
> Finally I got the compara pipeline for whole genome alignment to work.
>
> The results of RAW, CHAIN, and NET are all stored in table
> 'genomic_align', distinguished by different method_link_species_set_id
> values. I found that some CHAIN and NET records contain a few base
> pairs that belong to a gap region of their scaffold.
>
> e.g.
>
> mysql> select method_link_species_set_id, dnafrag_id,
> dnafrag_start, dnafrag_end from genomic_align where dnafrag_id =
> 4465 and dnafrag_start=486;
> +----------------------------+------------+---------------+-------------+
> | method_link_species_set_id | dnafrag_id | dnafrag_start | dnafrag_end |
> +----------------------------+------------+---------------+-------------+
> |                          2 |       4465 |           486 |         567 |
> |                          3 |       4465 |           486 |         567 |
> +----------------------------+------------+---------------+-------------+
>
>
> In my core database, dnafrag_id = 4465 corresponds to
> scaffold_2621 (seq_region_id = 429785):
>
> mysql> select * from assembly where asm_seq_region_id = 429785;
> +-------------------+-------------------+-----------+---------+-----------+---------+-----+
> | asm_seq_region_id | cmp_seq_region_id | asm_start | asm_end | cmp_start | cmp_end | ori |
> +-------------------+-------------------+-----------+---------+-----------+---------+-----+
> |            429785 |            181573 |       488 |     717 |         1 |     230 |  -1 |
> |            429785 |            191688 |         1 |     419 |         1 |     419 |   1 |
> |            429785 |            220761 |       718 |    1086 |         1 |     369 |   1 |
> +-------------------+-------------------+-----------+---------+-----------+---------+-----+
>
>
>
> The interval 420-487 is a gap.
>
> Is this a normal result of CHAIN-NET?
>
> It is quite annoying because I want to use the compara_db for a
> low-coverage gene build, and it complains:
>
> EXCEPTION:
> Could not find sequence-level pieces for scaffold_2621/486-744
>
> Best regards
>
> --
> Zhang Di
>
>
--
Javier Herrero, PhD
Ensembl Coordinator and Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK