[ensembl-dev] Chimp alignment for Chromosome Y

Kathryn Beal kbeal at ebi.ac.uk
Tue Jan 31 11:29:59 GMT 2012


Hi,
The problem is due to previous versions of the human vs primate pairwise alignments being accidentally hard-masked. All the human vs primate pairwise alignments have been rerun for the new release (e66) with soft-masking only.

Cheers
Kathryn
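
For readers unfamiliar with the two terms: by the usual convention, soft-masking keeps repeat bases but writes them in lowercase, while hard-masking replaces them with N, so the original bases are no longer available to the aligner. The sketch below only illustrates that convention. The function names, the toy sequence, and the repeat interval are assumptions made for illustration and are not taken from the Ensembl pipeline.

```python
def soft_mask(seq, repeats):
    """Lowercase each repeat interval; the bases themselves are preserved."""
    chars = list(seq.upper())
    for start, end in repeats:  # 0-based, end-exclusive (assumed convention)
        chars[start:end] = [c.lower() for c in chars[start:end]]
    return "".join(chars)


def hard_mask(seq, repeats):
    """Replace each repeat interval with N; the original bases are lost."""
    chars = list(seq.upper())
    for start, end in repeats:
        chars[start:end] = ["N"] * (end - start)
    return "".join(chars)


# Toy example: a poly-T run flagged as a repeat (hypothetical interval).
seq = "GGGTCATTTTTTTTCTTTTC"
repeats = [(6, 14)]

soft = soft_mask(seq, repeats)
hard = hard_mask(seq, repeats)
print(soft)
print(hard)
```

Uppercasing the soft-masked string recovers the input exactly, which is not possible for the hard-masked one.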


> Hi Yuan,
> I can confirm that these 2 regions look correct in the new release (e66):
> Region 1
> 
> homo_sapiens/1-1621    TATCTCCATGGGGTCATTTTTTTTCTTTTCTTTCTTTCTCTTTTTCTTTCTTTCTTCTATTTTTTTTTTTTTTTTTTTTTTGAGACAGGATCTTGCTCTG
> pan_troglodytes/1-1621 TATCTCCATGGGGTCATTTTTTTTCTTTTCTTTCTTTTTCTTTTTCTTTCTTTCTTCTATTTTTTTTTTTTTTTTTTT---GAGACAGGATCTTGCTCTG
>                        ************************************* ****************************************   *******************
> 
> homo_sapiens/1-1621    TCACCCAGGCTGGAGGGCAGTGGGATGATCACTGCTCACTGCAACCCTTCACTTCTCCAAGGCTTAGGTGATCCTCCCATGATGAGGTCAATTTTTAAAG
> pan_troglodytes/1-1621 TCACCCAGGCTGGAGGGCAGTGGGATGATCACTGCTCACTGAAACC-TTCACTTCTCCAAGGCTTAGGTGATCCTCCCATGATGAGGTCAATTTTTAAAG
>                        ***************************************** **** *****************************************************
> 
> Region 2
> homo_sapiens/1-21      TCAAGGGATCCCCAGGCTCAG
> pan_troglodytes/1-21   TCAAGGGATCCCCAGGCTCAG
>                        *********************
> 
> Thank you for bringing this to our attention.
> Cheers
> Kathryn
> 
>> Thanks Kathryn, it worked when using LASTZ_NET.
>> 
>> There is a slight problem when I checked the two regions on release
>> 65:
>> region 1: Y:7387430-7387460, given human-chimp alignment :
>> 
>> Homo sapiens >  	chromosome:GRCh37:Y:7387430:7387460:1
>> Pan troglodytes >  	chromosome:CHIMP2.1.4:Y:24975139:24975169:-1
>> 
>> Homo sapiens     TTGCTCTGTCACCCAGGCTGGAGGGCAGTGG
>> Pan troglodytes  CTTGCTCTGTCACCCAGGCTGGAGGGCAGTG
>> 
>> There is a 1-base shift.
>> 
>> region 2 : Y:13956063-13956083
>> 
>> Homo sapiens >  	chromosome:GRCh37:Y:13956063:13956083:1
>> Pan troglodytes >  	chromosome:CHIMP2.1.4:1:222316894:222316914:1
>> 
>> Homo sapiens     TCAAGGGATCCCCAGGCTCAG
>> Pan troglodytes  GGCTCAAGGGATCCCCAGGCT
>> 
>> There is a 3-base shift.
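
The two offsets reported above can be checked mechanically from the sequences as displayed. This is a minimal sketch that tries a small range of offsets and keeps the one with the most matching bases. `best_shift` and `max_shift` are hypothetical names, and the sketch does not use the Ensembl API.

```python
def best_shift(ref, other, max_shift=5):
    """Return (shift, matches) for the offset into `other` that best matches `ref`.

    A positive shift means ref[i] lines up with other[i + shift].
    """
    best = (0, -1)
    for shift in range(-max_shift, max_shift + 1):
        matches = 0
        for i, base in enumerate(ref):
            j = i + shift
            if 0 <= j < len(other) and other[j] == base:
                matches += 1
        if matches > best[1]:
            best = (shift, matches)
    return best


# Sequences copied from the e65 output quoted in this message.
region1 = best_shift("TTGCTCTGTCACCCAGGCTGGAGGGCAGTGG",
                     "CTTGCTCTGTCACCCAGGCTGGAGGGCAGTG")
region2 = best_shift("TCAAGGGATCCCCAGGCTCAG",
                     "GGCTCAAGGGATCCCCAGGCT")
print(region1, region2)
```

The overlapping bases agree exactly once the offset is applied, so the chimp coordinates are displaced rather than the sequences being different.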
>> 
>> Hope this can be fixed next time.
>> 
>> yuan
>> 
>> On Fri, 27 Jan 2012 13:39:07 +0000, Kathryn Beal <kbeal at ebi.ac.uk> wrote:
>>> Hi Yuan,
>>> The human vs chimp alignments in e65 were run using lastz instead of
>>> blastz and hence the results have a method_link type of LASTZ_NET not
>>> BLASTZ_NET. 
>>> 
>>> Cheers
>>> Kathryn
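
A script that hard-codes BLASTZ_NET will fail as soon as a release switches aligner. One way to guard against that is to choose from whatever method_link types the release actually provides. The sketch below shows only the selection logic. `pick_method_link_type` is a hypothetical helper, and the `available` sets would in practice have to come from the Compara database; here they are filled in by hand from what this thread reports.

```python
def pick_method_link_type(available, preferred=("LASTZ_NET", "BLASTZ_NET")):
    """Return the first preferred method_link type present in `available`."""
    for method_link_type in preferred:
        if method_link_type in available:
            return method_link_type
    raise LookupError(
        "none of %s found in %s" % (list(preferred), sorted(available))
    )


# Human vs chimp, as reported in this thread (not queried from a database).
e63_types = {"BLASTZ_NET"}
e65_types = {"LASTZ_NET"}

print(pick_method_link_type(e63_types))
print(pick_method_link_type(e65_types))
```

Raising an error when nothing matches keeps the failure at the selection step instead of passing an empty result on to the alignment fetch.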
>>> 
>>>> Hi ALL,
>>>> 
>>>> I am using Ensembl Compara to pull out chimp sequences for the corresponding
>>>> human bases. Most of them are correct, but for a few I get chimp bases
>>>> that differ from the human reference base, whereas UCSC shows the same
>>>> base as the human reference:
>>>> 
>>>> Ensembl API version is 63
>>>> genomedb_name is homo_sapiens
>>>> genomedb_name is pan_troglodytes
>>>> alignment_type is BLASTZ_NET
>>>> chimp_slices is 1
>>>> num slice is 1
>>>> name of the slice is pan_troglodytes
>>>> ref_base is G and ref_pos is 2691796 and chimp base is A and chimp_pos is 23773099 and 1
>>>> chimp_slices is 1
>>>> num slice is 1
>>>> name of the slice is pan_troglodytes
>>>> ref_base is G and ref_pos is 2750827 and chimp base is T and chimp_pos is 23713497 and 2
>>>> chimp_slices is 1
>>>> num slice is 1
>>>> name of the slice is pan_troglodytes
>>>> ref_base is A and ref_pos is 2836431 and chimp base is T and chimp_pos is 23626512 and 3
>>>> chimp_slices is 1
>>>> num slice is 1
>>>> .............. 
>>>> 
>>>> 
>>>> When I switched to Ensembl version 65, I got this error message:
>>>> 
>>>> -------------------- WARNING ----------------------
>>>> MSG: No Bio::EnsEMBL::Compara::MethodLinkSpeciesSet found for
>>>> <BLASTZ_NET> and homo_sapiens(GRCh37), pan_troglodytes(CHIMP2.1.4)
>>>> FILE: Compara/DBSQL/MethodLinkSpeciesSetAdaptor.pm LINE: 681
>>>> CALLED BY: svn/modules/AncestralBase.pm  LINE: 122
>>>> Ensembl API version = 65
>>>> ---------------------------------------------------
>>>> 
>>>> -------------------- EXCEPTION --------------------
>>>> MSG: [] is not a Bio::EnsEMBL::Compara::MethodLinkSpeciesSet
>>>> STACK Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor::fetch_by_Slice_MethodLinkSpeciesSet
>>>> /nfs/team19/yuan/ensembl-checkout/ensembl-compara-65/modules/Bio/EnsEMBL/Compara/DBSQL/AlignSliceAdaptor.pm:138
>>>> STACK AncestralBase::get_ancestral_allele_by_slices_from_db
>>>> /nfs/users/nfs_y/yuan/sanger/src/svn/modules/AncestralBase.pm:135
>>>> STACK main::get_base_from_pos_file ./get_ancestral_allele.pl:111
>>>> STACK toplevel ./get_ancestral_allele.pl:51
>>>> Ensembl API version = 65
>>>> ---------------------------------------------------
>>>> Ensembl API version is 65
>>>> genomedb_name is homo_sapiens
>>>> genomedb_name is pan_troglodytes
>>>> alignment_type is BLASTZ_NET
>>>> 
>>>> Any idea what went wrong please?
>>>> 
>>>> Thanks
>>>> 
>>>> yuan
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> List admin (including subscribe/unsubscribe):
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
