[ensembl-dev] Ensembl ID History Converter (IDmapper.pl) API Mapping Score Column

Lucas Swanson lswanson at bcgsc.ca
Thu Jun 7 17:18:19 BST 2012


Thanks Andy, that is great!

However, I am a little confused about some results where the score is "0".
For example, with the example input included with the API (./IDmapper.pl
-s human -f idmapper.in), the results include:

Old stable ID, New stable ID, Release, Mapping score
ENSG00000137361.1, ENSG00000137361.1, 3, 0
ENSG00000137361.1, ENSG00000137361.1, 6, 0
ENSG00000137361.1, ENSG00000137361.1, 7, 0
ENSG00000137361.1, ENSG00000137361.1, 10, 0
ENSG00000137361.1, ENSG00000137361.1, 14, 0
ENSG00000137361.1, ENSG00000137361.2, 15, 0
ENSG00000137361.2, ENSG00000137361.3, 18.1, 0
ENSG00000137361.3, ENSG00000137361.4, 18.2, 0
ENSG00000137361.4, ENSG00000137361.5, 21, 0
ENSG00000137361.5, ENSG00000189435.2, 26, 0.682482
ENSG00000137361.5, ENSG00000137361.6, 26, 0
ENSG00000137361.5, ENSG00000187888.2, 26, 0.630839
ENSG00000137361.5, ENSG00000187200.4, 26, 0.676841
ENSG00000137361.6, ENSG00000137361.6, 27, 0
ENSG00000137361.6, ENSG00000206461.1, 38, 0.946767
ENSG00000137361.6, ENSG00000197499.2, 38, 0.0765327
ENSG00000137361.6, <retired>, 38, 0
ENSG00000137361.6, ENSG00000206366.1, 38, 0.946767

Old stable ID, New stable ID, Release, Mapping score
ENSG00000137362.1, ENSG00000137362.1, 3, 0
ENSG00000137362.1, ENSG00000158651.1, 3, 0
ENSG00000137362.1, ENSG00000137362.1, 6, 0
ENSG00000137362.1, ENSG00000137362.1, 7, 0
ENSG00000137362.1, ENSG00000173466.1, 10, 0
ENSG00000137362.1, ENSG00000137362.1, 10, 0
ENSG00000137362.1, ENSG00000173460.1, 10, 0
ENSG00000137362.1, ENSG00000173459.1, 10, 0
ENSG00000137362.1, ENSG00000137362.1, 14, 0
ENSG00000137362.1, ENSG00000137362.2, 15, 0
ENSG00000137362.2, ENSG00000137362.3, 18.1, 0
ENSG00000137362.3, ENSG00000137362.3, 18.2, 0
ENSG00000137362.3, ENSG00000189298.1, 21, 0
ENSG00000137362.3, ENSG00000137362.4, 21, 0
ENSG00000137362.4, <retired>, 26, 0

On the first line, the mapping score between ENSG00000137361.1 and 
ENSG00000137361.1 (the exact same ID) in release 3 is... "0"? And EVERY 
mapping for ENSG00000137362.1 has a mapping score of "0"? I am not 
really certain why two IDs would be connected if they do not "map" to 
each other at all, according to the mapping score.

I am not sure if I have ever seen a mapping score of "1"... Is it 
perhaps the case that anywhere there is a mapping score of "0", it 
should really be interpreted as a mapping score of "1"?
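
To make the pattern easier to see, here is a minimal Python sketch that parses output in the format shown above and lists the rows where the old and new stable IDs share the same base ID (ignoring the version suffix) yet carry a score of 0. The function names and the SAMPLE text are hypothetical helpers for illustration, not part of the Ensembl API, and the sketch only surfaces these rows; it makes no claim about how a score of 0 should be interpreted.

```python
def parse_idmapper_output(text):
    """Parse comma-separated IDmapper.pl-style rows into dicts.

    Assumes each data row has exactly four fields and that header
    rows start with 'Old stable ID' (as in the output quoted above).
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("Old stable ID"):
            continue  # skip blank separators and repeated headers
        old_id, new_id, release, score = [f.strip() for f in line.split(",")]
        rows.append({
            "old": old_id,
            "new": new_id,
            "release": release,  # kept as text: releases like "18.1" occur
            "score": float(score),
        })
    return rows


def base_id(stable_id):
    """Strip the version suffix, e.g. 'ENSG00000137361.6' -> 'ENSG00000137361'."""
    return stable_id.split(".", 1)[0]


def zero_score_self_mappings(rows):
    """Return rows that map an ID to itself (same base ID) with score 0."""
    return [
        r for r in rows
        if r["score"] == 0.0 and base_id(r["old"]) == base_id(r["new"])
    ]


# A few rows copied from the output above, used as sample input.
SAMPLE = """\
Old stable ID, New stable ID, Release, Mapping score
ENSG00000137361.1, ENSG00000137361.1, 3, 0
ENSG00000137361.5, ENSG00000189435.2, 26, 0.682482
ENSG00000137361.5, ENSG00000137361.6, 26, 0
ENSG00000137361.6, <retired>, 38, 0
"""

rows = parse_idmapper_output(SAMPLE)
flagged = zero_score_self_mappings(rows)
for r in flagged:
    print(r["old"], r["new"], r["release"], r["score"])
```

Rows whose new ID is "<retired>" are deliberately not flagged, since they do not map an ID to itself.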

~Thanks,
Lucas Swanson

Andy Yates wrote:
> Hi there Lucas,
>
> I'm sorry I missed answering this question in your original email. The quick answer is that the closer the score is to 1, the better the mapping. A mapping score of 1 means a 100% match.
>
> As for how it's calculated, it is generated from two sources: location-based mapping and alignment-based mapping on exons. The location score is based on the overlap two exons possess. The alignment score is derived from an exonerate alignment, where the score is (2 * match_length / (source_length + target_length)), and it can be further modified depending on various other conditions (we prefer mappings which are located on the same sequence region). Once all scores have been generated, we merge the two sets of results for each exon pair which could be mapped and retain the highest score. Transcript and gene scores are built from these exon scores.
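
The alignment formula and the merge step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the exon-pair keys are hypothetical, the location score is taken as given because its exact calculation is not spelled out here, and neither the additional score modifiers nor the way transcript and gene scores are built from exon scores is modelled.

```python
def alignment_score(match_length, source_length, target_length):
    """Score from the formula 2 * match_length / (source_length + target_length)."""
    total = source_length + target_length
    if total <= 0:
        raise ValueError("source_length + target_length must be positive")
    return 2.0 * match_length / total


def merge_scores(location_scores, alignment_scores):
    """Merge two {exon_pair: score} dicts, keeping the highest score per pair.

    Pairs that appear in only one of the two inputs are kept unchanged.
    """
    merged = dict(location_scores)
    for pair, score in alignment_scores.items():
        if score > merged.get(pair, float("-inf")):
            merged[pair] = score
    return merged


# Hypothetical exon pairs (source exon, target exon) with made-up scores.
location = {("exonA", "exonA2"): 0.5, ("exonB", "exonB2"): 0.9}
alignment = {
    ("exonA", "exonA2"): alignment_score(80, 100, 100),   # 160 / 200
    ("exonC", "exonC2"): alignment_score(100, 100, 100),  # full-length match
}
merged = merge_scores(location, alignment)
```

With this formula, a match covering the full length of two equally long exons gives 1, which agrees with the statement that a score of 1 means a 100% match.
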
>
> Hope this helps,
>
> Andy
>
> Andrew Yates                   Ensembl Core Software Project Leader
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensembl.org/
>
> On 2 Jun 2012, at 00:52, Lucas Swanson wrote:
>
>   
>> Hello,
>>
>> I am using the IDmapper.pl ID history converter to update some old gene IDs to the corresponding gene IDs in the current release.
>>
>> I am uncertain about the "Mapping score" column in the API output. What does it represent/how is it calculated, and is a higher number better, or is a lower number better?
>>
>> ~Thank you,
>> Lucas Swanson
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>     




