[ensembl-dev] Question Regarding Bio::EnsEMBL::Mapper method fastmap.

Andy Yates ayates at ebi.ac.uk
Thu Apr 26 11:29:42 BST 2012


Hi Will,

As you suspected, Ensembl does something slightly different here. We place our features on the top level; you can see this from the meta table, where we have the %build.level% keys. When this level is different from the one you have retrieved a feature for, that is when we start the remapping process. Since our features are placed on the top level, we do not see these issues when a contig has been mapped to more than one sequence region in a coordinate system, as we look at the issue top down, not bottom up.

The API can deal with these situations, but you cannot rely on it figuring this out automatically. All features have a project_to_slice() method, to which you give the target slice you want the feature projected to. This does mean you have to find the slices your contig will project to, keep only those where $slice->is_reference() is true, and then give the resulting slice to the project_to_slice() method.
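
The steps just described (collect the candidate target slices, keep the reference ones, project onto what remains) can be sketched with a small toy model. This is plain Python, not the Ensembl Perl API: the Slice and Mapping classes, the helper functions and all coordinates are hypothetical stand-ins invented for illustration. Only the is_reference() name mirrors the real method, and the real project_to_slice() does considerably more than this simple offset arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """Toy stand-in for a sequence region (hypothetical, not Bio::EnsEMBL::Slice)."""
    name: str
    reference: bool  # False for patches and haplotypes

    def is_reference(self) -> bool:
        return self.reference


@dataclass(frozen=True)
class Mapping:
    """One ungapped assembly block: contig coordinates -> target slice coordinates."""
    target: Slice
    contig_start: int
    target_start: int
    length: int


def reference_targets(mappings):
    """Keep only the mappings whose target slice is flagged as reference."""
    return [m for m in mappings if m.target.is_reference()]


def project_feature(start, end, mapping):
    """Shift a contig-level feature onto the target slice.

    Returns None if the feature is not fully inside the mapped block.
    """
    contig_end = mapping.contig_start + mapping.length - 1
    if start < mapping.contig_start or end > contig_end:
        return None
    offset = mapping.target_start - mapping.contig_start
    return (mapping.target.name, start + offset, end + offset)


# Made-up data: one contig assembled into both a chromosome and a patch.
mappings = [
    Mapping(Slice("21", True), 1, 1_000_001, 50_000),
    Mapping(Slice("HSCHR21_2_CTG1_1", False), 1, 200_001, 50_000),
]

chosen = reference_targets(mappings)
projected = [project_feature(100, 250, m) for m in chosen]
print(projected)
```

With two candidate targets, the filter is what makes the result unambiguous: the choice of target is made explicitly by the caller rather than left to whichever mapping happens to be found first.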

I would also change the dna_align_featurebuild.level key in your meta table to the coordinate system level your align features are on (probably contig or seqlevel). That way you avoid the API attempting to remap to the top level.

Hope some of this has helped you out,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 25 Apr 2012, at 12:26, Will Chow wrote:

> Hi Andy,
> 
> Much thanks for your explanation, it is very helpful.
> 
> I am in exactly that situation.  I have features on the contig (sequence) level and need to project them to the top level, where the contig assembles to both a chromosome and a patch.  I have been getting differing results in terms of features fetched, so I followed through all the code you mention and narrowed it down to the fastmap method.  Perhaps there is something in the schema (meta table) I am missing; the assembly exceptions table should be nearly identical to what is seen in the core database, but I should double check.
> 
> Are the features involved in a patch region, in the core db, mapped on the top level, i.e. different from my situation where things will be projected?
> 
> again thanks for your help.
> 
> Will 
> 
> On 25 Apr 2012, at 12:04, Andy Yates wrote:
> 
>> Hi Will,
>> 
>> The code you have pointed out will only be executed if we are projecting between coordinate systems. DnaAlignFeatureAdaptor::_objs_from_sth() requires the mapper to be passed into it from BaseAdaptor::generic_fetch(). This does happen in BaseFeatureAdaptor::_slice_fetch(), but only when a feature's coordinate system is not the same as the querying slice's coordinate system. The schema stores patches/haplotypes as assembly exceptions, so they are held as mappings between the reference chromosome and the exception. We do not store the relationship of, say, chromosome 21 from NCBI36 to HSCHR21_2_CTG1_1 in GRCh37.p6 (a coordinate system mapping), so there are no multiple mappings.
>> 
>> Hope this helps,
>> 
>> Andy
>> 
>> On 25 Apr 2012, at 11:27, Will Chow wrote:
>> 
>>> Hi dev'ers,
>>> 
>>> Regarding Bio::EnsEMBL::Mapper::fastmap method and its use in the DnaAlignFeatureAdaptor.pm.
>>> 
>>> From what I gather, its use is to return the seq_region_id, start, end and strand in one coordinate system from the same information in another coordinate system (i.e. to project).  However, fastmap seems to return only one set of information, as seen in the code.
>>> 
>>> I was wondering, if there are multiple mappings (like the human patches), which set of information (seq_region_id) will be returned?
>>> 
>>> Perhaps I'm missing something else in the code that explains this; if so, perhaps you can point me to it.
>>> 
>>> much thanks
>>> 
>>> Will
>>> 
>>> 
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/