[ensembl-dev] transform swissprot protein coordinates into genome coordinates

Andy Yates ayates at ebi.ac.uk
Mon Sep 3 16:09:35 BST 2012


Hi Stephane,

Just to make you aware, the previously described technique will only work when we have a 1:1 mapping between the Ensembl translation and the UniProtKB entry, i.e. when the two protein sequences are identical. You can verify this in a couple of ways:

1). Check the DBEntry. If we assigned the ID based on an alignment, you can verify that the identity is 100% for both the Ensembl and the UniProtKB protein. We can also assign UniProtKB accessions based on direct associations; these will not have an identity value but are 1:1.

2). Use an external resource like UniParc, which does 1:1 mappings between different databases such as UniProtKB and Ensembl, e.g.

http://www.uniprot.org/uniparc/UPI000000BC7D
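
To illustrate why the 1:1 requirement matters, here is a rough Python sketch (not the Ensembl API, and not necessarily the technique described earlier in the thread). It first checks that two protein sequences are identical, and then shows the arithmetic a protein-to-genome conversion relies on: residue p sits on CDS bases 3p-2 to 3p, which are then laid over the coding parts of the exons. The function names, the exon list format and the coordinates in the example are all hypothetical; it assumes you have already fetched both sequences and the coding exon coordinates yourself.

```python
from typing import List, Tuple


def same_protein(ensembl_pep: str, uniprot_seq: str) -> bool:
    """Return True when the two sequences are identical (a 1:1 mapping)."""
    def clean(seq: str) -> str:
        # Drop whitespace/newlines, ignore case, drop a trailing stop '*'.
        return "".join(seq.split()).upper().rstrip("*")
    return clean(ensembl_pep) == clean(uniprot_seq)


def pep_to_genomic(pep_start: int, pep_end: int,
                   cds_exons: List[Tuple[int, int]],
                   strand: int) -> List[Tuple[int, int]]:
    """Map a 1-based inclusive peptide range to genomic (start, end) blocks.

    cds_exons (hypothetical format): the coding part of each exon as
    1-based inclusive genomic (start, end), listed in transcript order,
    so in descending genomic order on the reverse strand.
    strand: 1 for forward, -1 for reverse.
    """
    # Residue p occupies CDS bases 3p-2 .. 3p.
    cds_start = 3 * pep_start - 2
    cds_end = 3 * pep_end
    blocks = []
    offset = 0  # number of CDS bases used up by earlier exons
    for ex_start, ex_end in cds_exons:
        length = ex_end - ex_start + 1
        # Overlap between the requested CDS range and this exon,
        # expressed in CDS coordinates.
        lo = max(cds_start, offset + 1)
        hi = min(cds_end, offset + length)
        if lo <= hi:
            if strand == 1:
                blocks.append((ex_start + lo - offset - 1,
                               ex_start + hi - offset - 1))
            else:
                blocks.append((ex_end - (hi - offset - 1),
                               ex_end - (lo - offset - 1)))
        offset += length
    return blocks
```

With two made-up coding exons at 101-109 and 201-212 on the forward strand, pep_to_genomic(2, 4, [(101, 109), (201, 212)], 1) gives [(104, 109), (201, 203)], i.e. the domain is split across the intron. If same_protein() is false, the residue numbering in the UniProtKB entry may be shifted relative to the Ensembl translation and the result cannot be trusted.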

HTH,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 31 Aug 2012, at 15:39, Stéphane Plaisance wrote:

> Dear All,
> 
> I have downloaded the PFAM full list for mm9 and would like to obtain the genomic coordinates for each record (from cols 2+3 below).
> 
> Is there someone with recyclable code for doing so?
> 
> My input is in Swissprot format (coordinates relative to the ATG of the respective accession), as shown below:
>> #Pfam-A regions from Pfam version 26.0 for ncbi taxid 10090 'Mus musculus (strain C57BL/6)'
>> #Total number of proteins in proteome: 47211
>> #<seq id> <alignment start> <alignment end> <envelope start> <envelope end> <hmm acc> <hmm name> <type> <hmm start> <hmm end> <hmm length> <bit score> <E-value> <clan>
>> Q9JKB1	5	215	5	216	PF01088	Peptidase_C12	Domain	1	213	214	271.10	4.6e-78	CL0125
>> E9Q751	46	129	46	129	PF04822	Takusan	Family	1	84	84	96.90	4.5e-25	No_clan
>> D3YVR1	2	161	2	161	PF00743	FMO-like	Family	1	160	532	325.60	4.1e-94	CL0063
>> B1AQR8	223	351	223	352	PF00337	Gal-bind_lectin	Domain	1	132	133	134.30	1.6e-36	CL0004
> 
> This looks feasible and probably needs some external ref and slice magic. I have not used this for several years now and am a bit rusty.
> 
> Thanks for any piece of code to start with
> 
> Stephane
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




