[ensembl-dev] transform swissprot protein coordinates into genome coordinates

Stéphane Plaisance stephane.plaisance at vib.be
Fri Aug 31 15:39:36 BST 2012


Dear All,

I have downloaded the full Pfam list for mouse (mm9) and would like to obtain the genomic coordinates of each record (from columns 2 and 3 below).

Does anyone have reusable code for doing this?

My input uses Swiss-Prot protein coordinates (residue positions relative to the ATG of the respective accession), as shown below:
> #Pfam-A regions from Pfam version 26.0 for ncbi taxid 10090 'Mus musculus (strain C57BL/6)'						
> #Total number of proteins in proteome: 47211										
> #<seq id> <alignment start> <alignment end> <envelope start> <envelope end> <hmm acc> <hmm name> <type> <hmm start> <hmm end> <hmm length> <bit score> <E-value> <clan>
> Q9JKB1	5	215	5	216	PF01088	Peptidase_C12	Domain	1	213	214	271.10	4.6e-78	CL0125
> E9Q751	46	129	46	129	PF04822	Takusan	Family	1	84	84	96.90	4.5e-25	No_clan
> D3YVR1	2	161	2	161	PF00743	FMO-like	Family	1	160	532	325.60	4.1e-94	CL0063
> B1AQR8	223	351	223	352	PF00337	Gal-bind_lectin	Domain	1	132	133	134.30	1.6e-36	CL0004
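
Reading this table is the easy part. A minimal Python sketch, assuming the file is tab-separated as above with comment lines starting with '#'; the names PfamRegion and read_pfam_regions are my own, not part of any library:

```python
import io
from typing import Iterator, NamedTuple


class PfamRegion(NamedTuple):
    seq_id: str    # UniProt accession (column 1)
    start: int     # alignment start, 1-based residue (column 2)
    end: int       # alignment end, inclusive (column 3)
    hmm_acc: str   # Pfam accession (column 6)
    hmm_name: str  # Pfam family name (column 7)


def read_pfam_regions(handle) -> Iterator[PfamRegion]:
    """Yield one PfamRegion per data line, skipping blank and '#' lines."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield PfamRegion(cols[0], int(cols[1]), int(cols[2]), cols[5], cols[6])


# Two records copied from the listing above, held in memory for illustration.
sample = (
    "#Total number of proteins in proteome: 47211\t\t\n"
    "Q9JKB1\t5\t215\t5\t216\tPF01088\tPeptidase_C12\tDomain\t1\t213\t214\t271.10\t4.6e-78\tCL0125\n"
    "E9Q751\t46\t129\t46\t129\tPF04822\tTakusan\tFamily\t1\t84\t84\t96.90\t4.5e-25\tNo_clan\n"
)
regions = list(read_pfam_regions(io.StringIO(sample)))
```

For a real file, the same function can be given an open file handle instead of the in-memory sample.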

This looks feasible and probably needs some external references and slice magic, but I have not used the API for several years and am a bit rusty.
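
To make clear what I am after, here is the arithmetic as I understand it, as a rough Python sketch rather than Ensembl API code. It assumes the coding exons of the matching transcript are already known as 1-based inclusive genomic intervals (with UTRs trimmed off) and that the CDS starts at the ATG in frame. The function name and the exon coordinates in the example are made up for illustration; getting the real exons from a Swiss-Prot accession is exactly the part I need help with.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (genomic start, genomic end), 1-based, inclusive


def protein_to_genomic(pep_start: int, pep_end: int,
                       cds_exons: List[Segment], strand: int) -> List[Segment]:
    """Map a 1-based inclusive residue range to genomic segments.

    cds_exons: coding parts of the exons only, as genomic intervals.
    strand: +1 or -1.
    Returns segments in transcription order, each with start <= end.
    """
    if pep_start < 1 or pep_end < pep_start:
        raise ValueError("invalid residue range")

    # Residue p covers CDS bases 3*(p-1) .. 3*p - 1 (0-based).
    cds_from = 3 * (pep_start - 1)  # inclusive
    cds_to = 3 * pep_end            # exclusive

    # Walk the exons in transcription order (descending on the minus strand).
    ordered = sorted(cds_exons, reverse=(strand < 0))
    segments: List[Segment] = []
    consumed = 0  # CDS bases covered by the exons visited so far
    for g_start, g_end in ordered:
        length = g_end - g_start + 1
        lo = max(cds_from, consumed)
        hi = min(cds_to, consumed + length)
        if lo < hi:  # the requested range overlaps this exon
            off_lo, off_hi = lo - consumed, hi - consumed
            if strand > 0:
                segments.append((g_start + off_lo, g_start + off_hi - 1))
            else:
                segments.append((g_end - off_hi + 1, g_end - off_lo))
        consumed += length

    if cds_to > consumed:
        raise ValueError("residue range extends beyond the CDS")
    return segments


# Hypothetical two-exon CDS of 90 bases (30 residues).
exons = [(101, 130), (201, 260)]
forward = protein_to_genomic(5, 15, exons, +1)
reverse = protein_to_genomic(18, 25, exons, -1)
```

A domain that spans an intron comes back as several segments, so each Pfam record may yield more than one genomic interval.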

Thanks for any piece of code to get me started.

Stephane