[ensembl-dev] FW: Question on compara gene trees

Kumari, Sunita kumari at cshl.edu
Fri Nov 15 20:26:19 GMT 2013


Hi Ensembl team,

I would really appreciate it if someone could answer my questions soon.

I have not received any response so far, and I am not even sure whether you are getting my emails.

Thanks much.

Sunita




========================

From: Kumari, Sunita
Sent: Thursday, November 14, 2013 3:47 PM
To: dev at ensembl.org
Subject: quick questions for gene trees

Hi Ensembl compara team,

I am using this Ensembl Genomes FTP site to get alignment files and gene
trees in Newick format:

ftp://ftp.ensemblgenomes.org/pub/plants/release-20/emf/ensembl-compara/homologies/

I am using the Compara.gene_trees.20.emf.gz and Compara.newick_trees.20.emf.gz files.

I have a couple of questions and would appreciate any information you can provide.

1. Metadata information on gene trees:

a) Are the trees outgroup or midpoint rooted?

b) Is the branch length unit replacements per position, arbitrary
units, or millions of years?

c) Is the tree style cladogram, phylogram, or phenogram?

d) Is the bootstrap type Felsenstein 1985, aLRT SH-like branch support, or
Bayesian posterior probability?


2. For alignments (Compara.gene_trees.20.emf.gz):

Where can I get the alignment ID, i.e. the 'source DB alignment ID'?
That is, what is the unique identifier for the alignment in the source
database?


3. InParanoid7 provides scores for orthologs, e.g.:
http://inparanoid.sbc.su.se/cgi-bin/e.cgi?species1=93&species2=98&clusters_per_page=50&.submit=Submit+Query&clusterlowerlimit=1

Does the Compara pipeline also provide scores for orthologs?
If not, is there any plan to provide them in the next release?

Looking forward to your reply.

Thanks.

Sunita
________________________________________

Sunita Kumari, PhD
Bioinformatics Scientist,
Ware Lab,
Cold Spring Harbor Labs,
Cold Spring Harbor, NY -11724

________________________________________
From: Kumari, Sunita
Sent: Tuesday, November 12, 2013 3:37 PM
To: dev at ensembl.org
Subject: Question on compara gene trees

Dear Ensembl compara team,


I have a couple of questions on metadata for gene trees. I am using this Ensembl Genomes FTP site to get alignment files and gene trees in Newick format:
ftp://ftp.ensemblgenomes.org/pub/plants/release-20/emf/ensembl-compara/homologies/

Q1. For each tree, can we get the following information? Please confirm the tentative answer given below each item.

a) Whether the tree is outgroup or midpoint rooted;
-- Probably outgroup

b) Whether the branch_length unit is "Replacements per position", "Arbitrary units", or "Million years";
-- Probably arbitrary

c) Whether the tree style is "Cladogram", "Phylogram", or "Phenogram";
-- Phylogram

d) Whether the bootstrap_type is "Felsenstein 1985", "aLRT SH-like branch support", or "Bayesian posterior probability".

Please provide the correct bootstrap type.


Q2. Is it possible to get conservation scores in the next Compara release for Ensembl plant genomes?
What is the likely timeline for these scores becoming available?


Thanks.

Sunita

Sunita Kumari, PhD
Bioinformatics Scientist,
Ware Lab,
Cold Spring Harbor Labs,
Cold Spring Harbor, NY - 11724



