[ensembl-dev] Ancestral alleles information

Thomas Walsh twalsh at ebi.ac.uk
Fri Aug 2 14:13:33 BST 2024


Hi Murillo,

Thanks for getting in touch and for your interest in the ancestral 
alleles data.

I'll try to address each of your questions in turn.

> How does Ensembl decide which species to call ancestral alleles for?

My understanding is that the original work on this was done as part of 
the 1000 Genomes Project, with the 6-Primates EPO being used to generate 
ancestral sequences, from which the ancestral alleles of human variants 
were then inferred. If you haven't seen it already, supplementary 
section 8.3 of the 1000 Genomes paper ( 
https://doi.org/10.1038/nature15393 ) provides some more detail.

The Primates EPO has expanded somewhat since then, and ancestral 
sequences have continued to be available for the Primates EPO species, 
with ancestral alleles being extracted from these.

Currently, high-coverage primate genome assemblies that take part in our 
comparative analyses are generally added to the Primates EPO, and 
therefore to the set of species with ancestral sequences.

> Can we request new species be added?

In general, there's no harm in asking. It may or may not be possible to 
facilitate such a request, but letting us know here or by contacting 
the Ensembl helpdesk (helpdesk at ensembl.org) will at least help us get 
a sense of which species users are interested in.

In this particular case, it depends on the species you are interested in 
and on our capacity to add it. We are currently very constrained in 
terms of which species we can add to the Primates EPO, but there are a 
couple of species -- Marmoset (Callithrix jacchus) and Olive baboon 
(Papio anubis) -- which might be feasible to include in an upcoming 
release, as they are already involved in some comparative analyses but 
not currently in the Primates or Mammals EPO. Would you be interested in 
the ancestral sequences of either of these two species?

> Can I run the ancestral allele pipeline for my own species/EPO 
> alignment of choice?

There's no harm in trying. Results may vary depending on the species and 
EPO alignment, and on the phylogenetic context of the species within the 
EPO species tree. Intuitively, there would likely be a greater number of 
high-confidence ancestral allele calls for a species nestled among 
closely related or slowly evolving species, where the sister and 
ancestral sequences are more likely to be in agreement with each other.
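
As a rough way to compare results between species or alignments, you 
could tally the calls in an ancestral sequence by confidence class. The 
sketch below assumes the character convention that I understand the 
README accompanying the ancestral FASTA files to describe (uppercase 
ACGT for high-confidence calls, lowercase acgt for low-confidence calls, 
N for a failed call, - for an insertion in the extant species, and . for 
no alignment coverage); please check it against the README for your 
release. The function names and the toy sequence are hypothetical, and 
this is not part of the Ensembl pipeline itself.

```python
from collections import Counter


def tally_calls(sequence: str) -> Counter:
    """Count ancestral calls by confidence class.

    Assumed convention (verify against the README for your release):
    ACGT = high confidence, acgt = low confidence, N = failed call,
    '-' = insertion in the extant species, '.' = no alignment coverage.
    """
    counts = Counter()
    for base in sequence:
        if base in "ACGT":
            counts["high"] += 1
        elif base in "acgt":
            counts["low"] += 1
        elif base == "N":
            counts["failed"] += 1
        elif base == "-":
            counts["insertion"] += 1
        elif base == ".":
            counts["no_coverage"] += 1
        else:
            # Anything outside the assumed alphabet is kept separate.
            counts["other"] += 1
    return counts


def high_confidence_fraction(sequence: str) -> float:
    """Fraction of attempted calls (high + low + failed) that are high confidence."""
    counts = tally_calls(sequence)
    attempted = counts["high"] + counts["low"] + counts["failed"]
    return counts["high"] / attempted if attempted else 0.0


# Toy sequence for illustration only, not real data.
toy_sequence = "ACGTacgNN-..T"
print(tally_calls(toy_sequence))
print(high_confidence_fraction(toy_sequence))
```

Comparing this fraction across candidate species or EPO alignments would 
give a first indication of how much the phylogenetic context is helping.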

Regards,

Thomas Walsh.

On 2024-07-30 02:01, Murillo Rodrigues wrote:

> Hi,
> 
> I noticed that Ensembl publishes ancestral alleles for a few species, 
> e.g. https://ftp.ensembl.org/pub/release-112/fasta/ancestral_alleles/
> 
> These are calculated based on EPO alignments that are built for subsets 
> of species.
> 
> How does Ensembl decide which species to call ancestral alleles for? 
> Can we request new species be added? Can I run the ancestral allele 
> pipeline for my own species/EPO alignment of choice?
> 
> Thank you for the help!
> 
> Murillo

-- 

Thomas Walsh
Senior Bioinformatician, Ensembl Compara
European Bioinformatics Institute (EMBL-EBI)
Wellcome Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Email: twalsh at ebi.ac.uk