[ensembl-dev] Source of RefSeq annotations used in VEP

Will McLaren wm2 at ebi.ac.uk
Thu Jun 15 09:09:41 BST 2017


Hi John

The RefSeq GFF files are imported directly into Ensembl's otherfeatures
databases, which are then used by VEP to create the RefSeq cache files [1].
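
If you want to check programmatically which GFF a given RefSeq cache was
built from, something like the sketch below may help. It assumes that
info.txt holds one whitespace-separated key/value pair per line, with
comment lines starting with "#". The function name parse_cache_info and the
key "source_refseq" are illustrative assumptions, so check them against
your own cache before relying on this.

```python
def parse_cache_info(text):
    """Parse key/value lines from the text of a VEP cache info.txt.

    Assumes one key and one value per line, separated by whitespace,
    with '#' marking comment lines (an assumption about the layout).
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace only, so values may contain spaces
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        info[key] = value
    return info


# Hypothetical sample content; the key names in a real cache may differ.
sample = (
    "# example info file\n"
    "species\thomo_sapiens\n"
    "source_refseq\tGCF_000001405.34_GRCh38.p8_genomic.gff\n"
)
info = parse_cache_info(sample)
print(info.get("source_refseq", "no RefSeq source listed"))
```

For a real cache you would read the text from the info.txt inside the
unpacked cache directory and pass it to the same function.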

By default, Ensembl and VEP do not account for any potential sequence
differences between RefSeq sequences and the genome, although for human
GRCh38 the existence of differences is flagged [2]. However, VEP does have
a fairly new feature whereby RefSeq models can be corrected on the fly
using NCBI-issued BAM files [3].
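
As a rough illustration of how a run with this correction might be put
together, here is a small sketch that only builds the command line. It
assumes the vep script is on your PATH and uses the --cache, --refseq and
--bam options. The helper build_vep_command and all file names are
hypothetical, so please check the options against the documentation in [3]
for your VEP release.

```python
def build_vep_command(input_file, output_file, bam_path=None):
    """Build an argument list for running VEP against a RefSeq cache.

    bam_path is the NCBI-issued alignment BAM used to correct RefSeq
    transcript models; when it is None the correction is not requested.
    """
    cmd = ["vep", "--cache", "--refseq", "-i", input_file, "-o", output_file]
    if bam_path is not None:
        # Ask VEP to correct RefSeq models using the supplied alignments
        cmd += ["--bam", bam_path]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_vep_command("variants.vcf", "variants.vep.txt", "refseq_alns.bam")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths
point at real files.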

Hope that helps

Will McLaren
Ensembl Variation

[1] : http://www.ensembl.org/Help/Faq?id=294
[2] : http://www.ensembl.org/info/docs/tools/vep/vep_formats.html#refseq_match
[3] : http://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq

On 14 June 2017 at 16:26, John M.C. Ma <manchunjohn-ma at uiowa.edu> wrote:

> Hi all,
>
> Currently, there's an option of using RefSeq annotations for 17
> species. While the info.txt file in each of the cache files in question
> does list the source GFF annotation (for example, for
> homo_sapiens_refseq_vep_89_GRCh38.tar.gz the source annotation is
> listed as GCF_000001405.34_GRCh38.p8_genomic.gff), does Ensembl use
> the annotations verbatim, or do you perform extra steps to map the
> features in question?
>
> Thanks in advance for your answer.
>
> Best regards,
> John MC Ma
> Department of Lymphoma/Myeloma research
> UT MD Anderson Cancer Center
> Houston, Texas
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>