[ensembl-dev] different between eGenetics and GNF/Atlas

cy_jiang cy_jiang at 126.com
Thu Jul 7 10:22:19 BST 2016


Do you mean this?





However, when I look at the MT sequence in path/to/77_GRCh37/Homo_sapiens.GRCh37.77.dna.primary_assembly.fa, its length is 16569 bp,
while the length of the MT sequence in hg19 is 16572 bp.
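A length comparison like this one can be reproduced with a small shell helper. This is only a sketch: the function name fasta_seq_length is made up for illustration, and it assumes a plain (uncompressed) FASTA file in which the sequence ID is the first word of the header line.

```shell
# fasta_seq_length NAME [FILE]
# Print the number of bases in the FASTA record whose header ID is NAME.
# Reads FILE if given, otherwise standard input.
# (Hypothetical helper, not part of VEP or Ensembl.)
fasta_seq_length() {
    name="$1"
    shift
    awk -v name="$name" '
        /^>/ {
            # Header line: the ID is the text after ">" up to the first space.
            in_record = (substr($1, 2) == name)
            next
        }
        in_record {
            gsub(/[ \t\r]/, "")    # ignore stray whitespace and CR characters
            total += length($0)
        }
        END { print total + 0 }    # prints 0 if the record was not found
    ' "$@"
}
```

For example, `fasta_seq_length MT Homo_sapiens.GRCh37.77.dna.primary_assembly.fa` for the Ensembl file; a UCSC-style file would probably need `chrM` as the name instead.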

Emma


At 2016-07-07 16:53:40, "Will McLaren" <wm2 at ebi.ac.uk> wrote:

Yes, you should always make sure that the assembly used to call your variants matches the assembly version used by VEP.


Typically you will find this information in the header of the VCF generated by your variant caller.
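As a rough way to look for that information, the sketch below prints the header lines that commonly name the reference. It assumes the variant caller wrote `##reference`, `##assembly` or `##contig` lines, which the VCF format allows but does not guarantee; the function name vcf_assembly_hints is made up for illustration.

```shell
# vcf_assembly_hints [FILE]
# Print VCF header lines that usually identify the reference assembly.
# Reads FILE if given, otherwise standard input.
# (Hypothetical helper; which of these lines exist depends on the caller.)
vcf_assembly_hints() {
    awk '
        !/^#/ { exit }                                # first data line: header is over
        /^##(reference|assembly|contig)=/ { print }
    ' "$@"
}
```

For a compressed file, something like `gzip -dc calls.vcf.gz | vcf_assembly_hints` should work, where calls.vcf.gz stands in for your own file.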


Will


On 7 July 2016 at 09:45, cy_jiang <cy_jiang at 126.com> wrote:

Thank you very much for your quick response!
Does this mean that other chromosomes may have the same problem if the reference we use is GRCh37 and the VEP version is 77?

Thanks,
Emma







At 2016-07-07 16:36:33, "Will McLaren" <wm2 at ebi.ac.uk> wrote:

Ensembl, and VEP, switched to GRCh38 by default from release 76 onwards.


GRCh37 is still available as a database connection or cache download, but the default is to use GRCh38.


Regards


Will


On 7 July 2016 at 09:13, cy_jiang <cy_jiang at 126.com> wrote:

Hi Will,
I am also interested in this problem.


It seems that this is based on GRCh38.

$ echo "MT 4249 . C T" | perl variant_effect_predictor.pl -data -force -pick -check_ref -fields HGVSc,HGVSp -o stdout -hgvs | grep -v '^#'
ENST00000361390.2:c.943C>T      ENSP00000354687.2:p.Pro315Ser

I looked up this position (chrM:4249) in the UCSC Genome Browser on the Human Feb. 2009 (GRCh37/hg19) assembly. It turns out that the base at this position is T, not C.
When I switch the browser to GRCh38, the base changes to C.

As far as I can remember, VEP version 77 is based on GRCh37. Did I miss something?

Emma






At 2016-07-07 10:02:52, "林琼芬" <qiongfen0 at gmail.com> wrote:

Dear Will,
Thank you so much. This helps me a lot. I have found the bug and am trying to fix it. Thanks!


Best regards!
Lin




2016-07-06 18:23 GMT+08:00 Will McLaren <wm2 at ebi.ac.uk>:

The reference allele is incorrect in your input. If you specify --check_ref, VEP will warn you that the allele is wrong (and tell you what the correct allele is) and ignore that input:


$ echo "MT 4249 . T C" | perl variant_effect_predictor.pl --database --force --check_ref
2016-07-06 11:18:22 - Reading input from STDIN (or maybe you forgot to specify an input file?)...
2016-07-06 11:18:22 - Starting...
2016-07-06 11:18:22 - Detected format of input file as vcf


WARNING: Specified reference allele T does not match Ensembl reference allele C on line 1
2016-07-06 11:18:22 - Wrote stats summary to variant_effect_output.txt_summary.html
2016-07-06 11:18:22 - See variant_effect_output.txt_warnings.txt for details of 1 warnings
2016-07-06 11:18:22 - Finished!


Here's the HGVS output with the correct input (assuming that the ref/alts are switched):


$ echo "MT 4249 . C T" | perl variant_effect_predictor.pl -data -force -pick -check_ref -fields HGVSc,HGVSp -o stdout -hgvs | grep -v '^#'
ENST00000361390.2:c.943C>T      ENSP00000354687.2:p.Pro315Ser
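To spot-check a single position against a local FASTA file, independently of VEP, something like the following sketch could be used. The name fasta_base_at is hypothetical, the file is assumed to be uncompressed, and the scan is linear, so an indexed tool such as `samtools faidx` is the better choice for anything beyond one-off checks.

```shell
# fasta_base_at NAME POS [FILE]
# Print the base at 1-based position POS of the FASTA record NAME.
# Exits with status 1 if the record or position is not found.
# (Hypothetical helper; performs a simple linear scan.)
fasta_base_at() {
    name="$1"
    pos="$2"
    shift 2
    awk -v name="$name" -v pos="$pos" '
        /^>/ { in_record = (substr($1, 2) == name); next }
        in_record {
            gsub(/[ \t\r]/, "")          # ignore stray whitespace and CR characters
            n = length($0)
            if (seen + n >= pos) {       # the wanted base is on this line
                print toupper(substr($0, pos - seen, 1))
                found = 1
                exit
            }
            seen += n                    # bases consumed so far in this record
        }
        END { if (!found) exit 1 }
    ' "$@"
}
```

With the GRCh37 file mentioned elsewhere in this thread, `fasta_base_at MT 4249 Homo_sapiens.GRCh37.77.dna.primary_assembly.fa` would print the base that a REF allele at MT:4249 should match for that assembly.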


Regards


Will


On 6 July 2016 at 10:38, 林琼芬 <qiongfen0 at gmail.com> wrote:

Hello Will,
Thanks for your help. I have tried the two methods you gave, but they don't work. There is no problem with the reference allele, nor with "MT". Is there any other solution?


Best regards!
Lin


2016-07-04 16:25 GMT+08:00 Will McLaren <wm2 at ebi.ac.uk>:

Hello Lin,


There is a problem with your input - the reference allele that you have specified does not match the reference genome sequence. You can have VEP check your input for this issue by adding --check_ref to your command line.


You should also use "MT" to refer to the mitochondrial chromosome in place of "M".
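If the input uses "M" throughout, a one-line filter can rename it before the file is passed to VEP. This is a minimal sketch: the function name rename_mito_to_mt is made up, handling "chrM" as well is an added assumption, and header lines such as `##contig` are left untouched. Renaming does not change coordinates or alleles, so it does not fix input called against a different mitochondrial reference.

```shell
# rename_mito_to_mt [FILE]
# Rewrite a leading "M" or "chrM" chromosome name to "MT" on data lines.
# Works for tab- or space-separated input; lines starting with "#" never match.
# (Hypothetical helper; uses sed -E, available in GNU and BSD sed.)
rename_mito_to_mt() {
    sed -E 's/^(chr)?M([[:space:]])/MT\2/' "$@"
}
```

For example, `echo "M 4249 . C T" | rename_mito_to_mt` yields a line starting with "MT" that can be piped on to VEP.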


Regards


Will McLaren
Ensembl Variation


On 4 July 2016 at 03:53, qiongfen0 at gmail.com wrote:

Dear Thomas,
Thanks for your reply; it has helped me a lot.
Now I have another point of confusion, and I hope you can help me understand it.
I am using ensembl-tools-release-77. When I use VEP to annotate variants in the mitochondrial genome, some have an HGVSp result but no HGVSc (as in the attached screenshot), which seems quite strange. I then tried the online VEP, but it returned neither HGVSc nor HGVSp. I wonder what causes this result.

Hope to hear from you.


Yours sincerely,
Lin


From: Thomas Maurel
Date: 2016-06-28 17:30
To: Ensembl developers list
Subject: Re: [ensembl-dev] different between eGenetics and GNF/Atlas
Dear Lin,


I am afraid that this data was retired in Ensembl release 76. These might not match as the data is coming from two different sources:


GNF/Atlas data came to us via the Gene Expression Atlas project at EMBL-EBI.

http://www.ebi.ac.uk/gxa/

The GNF/Atlas data was published by the Genomics Institute of the Novartis Research Foundation:
 
http://www.gnf.org/technology/organismal/gene-expression-core.htm
 
The eGenetics database uses Expressed Sequence Tags (ESTs) annotated with eVOC ontology terms by SANBI (South African National Bioinformatics Institute). More information below.

http://www.ncbi.nlm.nih.gov/pubmed/12799354
http://www.sanbi.ac.za/


Hope this helps,
Best Regards,
Thomas
On 28 Jun 2016, at 03:04, qiongfen0 at gmail.com wrote:

Dear Sirs,
I'm using BioMart to filter a series of genes which are specifically expressed in the brain. In BioMart there are two such filters, 'eGenetics/SANBI EST anatomical system data' and 'GNF/Atlas organism part'; however, the results of these two filters don't match. I searched the BioMart help but couldn't find anything about this. Can anybody help me understand the difference between these two filters?
I am looking forward to hearing from you.

Yours sincerely,
Lin

Qiongfen Lin
South China Normal University
TEL: +8615118845463| Mail : qiongfen0 at gmail.com
 
_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/


--
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom











--


Arron Lin

BGI Research Institute

Email: qiongfen0 at gmail.com

Beishan Industrial Zone| Yantian  District| Shenzhen 518083












-------------- next part --------------
A non-text attachment was scrubbed...
Name: Catch.jpg
Type: image/jpeg
Size: 75431 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160707/96284779/attachment.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Catch3406.jpg
Type: image/jpeg
Size: 35968 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160707/96284779/attachment-0001.jpg>

