[ensembl-dev] Error messages when using the Variation API

Anja Thormann anja at ebi.ac.uk
Mon May 23 12:38:43 BST 2016


Hi Johanne,

> 
> Another thing: I have noticed that when I run the same script with the same list of rsIDs as input, I get different sets of variants in LD depending on whether I use hg38 or hg19. More LD variants are found with hg19, whereas I expected the opposite, or at least a similar number. Do you know why?

We haven’t noticed any major differences in LD computation between the two assemblies. Could you please share some examples that we can use to look into this?


> 
> I previously got errors with variants mapping to alternate loci on hg38, and your colleague suggested that I only use loci mapped to the reference genome: next unless ($vf->slice->is_reference); Could this affect the number of SNPs in LD found on hg38? Have you fixed the issue?
> 

This shouldn’t have an effect on the number of variants that are returned. From the next release (85, planned for late July), we will limit LD computation to variants on the reference sequence.


Thank you,
Anja

> Best,
> Johanne
> 
> 
>> On 23 May 2016 at 11:57, Will McLaren <wm2 at ebi.ac.uk> wrote:
>> 
>> Hi Johanne,
>> 
>> It looks like the API is intermittently losing connection to the remote VCF files hosted on our FTP site.
>> 
>> You can bypass this connection by downloading the files to your local machine:
>> 
>> GRCh38: ftp://ftp.ensembl.org/pub/variation_genotype/homo_sapiens/
>> GRCh37: ftp://ftp.ensembl.org/pub/grch37/release-82/variation/vcf/homo_sapiens/1000GENOMES-phase_3-genotypes/
>> 
>> You will then need to edit [module_path]/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/vcf_config.json, changing the "filename_template" entry to the path where you downloaded the files, and "type" from "remote" to "local".
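
The config change described above can be sketched as a small script. This is a minimal Python sketch, not part of the Ensembl API: the function name `point_config_at_local_files`, the top-level "collections" list and the per-collection "assembly" key are assumptions about the layout of vcf_config.json, so check them against your own copy of the file before running it. It keeps the file-name part of each "filename_template", swaps the remote directory for the local download directory, and sets "type" from "remote" to "local".

```python
import json
import shutil
from pathlib import Path


def point_config_at_local_files(config_path, local_dir, assembly=None):
    """Switch remote VCF collections in vcf_config.json to local copies.

    Returns the number of collections that were changed.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())

    # Keep a copy of the original file next to it before overwriting.
    backup_path = config_path.with_name(config_path.name + ".bak")
    shutil.copy2(config_path, backup_path)

    changed = 0
    # Assumption: collections are listed under a top-level "collections" key.
    for collection in config.get("collections", []):
        if collection.get("type") != "remote":
            continue
        # Assumption: each collection names its assembly under "assembly".
        if assembly is not None and collection.get("assembly") != assembly:
            continue
        template = collection.get("filename_template", "")
        # Keep the file name (with any placeholder it contains) and replace
        # only the directory part with the local download directory.
        file_name = template.rsplit("/", 1)[-1]
        collection["filename_template"] = local_dir.rstrip("/") + "/" + file_name
        collection["type"] = "local"
        changed += 1

    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return changed
```

A call might look like point_config_at_local_files("[module_path]/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/vcf_config.json", "/path/to/downloaded/vcf", assembly="GRCh38"), with [module_path] and the download directory replaced by your own locations. The original file is kept alongside as vcf_config.json.bak, so the change is easy to undo.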
>> 
>> Regarding the warning message, this should not affect your analyses in any way, but I have put in a fix on release/84 of ensembl-variation to suppress it.
>> 
>> Regards
>> 
>> Will McLaren
>> Ensembl Variation
>> 
>> On 21 May 2016 at 12:57, Johanne Håøy Horn <johannhh at ifi.uio.no> wrote:
>> Dear ensembl dev team,
>> 
>> I have been using your variation API for some time now, and get a range of errors from time to time, without knowing exactly why. It is not because of the scripts, I think, as the same script producing the error can work just fine if I run it again.
>> 
>> The different error messages are:
>> /Parser/BaseVCF4.pm line 891, <IN> line 5.
>> Use of uninitialized value in list assignment at /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/Parser/BaseVCF4.pm line 891, <IN> line 5.
>> connect: Operation timed out
>> [kftp_connect_file] 350 Restarting at 654385206. Send STORE or RETRIEVE to initiate transfer
>>   
>> [kftp_connect_file] 227 Entering Passive Mode (193,62,203,85,220,250).
>> Tabix::tabix_query: t is not of type tabix_tPtr at /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/TabixParser.pm line 70.
>>   
>> [kftp_connect_file] 227 Entering Passive Mode (193,62,203,85,157,134).
>> [main] fail to open the data file.
>> Can't use an undefined value as an ARRAY reference at /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/Parser/BaseVCF4.pm line 730.
>> 
>> Usually just one of these errors occurs at a time. I suspect it might have something to do with the connection between my computer and the Ensembl database, as the first error, at least, always shows up repeatedly when I lose my Internet connection. However, are all of them caused by Internet trouble? I have checked that the MySQL instance is up and running, and I can visit web pages through a browser when some of the errors occur. Could it be something on the server/database side?
>> 
>> Also, if I use the GRCh37 database:
>> 
>> $registry->load_registry_from_db(
>>   -host => 'ensembldb.ensembl.org',
>>   -user => 'anonymous',
>>   -port => 3337,
>> );
>> 
>> I get this warning/printout:
>> Use of uninitialized value $nums{"."} in numeric comparison (<=>) at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/VCFCollection.pm line 770, <IN> line 6.
>> 
>> I use version 84 of the Ensembl API on OS X 10.11.5, and the script I use when all of these errors occur is attached. Note that the attached script uses hg38 by default, but will produce the last printout mentioned when switched to GRCh37.
>> 
>> And something different I have been wondering about:
>> The VCF files that are downloaded locally (ALL.chr1.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.GRCh38_dbSNP.vcf.gz.tbi, for instance) - should they be deleted and re-downloaded from time to time to get the latest 1000G data? And where exactly are the VCFs downloaded from? Is it dbSNP, as indicated in the file name?
>> 
>> Best,
>> Johanne Håøy Horn
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>> 
