[ensembl-dev] Error messages when using the Variation API

Will McLaren wm2 at ebi.ac.uk
Mon May 23 13:04:31 BST 2016


Hi Johanne,

You need the filename part of the template too, so:

 "filename_template":
"~/src/ensembl-vcf/ALL.chr###CHR###.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.GRCh38_dbSNP.vcf.gz",

Regards

Will
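
Before re-running the analysis, it can help to confirm that the template
really resolves to a file for every chromosome. The following is a minimal
Python sketch, not part of the Ensembl API. The helper names
expand_template and missing_files are hypothetical. It assumes that
###CHR### is replaced by each entry of "chromosomes" and that a tabix
index (.tbi) sits next to each VCF, which would match the 48 downloaded
files. The sketch expands "~" itself; whether the API does the same is not
confirmed in this thread, so an absolute path in the JSON may be the safer
choice.

```python
import os

# Template from the reply above; "###CHR###" is assumed to be replaced by
# each chromosome name listed under "chromosomes" in vcf_config.json.
TEMPLATE = (
    "~/src/ensembl-vcf/ALL.chr###CHR###.phase3_shapeit2_mvncall_integrated_"
    "v3plus_nounphased.rsID.genotypes.GRCh38_dbSNP.vcf.gz"
)
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]


def expand_template(template, chromosomes):
    """Return one expanded path per chromosome (hypothetical helper)."""
    expanded = os.path.expanduser(template)
    return [expanded.replace("###CHR###", chrom) for chrom in chromosomes]


def missing_files(paths):
    """Return every VCF or .tbi index path that does not exist on disk."""
    candidates = []
    for path in paths:
        candidates.extend([path, path + ".tbi"])
    return [p for p in candidates if not os.path.isfile(p)]


if __name__ == "__main__":
    paths = expand_template(TEMPLATE, CHROMOSOMES)
    missing = missing_files(paths)
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All %d VCF files and their indexes were found." % len(paths))
```

Any path printed under "Missing files" points to a download that is absent
or named differently from what the template expects.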

On 23 May 2016 at 12:57, Johanne Håøy Horn <johannhh at ifi.uio.no> wrote:

> Hello again!
>
> I tried setting the following in the JSON file:
>
>     {
>       "id": "1000genomes_phase3",
>       "species": "homo_sapiens",
>       "assembly": "GRCh38",
>       "type": "local",
>       "strict_name_match": 1,
>       "filename_template": "~/src/ensembl-vcf/",
>       "chromosomes": [
>         "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
>         "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "X", "Y"
>       ],
>       "sample_prefix": "1000GENOMES:phase_3:"
>     },
>
> But I get this error message:
> MSG: ERROR: VCF file ~/src/ensembl-vcf/ not found
>
> I downloaded all the hg38 files you linked to into the folder
> ~/src/ensembl-vcf/. When you say that I need to change filename_template
> to the path where the files were downloaded, do you mean the full path of
> each of the 48 files rather than the path to the folder they are in?
>
> Best,
> Johanne
>
> On 23 May 2016 at 11:57, Will McLaren <wm2 at ebi.ac.uk> wrote:
>
> Hi Johanne,
>
> It looks like the API is intermittently losing connection to the remote
> VCF files hosted on our FTP site.
>
> You can bypass this connection by downloading the files to your local
> machine:
>
> GRCh38: ftp://ftp.ensembl.org/pub/variation_genotype/homo_sapiens/
> GRCh37:
> ftp://ftp.ensembl.org/pub/grch37/release-82/variation/vcf/homo_sapiens/1000GENOMES-phase_3-genotypes/
>
> You will then need to edit
> [module_path]/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/vcf_config.json,
> changing the "filename_template" entry to the path where you downloaded the
> files, and "type" from "remote" to "local".
>
> Regarding the warning message, this should not affect your analyses in any
> way, but I have put in a fix on release/84 of ensembl-variation to suppress
> it.
>
> Regards
>
> Will McLaren
> Ensembl Variation
>
> On 21 May 2016 at 12:57, Johanne Håøy Horn <johannhh at ifi.uio.no> wrote:
>
>> Dear ensembl dev team,
>>
>> I have been using your variation API for some time now, and get a range
>> of errors from time to time, without knowing exactly why. It is not because
>> of the scripts, I think, as the same script producing the error can work
>> just fine if I run it again.
>>
>> The different error messages are:
>> Use of uninitialized value in list assignment at
>> /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/Parser/BaseVCF4.pm
>> line 891, <IN> line 5.
>> connect: Operation timed out
>> [kftp_connect_file] 350 Restarting at 654385206. Send STORE or RETRIEVE
>> to initiate transfer
>>
>> [kftp_connect_file] 227 Entering Passive Mode (193,62,203,85,220,250).
>> Tabix::tabix_query: t is not of type tabix_tPtr at
>> /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/TabixParser.pm line 70.
>>
>> [kftp_connect_file] 227 Entering Passive Mode (193,62,203,85,157,134).
>> [main] fail to open the data file.
>> Can't use an undefined value as an ARRAY reference at
>> /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/Parser/BaseVCF4.pm
>> line 730.
>>
>> Usually just one of these errors occurs at a time. I suspect it might have
>> something to do with the connection between my computer and the Ensembl
>> database, as the first error at least always shows up repeatedly when I
>> lose my Internet connection. However, are all of them caused by Internet
>> trouble? I have checked that the MySQL instance is up and running, and I
>> can visit web pages through a browser when some of the errors occur. Could
>> it be something on the server/database side?
>>
>> Also, if I use the GRCh37 database:
>>
>> $registry->load_registry_from_db(
>>   -host => 'ensembldb.ensembl.org',
>>   -user => 'anonymous',
>>   -port => 3337,
>> );
>>
>> I get this warning/printout:
>> Use of uninitialized value $nums{"."} in numeric comparison (<=>) at
>> /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/VCFCollection.pm
>> line 770, <IN> line 6.
>>
>> I use version 84 of the Ensembl API on OS X 10.11.5, and the script I use
>> when all of these errors occur is attached. Note that the attached script
>> uses hg38 by default, but will produce the last printout mentioned when
>> switched to hg37.
>>
>> And something different I have been wondering about:
>> The VCF files that are downloaded locally
>> (ALL.chr1.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.GRCh38_dbSNP.vcf.gz.tbi,
>> for instance) - should they be deleted and re-downloaded from time to time
>> to get the latest 1000G data? And where exactly are the VCFs downloaded
>> from? Is it dbSNP, as indicated in the file name?
>>
>> Best,
>> Johanne Håøy Horn
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>