[ensembl-dev] Quick question about --af_esp option in VEP

Joseph A Prinz joseph.prinz at duke.edu
Fri Mar 1 17:03:29 GMT 2019


Hi Laurent,

Thank you so much for your clarifications--this really helps.

Also, I am quite sure that I am not using --no_check_alleles, and according to the docs it is off by default.

Best,
Joey

________________________________________
From: Laurent Gil <lgil at ebi.ac.uk>
Sent: Friday, March 1, 2019 6:28 AM
To: Ensembl developers list; Joseph A Prinz
Subject: Re: [ensembl-dev] Quick question about --af_esp option in VEP

Dear Joseph,

Thank you for reporting this.


By design, the JSON output only displays under "frequencies" the frequencies for the alternative allele(s) given in your input file, e.g.:


# Your input:

    14:105415607 C/G

# Your output:

    "AA": "T:0",
    "EA": "T:0.0006983",
    "frequencies": {
          "G": {
            "afr": "0.0015",
            "amr": "0",
            "eas": "0.001",
            "eur": "0",
            "gnomad": "0.0002357",
            ...}
    },
    "allele_string": "C/A/G/T",
    "id":"rs112699389"


However, if the alternative allele in the ESP populations differs from your input alternative allele, the corresponding ESP frequencies are also returned in a separate part of the JSON (the "AA" and "EA" keys), as "allele:frequency". In the example above, the alternative allele for rs112699389 in the ESP populations is T.
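
A minimal sketch of how a script reading this JSON might fold both representations into one structure. This is not part of VEP; the helper names parse_flat and collect_frequencies are hypothetical, and the sketch assumes that any top-level string field shaped like "allele:frequency" (optionally comma-separated) is a frequency field, which matches the examples in this thread but is not guaranteed by the documentation.

```python
def parse_flat(value):
    """Parse 'T:0' or '-:0.1349,AA:0.05111' into {allele: frequency}."""
    pairs = {}
    for item in value.split(","):
        allele, sep, freq = item.partition(":")
        if not sep:
            raise ValueError(f"not an allele:frequency pair: {item!r}")
        pairs[allele] = float(freq)  # float() raises ValueError on non-numbers
    return pairs


def collect_frequencies(record):
    """Return {allele: {population: frequency}} for one colocated_variants entry.

    Population names are lower-cased so that the flat "AA"/"EA" keys line up
    with the nested "aa"/"ea" keys.
    """
    merged = {}

    # Nested form: "frequencies" -> allele -> population -> value.
    for allele, pops in record.get("frequencies", {}).items():
        for pop, freq in pops.items():
            merged.setdefault(allele, {})[pop.lower()] = float(freq)

    # Flat form: top-level "POPULATION": "allele:freq[,allele:freq...]".
    # Assumption: fields such as "id" or "allele_string" never parse as pairs.
    for key, value in record.items():
        if not isinstance(value, str) or ":" not in value:
            continue
        try:
            pairs = parse_flat(value)
        except ValueError:
            continue
        for allele, freq in pairs.items():
            merged.setdefault(allele, {})[key.lower()] = freq

    return merged


# Abridged copy of the rs112699389 record from this thread.
record = {
    "AA": "T:0",
    "EA": "T:0.0006983",
    "allele_string": "C/A/G/T",
    "frequencies": {"G": {"afr": "0.0015", "gnomad": "0.0002357"}},
    "id": "rs112699389",
}
print(collect_frequencies(record))
```

With this layout, the ESP values for T and the other frequencies for G can be looked up the same way, by allele first and population second.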


For the second issue, it looks like you are using the flag "--no_check_alleles" <https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_no_check_alleles>. With this flag, VEP does not compare your input alternative allele(s) with the co-located alleles; it simply lists the frequencies of every alternative allele of the co-located variants, e.g.:


# Input:

    2:36785668 A/T

# Output:

    "AA":"AA:0.01702,-:0.02651",
    "EA":"AA:0.01386,-:0.02891",
    "gnomAD_SAS":"-:0.1864,AA:0.08409",
    "gnomAD_OTH":"-:0.1465,AA:0.05941",
    "gnomAD_EAS":"-:0.1653,AA:0.08157",
    "gnomAD_ASJ":"-:0.1746,AA:0.05098",
    "gnomAD_AMR":"-:0.1808,AA:0.08017",
    "gnomAD":"-:0.1349,AA:0.05111",
    "gnomAD_NFE":"-:0.1172,AA:0.03693",
    "gnomAD_FIN":"-:0.09683,AA:0.029",
    "allele_string":"A/AA/-",

    "id":"TMP_ESP_2_36785668_36785668",


I agree that these different ways to display the frequencies of the co-located variants can be confusing and we will try to make it easier to read/parse.
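
For output shaped like the second example above, a downstream script can redo the allele comparison itself. The following is a minimal sketch, not VEP code: the function name frequencies_for_allele is hypothetical, and it assumes every frequency field is a top-level string of comma-separated "allele:frequency" pairs, as in the excerpt.

```python
def frequencies_for_allele(record, allele):
    """Return {field: frequency} for one allele from flat 'allele:freq' fields.

    Note that the field name "AA" is the ESP African American population,
    while the allele "AA" in the values is an inserted sequence; the two are
    unrelated and only happen to share a spelling in this example.
    """
    matched = {}
    for field, value in record.items():
        if not isinstance(value, str) or ":" not in value:
            continue  # skips "id", "allele_string", coordinates, etc.
        for item in value.split(","):
            name, _, freq = item.partition(":")
            if name != allele:
                continue
            try:
                matched[field] = float(freq)
            except ValueError:
                pass  # not a number, so not a frequency field
    return matched


# Abridged copy of the TMP_ESP_2_36785668_36785668 record from this thread.
record = {
    "AA": "AA:0.01702,-:0.02651",
    "EA": "AA:0.01386,-:0.02891",
    "gnomAD": "-:0.1349,AA:0.05111",
    "allele_string": "A/AA/-",
    "id": "TMP_ESP_2_36785668_36785668",
}
print(frequencies_for_allele(record, "AA"))  # insertion allele
print(frequencies_for_allele(record, "T"))   # input allele; matches nothing here
```

An empty result for the input allele means that none of the listed frequencies describe that allele, so they should not be attributed to the input variant.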


Best regards,

Laurent
Ensembl Variation


On 28/02/2019 20:28, Joseph A Prinz wrote:
A follow-up:

I am also seeing oddly parsed frequencies for 'gnomAD'; see below.

I am using VEP version 95.1.

Thanks again,
Joey

'colocated_variants': [
{ 'AA': '-:0.02651',
 'EA': '-:0.02891',
 'allele_string': 'A/-',
 'end': '36785668',
 'gnomAD': '-:0.1349',
 'gnomAD_AFR': '-:0.104',
 'gnomAD_AMR': '-:0.1808',
 'gnomAD_ASJ': '-:0.1746',
 'gnomAD_EAS': '-:0.1653',
 'gnomAD_FIN': '-:0.09683',
 'gnomAD_NFE': '-:0.1172',
 'gnomAD_OTH': '-:0.1465',
 'gnomAD_SAS': '-:0.1864',
 'id': 'rs756392461',
 'start': '36785668',
 'strand': '1'},
{ 'AA': 'AA:0.01702,-:0.02651',
 'EA': 'AA:0.01386,-:0.02891',
 'allele_string': 'A/AA/-',
 'end': '36785668',
 'gnomAD': '-:0.1349,AA:0.05111',
 'gnomAD_AFR': '-:0.104,AA:0.05793',
 'gnomAD_AMR': '-:0.1808,AA:0.08017',
 'gnomAD_ASJ': '-:0.1746,AA:0.05098',
 'gnomAD_EAS': '-:0.1653,AA:0.08157',
 'gnomAD_FIN': '-:0.09683,AA:0.029',
 'gnomAD_NFE': '-:0.1172,AA:0.03693',
 'gnomAD_OTH': '-:0.1465,AA:0.05941',
 'gnomAD_SAS': '-:0.1864,AA:0.08409',
 'id': 'TMP_ESP_2_36785668_36785668',
 'start': '36785668',
 'strand': '1'}
]




________________________________________
From: Joseph A Prinz
Sent: Thursday, February 28, 2019 12:42 PM
To: dev at ensembl.org
Subject: Quick question about --af_esp option in VEP

Hi VEP devs,

I wanted to confirm the output of --af_esp when using JSON output.

Sometimes I see values represented as 'aa' and 'ea' nested under 'frequencies', and sometimes I see values "AA" and "EA" not nested under 'frequencies'.
Further, when they appear as "AA" / "EA", the values do not seem to be parsed in the same way (they appear as "allele:frequency").

I am not sure which values to use, and neither is keyed as 'aa_af' or 'ea_af' as I would expect from the documentation.

Below are examples of both scenarios.

Thanks for taking a look!
Joey

"colocated_variants": [
  {
    "allele_string": "A/G/T",
    "end": "69511",
    "frequencies": {
      "G": {
        "aa": "0.5441",
        "ea": "0.8874",
        "gnomad": "0.9506",
        "gnomad_afr": "0.6074",
        "gnomad_amr": "0.9508",
        "gnomad_asj": "0.9779",
        "gnomad_eas": "0.9995",
        "gnomad_fin": "0.9915",
        "gnomad_nfe": "0.9728",
        "gnomad_oth": "0.9499",
        "gnomad_sas": "0.9854"
      }
    },
    "id": "rs2691305",
    "start": "69511",
    "strand": "1"
  }
]

"colocated_variants": [
  {
    "AA": "T:0",
    "EA": "T:0.0006983",
    "allele_string": "C/A/G/T",
    "end": "105415607",
    "frequencies": {
      "G": {
        "afr": "0.0015",
        "amr": "0",
        "eas": "0.001",
        "eur": "0",
        "gnomad": "0.0002357",
        "gnomad_afr": "0.0002741",
        "gnomad_amr": "0.0005474",
        "gnomad_asj": "0",
        "gnomad_eas": "0.0004838",
        "gnomad_fin": "0",
        "gnomad_nfe": "1.384e-05",
        "gnomad_oth": "0",
        "gnomad_sas": "0.0006331",
        "sas": "0.0133"
      }
    },
    "id": "rs112699389",
    "minor_allele": "T",
    "minor_allele_freq": "0.0072",
    "start": "105415607",
    "strand": "1"
  }
]









_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/



