[ensembl-dev] How duplicated variants in gnomAD v2 are handled in VEP when remapped to GRCh38?

Wallace Ko myko at l3-bioinfo.com
Tue Mar 5 08:12:27 GMT 2024


Hello,

We just found that VEP 111 (CLI with cache or Ensembl web) reports no
gnomAD v3 AFs (gnomADg...) for the
variant 12-7190512-GGCCTCTGAGGCAGTGAGTGTTCTTGAGGTGGAAAGCCCAGGTGCA-G
(GRCh38).

However, the record can be found on the gnomAD website (
https://gnomad.broadinstitute.org/variant/12-7190512-GGCCTCTGAGGCAGTGAGTGTTCTTGAGGTGGAAAGCCCAGGTGCA-G?dataset=gnomad_r3),
in the gnomAD VCF (
https://gnomad-public-us-east-1.s3.amazonaws.com/release/3.1.2/vcf/genomes/gnomad.genomes.v3.1.2.sites.chr12.vcf.bgz)
and also in the VCF from Ensembl (
http://ftp.ensembl.org/pub/data_files/homo_sapiens/GRCh38/variation_genotype/gnomad/v3.1.2/gnomad.genomes.v3.1.2.sites.chr12_trimmed_info.vcf.bgz).
So I wonder why VEP isn't reporting the corresponding values.
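
In case it helps anyone reproduce the check, here is a minimal sketch of how the lookup in the VCF could be scripted. The helper names (parse_variant_id, bcftools_query_args) are hypothetical, the format string is the one used in the earlier messages quoted below, and I am assuming the file uses chr-prefixed contig names and carries AC, AN and AF INFO tags. It only builds the bcftools command; it does not run it.

```python
import shlex
from typing import List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split a gnomAD-style 'chrom-pos-ref-alt' ID into its four fields."""
    chrom, pos, ref, alt = variant_id.split("-")
    return Variant(chrom, int(pos), ref, alt)


def bcftools_query_args(variant: Variant, vcf: str) -> List[str]:
    """Build a 'bcftools query' argument list that keeps only the exact REF/ALT pair.

    Assumption: contigs are named 'chr<N>' and AC/AN/AF exist as INFO tags.
    """
    region = f"chr{variant.chrom}:{variant.pos}"
    expr = f'REF="{variant.ref}" && ALT="{variant.alt}"'
    # Literal backslash-n: bcftools expands it itself, as in a single-quoted shell string.
    fmt = "%CHROM %POS %ID %REF %ALT %AC %AN %AF\\n"
    return ["bcftools", "query", "-f", fmt, "-r", region, "-i", expr, vcf]


if __name__ == "__main__":
    variant = parse_variant_id(
        "12-7190512-GGCCTCTGAGGCAGTGAGTGTTCTTGAGGTGGAAAGCCCAGGTGCA-G"
    )
    url = (
        "http://ftp.ensembl.org/pub/data_files/homo_sapiens/GRCh38/"
        "variation_genotype/gnomad/v3.1.2/"
        "gnomad.genomes.v3.1.2.sites.chr12_trimmed_info.vcf.bgz"
    )
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in bcftools_query_args(variant, url)))
```

Filtering on both REF and ALT avoids picking up other alleles that start at the same position.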

I'm posting this in the same thread because it also concerns gnomAD AFs,
though I'm not sure whether it's related.
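
For the duplicated-record cases in the messages quoted below, it may be useful to lay out the candidate values side by side. The sketch below is only my own illustration, not a description of what VEP does: summarise_duplicates is a hypothetical helper that groups records sharing CHROM/POS/REF/ALT and, for each group with more than one record, reports the per-record AF (AC/AN), the arithmetic mean of those AFs, and a pooled AF (sum of AC over sum of AN). It assumes whitespace-separated lines in the column order of the bcftools output quoted below, and AN > 0.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Key = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def summarise_duplicates(lines: List[str]) -> Dict[Key, dict]:
    """Summarise sites that have more than one record for the same REF/ALT."""
    groups: Dict[Key, List[Tuple[str, int, int]]] = defaultdict(list)
    for line in lines:
        # Columns: CHROM POS ID REF ALT AC AN [AF]; AF is recomputed from AC/AN.
        chrom, pos, rsid, ref, alt, ac, an = line.split()[:7]
        groups[(chrom, int(pos), ref, alt)].append((rsid, int(ac), int(an)))

    summary: Dict[Key, dict] = {}
    for key, recs in groups.items():
        if len(recs) < 2:
            continue  # not duplicated, nothing to compare
        per_record = [(rsid, ac / an) for rsid, ac, an in recs]  # assumes AN > 0
        summary[key] = {
            "per_record": per_record,
            "mean_af": mean(af for _, af in per_record),
            "pooled_af": sum(ac for _, ac, _ in recs) / sum(an for _, _, an in recs),
        }
    return summary


if __name__ == "__main__":
    records = [
        "chr1 13391946 rs199881782 G A 6177 85026 0.0726484",
        "chr1 13391946 rs200047809 G A 72 211396 0.000340593",
    ]
    for key, info in summarise_duplicates(records).items():
        print(key, info)
```

Comparing these values against gnomADe_AF shows which record, if any, VEP's value matches.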

Regards,
Wallace Ko


On Wed, Feb 28, 2024 at 4:22 PM Wallace Ko <myko at l3-bioinfo.com> wrote:

> Hello,
>
> I found another case where the VEP result is not the same as that of the
> Ensembl website. Please see if this helps your investigation.
>
> For the GRCh38 variant 1-13260172-G-A, 3 records are observed in the VCF:
>
> $ bcftools query -Hf '%CHROM %POS %ID %REF %ALT %AC %AN %AF\n' \
>     -r chr1:13260172 -i 'ALT="A"' \
>     http://ftp.ensembl.org/pub/data_files/homo_sapiens/GRCh38/variation_genotype/gnomad/r2.1.1/exomes/gnomad.exomes.r2.1.1.sites.1.liftover_grch38_no_VEP.vcf.gz \
>     | column -ts' '
> #[1]CHROM  [2]POS    [3]ID        [4]REF  [5]ALT  [6]AC  [7]AN   [8]AF
> chr1       13260172  rs4026293    G       A       76892  147882  0.519955
> chr1       13260172  rs200091507  G       A       33     49948   0.000660687
> chr1       13260172  .            G       A       54     191992  0.000281262
>
> In VEP 111, gnomADe_AF=0.0002813, which corresponds to the 3rd record in
> the VCF.
>
> On the Ensembl website (
> https://ensembl.org/Homo_sapiens/Variation/Population?db=core;r=1:13260172-13260172;v=rs1553173632;vdb=variation;vf=52479317),
> the AF for allele A is shown as 0.520, which corresponds to the first
> record in the VCF.
>
> Regards,
> Wallace Ko
>
>
> On Tue, Feb 27, 2024 at 5:32 PM Wallace Ko <myko at l3-bioinfo.com> wrote:
>
>> Hello Ensembl Team,
>>
>> Distinct variants in GRCh37 can be remapped to the same location in GRCh38.
>> E.g. 1-13497572-G-A (
>> https://gnomad.broadinstitute.org/variant/1-13497572-G-A?dataset=gnomad_r2_1)
>> and 1-13718406-G-A (
>> https://gnomad.broadinstitute.org/variant/1-13718406-G-A?dataset=gnomad_r2_1)
>> in GRCh37 are both remapped to 1-13391946-G-A in GRCh38 according to gnomAD
>> webpages.
>>
>> In the gnomAD v2 VCF on the Ensembl FTP site, both records are present:
>>
>> $ bcftools view -Ou \
>>     http://ftp.ensembl.org/pub/data_files/homo_sapiens/GRCh38/variation_genotype/gnomad/r2.1.1/exomes/gnomad.exomes.r2.1.1.sites.1.liftover_grch38_no_VEP.vcf.gz \
>>     -i 'ALT="A"' chr1:13391946-13391946 \
>>     | bcftools query -f '%CHROM %POS %ID %REF %ALT %AC %AN %AF\n'
>> chr1 13391946 rs199881782 G A 6177 85026 0.0726484
>> chr1 13391946 rs200047809 G A 72 211396 0.000340593
>>
>> From VEP 111, I obtained gnomADe_AF=0.0003406 for this variant.
>>
>> I wonder how VEP decides to report the AF of rs200047809 rather than
>> that of rs199881782 or the arithmetic mean of both.
>>
>> Regards,
>> Wallace Ko
>>
>