[ensembl-dev] Plugin: ExAC print AC/AN for each population

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Mon Nov 2 11:32:49 GMT 2015


OK, got it. The updated version is on my GitHub, linked in my 
previous mail.

The problem is that "AN" was being used for the total AN, whereas all the 
population frequencies in ExAC are adjusted. AN_adj should be used so that 
the output is consistent with what the ExAC website shows.
In my last commit I've added this logic: it uses AN_adj if it exists 
and is not zero. This shouldn't change the population allele frequencies, 
only the total AN count.
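
To make the fallback concrete, here is a minimal standalone sketch of the rule described above. It is not the code from the commit: total_an() and parse_info() are hypothetical helpers, the INFO string is made up from the numbers in this thread rather than copied from the ExAC VCF, and the adjusted field is matched case-insensitively because its exact spelling in the VCF header is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a semicolon-separated VCF INFO string into a
# hashref. Flag fields without a value are stored as 1.
sub parse_info {
    my ($info_string) = @_;
    my %info;
    for my $field (split /;/, $info_string) {
        my ($key, $value) = split /=/, $field, 2;
        $info{$key} = defined $value ? $value : 1;
    }
    return \%info;
}

# Hypothetical helper: return the adjusted allele number when it is present
# and a positive integer, otherwise fall back to the raw AN.
sub total_an {
    my ($info) = @_;

    # Assumption: the adjusted field is named AN_adj in some capitalisation.
    my ($adj_key) = grep { lc($_) eq 'an_adj' } keys %$info;

    if (defined $adj_key) {
        my $an_adj = $info->{$adj_key};
        return $an_adj if $an_adj =~ /^\d+$/ && $an_adj > 0;
    }
    return $info->{AN};
}

# Illustrative INFO string built from the counts quoted in this thread.
my $info = parse_info('AC=1163;AN=121412;AN_Adj=120986');
printf "total AN: %d\n", total_an($info);
```

With the illustrative values above, the reported total is the adjusted 120986 rather than the raw 121412. A record whose adjusted value is missing or zero falls back to AN.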

I don't know if this will be useful for you guys too.

Regards,
Guillermo.

On 02/11/15 10:43, Guillermo Marco Puche wrote:
> Hello Will,
>
> I've come up with a modified version of your ExAC plugin that writes 
> AC/AN counts per population to the output. You can check the code here: 
> https://github.com/guillermomarco/vep_plugins/blob/master/ExAC.pm
>
> However, while debugging the new columns I found a bug. I'm 
> testing the script with the following mutation, which has ExAC 
> information (http://exac.broadinstitute.org/variant/22-46615880-T-C):
>
> 22      46615880        rs1800234       T       C       . .       AA=T
>
> All the counts per population and the frequencies are identical to those 
> on the ExAC website. However, for the totals I'm getting:
>
>   * Allele count: 1163 (OK)
>   * Allele number: 121412 (this differs from the ExAC website, which
>     shows 120986, the correct sum of all the populations)
>
> Maybe something else is being added up, but I can't find where the 
> problem is, since all the individual population allele counts are OK. 
> I'm going to check the ExAC VCF for this position to see if I can find 
> any clue.
>
> Regards,
> Guillermo.
>
> On 02/11/15 09:23, Guillermo Marco Puche wrote:
>> Hello Will,
>>
>> I'd like to print the AC and AN counts for each population in the ExAC 
>> VEP plugin output. I've modified the following lines to store the data:
>>
>>    foreach my $a(@vcf_alleles) {
>>      my $ac =shift @ac;
>>      $total_ac += $ac;
>>      $data->{$a}->{'ExAC_'.$afh.'_AC'} = $ac;
>>      $data->{$a}->{'ExAC_'.$afh.'_AN'} = $an;
>>      $data->{$a}->{'ExAC_'.$afh} = sprintf("%.3g", $ac / $an);
>>    }
>>
>>    # use total to get ref allele freq
>>    $data->{$ref_allele}->{'ExAC_'.$afh.'_AC'} = $total_ac;
>>    $data->{$ref_allele}->{'ExAC_'.$afh.'_AN'} = $an;
>>    $data->{$ref_allele}->{'ExAC_'.$afh} = sprintf("%.3g",1 - ($total_ac / $an));
>> }
>>
>> However, I'm getting a bit lost when it comes to the header function. It 
>> loops over the ExAC header, but I don't know how to modify 
>> get_header_info() to output the desired columns.
>>
>>
>> Regards,
>> Guillermo.
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/