[ensembl-dev] VEP annotation of ENSP

Duarte Molha Duarte.Molha at ogt.co.uk
Wed Nov 16 11:39:24 GMT 2011


Hi there.

I have been trying to add a bit more functionality to the VEP script, and have written some code that adds protein domain details to the protein (ENSP) annotation.

I have changed the VEP script at lines 754-757:

    # protein ID
    if(defined $config->{protein} && $t->translation) {
        $line->{Extra}->{ENSP} = $t->translation->stable_id;
    }

To:

    # protein ID
    if(defined $config->{protein} && $tv->transcript->translation) {
        my $protein_feature_analysis = get_protein_domains($tv);
        if ($protein_feature_analysis){
            chomp $protein_feature_analysis;
            $line->{Extra}->{ENSP} = $protein_feature_analysis;
        }
    }

I also included a sub to get a bit more detail about the protein domains it overlaps:

sub get_protein_domains{
    my $tv = shift;

    # protein ID
    my $translation_id   = $tv->transcript->translation->stable_id;
    my %protein_features = ();
    my $pfeatures        = $tv->transcript->translation->get_all_ProteinFeatures();

    foreach my $pfeature (@{$pfeatures}){
        my $logic_name = $pfeature->analysis()->logic_name();
        if ($pfeature->start >= $tv->transcript->translation->start
            && $pfeature->end <= $tv->transcript->translation->end){
            $protein_features{$logic_name}{ENSP}        = $translation_id || "-";
            $protein_features{$logic_name}{interpro_ac} = $pfeature->interpro_ac() || "-";
            $protein_features{$logic_name}{idesc}       = $pfeature->idesc() || "-";
            $protein_features{$logic_name}{start}       = $pfeature->start;
            $protein_features{$logic_name}{end}         = $pfeature->end;
        }
    }

    my $protein_feature_analysis = $translation_id;

    for my $analysis ( keys %protein_features ){
        $protein_feature_analysis .= ":".$analysis;
        $protein_feature_analysis .= ",".$protein_features{$analysis}{interpro_ac};
        $protein_feature_analysis .= ",".$protein_features{$analysis}{idesc};
        $protein_feature_analysis .= ",".$protein_features{$analysis}{start};
        $protein_feature_analysis .= ",".$protein_features{$analysis}{end};
    }

    return $protein_feature_analysis;
}
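
To make the intended output concrete, here is a minimal standalone sketch of the string the sub is meant to build: the translation ID, followed by one ":analysis,interpro_ac,idesc,start,end" group per analysis. It does not use the Ensembl API. The helper name format_protein_features and all the values are made up for illustration, and the keys are sorted here only so that the output order is stable.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of VEP): formats already-collected feature
# data in the same shape as the loop in get_protein_domains.
sub format_protein_features {
    my ($translation_id, $features) = @_;

    my $out = $translation_id;

    # Sorted for a reproducible order; the original loop uses plain keys().
    for my $analysis (sort keys %{$features}) {
        my $f = $features->{$analysis};
        $out .= ':' . join(',', $analysis, @{$f}{qw(interpro_ac idesc start end)});
    }

    return $out;
}

# Made-up example values, for illustration only.
my %features = (
    Pfam  => { interpro_ac => 'IPR_EXAMPLE', idesc => 'Example domain', start => 10, end => 90 },
    Smart => { interpro_ac => '-',           idesc => '-',              start => 12, end => 88 },
);

print format_protein_features('ENSP_EXAMPLE', \%features), "\n";
```

Because %protein_features is keyed on the logic name, only one feature per analysis is kept; if a translation has several features from the same analysis, later ones overwrite earlier ones.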

Unfortunately, this annotation only seems to work for the mitochondrial chromosome.
Could you point out what I might be doing wrong?

Best regards

Duarte Molha
