[ensembl-dev] why would a snp have multiple consequences in the same transcript

Andreas Kahari ak at ebi.ac.uk
Fri Nov 19 15:54:49 GMT 2010

On Fri, Nov 19, 2010 at 03:22:21PM +0000, Andrea Edwards wrote:
> Hi
> I think I have asked this question before but I can't find the
> answer in my archive of answers so I'm really really sorry about
> this.
> Why would a SNP have multiple consequences in a single transcript?
> This code returns an array of consequences:
> my @tvs = @{$vf->get_all_TranscriptVariations};
>  foreach my $tv (@tvs) {
>     my @consequences = @{$tv->consequence_type};
> where $vf is a variation feature and $tv is a transcript variant.
> I thought perhaps this could be a convention issue with the API, as
> you generally return array references from functions.
> I understand a SNP could have different consequences in the same
> gene as it might have a different impact on each splice variant, but
> how can it have multiple consequences in a single transcript?
> I've looked at the consequence types and they do appear to be
> mutually exclusive. At best something could be both synonymous and
> splice site (or non-synonymous and splice site) if it occurs in the
> first/last few bases of an exon.
> I apologise for the duplicate question.
> Thanks in advance for your help

Not knowing much about the theory behind the variation data but just
looking at the latest released variation database for human, we have:

(picking and counting the transcript variation features that have more
than one consequence, i.e. a consequence_type with a comma in it)
mysql> select count(1), consequence_type from transcript_variation where consequence_type like "%,%" group by consequence_type;
+----------+--------------------------------------------------+
| count(1) | consequence_type                                 |
+----------+--------------------------------------------------+
|      128 | STOP_GAINED,FRAMESHIFT_CODING                    |
|      263 | STOP_GAINED,SPLICE_SITE                          |
|       47 | COMPLEX_INDEL,SPLICE_SITE                        |
|     1609 | FRAMESHIFT_CODING,SPLICE_SITE                    |
|     6187 | NON_SYNONYMOUS_CODING,SPLICE_SITE                |
|     3496 | SPLICE_SITE,SYNONYMOUS_CODING                    |
|     1263 | SPLICE_SITE,5PRIME_UTR                           |
|      349 | SPLICE_SITE,3PRIME_UTR                           |
|    31937 | ESSENTIAL_SPLICE_SITE,INTRONIC                   |
|    57738 | SPLICE_SITE,INTRONIC                             |
|      607 | STOP_GAINED,NMD_TRANSCRIPT                       |
|        1 | STOP_LOST,NMD_TRANSCRIPT                         |
|        5 | COMPLEX_INDEL,NMD_TRANSCRIPT                     |
|     2076 | FRAMESHIFT_CODING,NMD_TRANSCRIPT                 |
|     7647 | SYNONYMOUS_CODING,NMD_TRANSCRIPT                 |
|     5016 | 5PRIME_UTR,NMD_TRANSCRIPT                        |
|       33 | SPLICE_SITE,5PRIME_UTR,NMD_TRANSCRIPT            |
|    46181 | 3PRIME_UTR,NMD_TRANSCRIPT                        |
|      434 | SPLICE_SITE,3PRIME_UTR,NMD_TRANSCRIPT            |
|  2213159 | INTRONIC,NMD_TRANSCRIPT                          |
|     4024 | SPLICE_SITE,INTRONIC,NMD_TRANSCRIPT              |
+----------+--------------------------------------------------+
29 rows in set (0.00 sec)

So there are quite a lot of variations that have more than one type of
consequence in a transcript.
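
To see how often each individual term takes part in a combination, the
comma-separated consequence_type values can be split and the counts
summed per term. The following is only a minimal Python sketch: it uses
a handful of the rows from the output above as sample data, and the
function name tally_terms is made up for illustration.

```python
from collections import Counter

# Sample (count, consequence_type) pairs copied from the query output above.
ROWS = [
    (128, "STOP_GAINED,FRAMESHIFT_CODING"),
    (263, "STOP_GAINED,SPLICE_SITE"),
    (6187, "NON_SYNONYMOUS_CODING,SPLICE_SITE"),
    (3496, "SPLICE_SITE,SYNONYMOUS_CODING"),
    (33, "SPLICE_SITE,5PRIME_UTR,NMD_TRANSCRIPT"),
]


def tally_terms(rows):
    """Sum the row counts for each individual consequence term."""
    totals = Counter()
    for count, consequence_type in rows:
        # A combined value such as "A,B" contributes its count to A and to B.
        for term in consequence_type.split(","):
            totals[term] += count
    return totals


totals = tally_terms(ROWS)
for term, n in totals.most_common():
    print(f"{n:>6}  {term}")
```

With these five sample rows, SPLICE_SITE is by far the most frequent
partner term, which fits the point in the original question that a
coding change near an exon boundary can also be a splice site change.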

Let's look at one group of these, the four variations with consequence
type STOP_GAINED,FRAMESHIFT_CODING,NMD_TRANSCRIPT:

mysql> select tv.transcript_stable_id, vf.allele_string, vf.variation_name from transcript_variation tv join variation_feature vf using (variation_feature_id) where tv.consequence_type = 'STOP_GAINED,FRAMESHIFT_CODING,NMD_TRANSCRIPT';
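+----------------------+---------------+----------------+
| transcript_stable_id | allele_string | variation_name |
+----------------------+---------------+----------------+
| ENST00000458701      | C/A/T/-/G     | rs41556120     |
| ENST00000426590      | C/A/T/-       | rs41559415     |
| ENST00000466779      | G/A/-         | rs6474         |
| ENST00000469053      | G/A/-         | rs6474         |
+----------------------+---------------+----------------+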
4 rows in set (0.00 sec)

So, a SNP can obviously have more than one consequence because it does
not necessarily provide only one other possible base in the given
position.
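
The allele strings above make this concrete: each lists several
alternatives, including "-" for a deleted base. A rough Python sketch
of splitting them up follows. It assumes the first allele in the string
is the reference allele, and the function name and the labels it
assigns are illustrative only, not Ensembl consequence terms.

```python
def describe_alleles(allele_string):
    """Split 'REF/ALT1/ALT2/...' and label each alternate allele.

    Assumption: the first allele in the string is the reference allele.
    """
    ref, *alts = allele_string.split("/")
    described = []
    for alt in alts:
        if alt == "-":
            kind = "deletion"        # base removed
        elif len(alt) == len(ref):
            kind = "substitution"    # base swapped for another
        else:
            kind = "length change"   # any other insertion or deletion
        described.append((alt, kind))
    return ref, described


# Allele strings taken from the query output above.
for name, alleles in [("rs41556120", "C/A/T/-/G"), ("rs6474", "G/A/-")]:
    ref, described = describe_alleles(alleles)
    summary = ", ".join(f"{ref}>{alt} ({kind})" for alt, kind in described)
    print(f"{name}: {summary}")
```

For rs6474 this gives one substitution and one deletion. Presumably the
substitution is what can introduce a stop codon while the single-base
deletion shifts the reading frame, so both consequences end up attached
to the same variation in the same transcript.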


Andreas Kähäri, Ensembl Software Developer
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD, United Kingdom
