[ensembl-dev] question variation feature and overlapping/neighbouring gene

Anja Thormann anja at ebi.ac.uk
Mon May 20 19:04:34 BST 2013


Hello Nathalie,

If you iterate over a list of VariationFeatures for a given Slice you can get Gene information using
the get_nearest_Gene or get_overlapping_Genes methods on a VariationFeature. Please note that both methods return listrefs of Gene objects.
We need to update the documentation for get_nearest_Gene.

my @genes = @{$vfh->get_nearest_Gene()};
foreach my $gene (@genes) {
    print 'Nearest gene: ', $gene->external_name, "\t", $gene->stable_id, "\n";
}

my @overlapping_genes = @{$vfh->get_overlapping_Genes()};
foreach my $gene (@overlapping_genes) {
    print 'Overlapping gene: ', $gene->external_name, "\t", $gene->stable_id, "\n";
}

If you made a decision on which Gene you choose you can get all Transcripts for that gene by using
the method get_all_Transcripts() on a Gene object. You should be aware that a Gene can have more than
one transcript.

my $transcripts = $gene->get_all_Transcripts();
foreach my $transcript (@{$transcripts}) {
    print $transcript->stable_id, "\n";
}

The method get_all_TranscriptVariations on a VariationFeature object returns TranscriptVariation objects.
A TranscriptVariation relates a VariationFeature and a Transcript which either overlaps the VariationFeature
or is in close proximity to a VariationFeature. You can retrieve the transcript stable id from a TranscriptVariation
as well:
my $transcript_variations = $vfh->get_all_TranscriptVariations;
foreach my $tv (@{$transcript_variations }) {
    my $id = $tv->transcript->stable_id();
}
A TranscriptVariation has many more attributes and methods which are documented here:
http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1TranscriptVariation.html

Regards,
Anja

On 20 May 2013, at 14:25, Nathalie Conte wrote:

> HI, 
>  I would require some guidance please-
>  I would like from a slice object containing several variation features, to retrieve all variation object and additional information (overlapping gene or ), and print them in a OUT file like this, 
> name of variation:rs12114605	alles:C/T	start:55749184	end:55749184	consequence:INTRONIC		transcript_name:ENST00000521465	gene_name:ENSG00000254608	gene_extname:RP11-56A10.1
> 
> Starting from the variation and associated annotation and retrieving  overlapping gene or neighbouring information- I had a look in the variation feature object and it seems from the variation we can retrieve gene content and neighbouring genes with 2 methods below, although I didn't managed to use them from my variation feature-
> public Bio::EnsEMBL::Gene 	get_nearest_Gene ()
> public List 	get_overlapping_Genes ()
> 
> So I chose to get all the transcripts corresponding to my variation using get_all_TranscriptVariations method (see below), but ovisouly I am missing neighbour gene info. Is this the best way to interrogate the genetic (or neighbouring gene?) content of the variation?
> thanks
> Nathalie
> 
> 
> #variation
> my $human_slice_adaptor = Bio::EnsEMBL::Registry->get_adaptor('Human', 'Core', 'Slice');
> my $human_query_slice = $human_slice_adaptor->fetch_by_region('chromosome',$genomic_align->dnafrag->name,$genomic_align->dnafrag_start,$genomic_align->dnafrag_end);
> 
> ###  get all variations in human slice
> my $vfh_adaptor = Bio::EnsEMBL::Registry->get_adaptor('homo_sapiens', 'variation', 'variationfeature'); #get adaptor to VariationFeature object
> my $gene_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human', 'core', 'gene');
> 
> my $vfsh = $vfh_adaptor->fetch_all_by_Slice($human_query_slice); #return ALL variations defined in $slice
> 
> foreach my $vfh (@{$vfsh}){ 
> 	print   OUT join "\t", $vfh->variation_name, $vfh->allele_string,  $vfh->seq_region_start,$vfh->seq_region_end,$vfh->most_severe_OverlapConsequence->display_term, "\t";	;#this is where i need to check the variation gene content
> #get transcripts corresponding to the variation features. 
> 	my $transcripts=$vfh->get_all_TranscriptVariations;	
> 		foreach my $transcript (@{$transcripts}) {
> if (!@{$transcripts}){print 'no_gene',"\t";}
> 		print OUT 'transcript',"\t",my $id=$transcript->transcript_stable_id,"\t";
> 		my $gene = $gene_adaptor->fetch_by_transcript_stable_id($id);
>      		print OUT $gene->stable_id, "\t",$gene->external_name,"\n";
> 		      		#print OUT $gene->stable_id if (defined $gene->external_name),"\t",$gene->external_name,"\n";
> 		}

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20130520/9a1bb2e3/attachment.html>


More information about the Dev mailing list