[ensembl-dev] VEP distance cutoff

Kieron Taylor ktaylor at ebi.ac.uk
Mon Feb 17 16:24:13 GMT 2014


Hi,

I'm working on an implementation that is still distinctly beta. The
method will reside in our BaseFeatureAdaptor, so it will be available in
any conventional adaptor, e.g. GeneAdaptor or RegulatoryFeatureAdaptor.

If the code turns out really well and withstands severe testing, I can 
release it as a hotfix, but its first stable release should be in 
Ensembl 76 later this year.


Kieron

-- 
Kieron Taylor PhD.
Ensembl Core team
EBI

On 17/02/2014 16:14, Genomeo Dev wrote:
> Hi Nathan,
>
> That is very interesting. How far along are you with this method? Does
> it have a name yet? For linking Genes to RegulatoryFeatures, what data
> is this method based on?
>
> G.
>
>
> On 17 February 2014 13:08, njohnson <njohnson at ebi.ac.uk
> <mailto:njohnson at ebi.ac.uk>> wrote:
>
>     Incidentally, our core software team is currently working on a
>     method to associate any feature with a nearest gene.  In future this
>     is likely to be used as part of a strategy to make
>     Gene-RegulatoryFeature links.
>
>     Nathan Johnson
>
>     Ensembl Regulation
>     European Bioinformatics Institute (EMBL-EBI)
>     European Molecular Biology Laboratory
>     Wellcome Trust Genome Campus
>     Hinxton
>     Cambridge CB10 1SD
>     United Kingdom
>
>     http://www.ensembl.info/
>     http://twitter.com/#!/ensembl
>     https://www.facebook.com/Ensembl.org
>
>     On 17 Feb 2014, at 12:36, Will McLaren <wm2 at ebi.ac.uk
>     <mailto:wm2 at ebi.ac.uk>> wrote:
>
>      > Yes, your assumption is correct - currently we do not carry any
>     data linking regulatory features to specific genes, for exactly the
>     reasons you state.
>      >
>      > Will
>      >
>      >
>      > On 17 February 2014 12:31, Genomeo Dev <genomeodev at gmail.com
>     <mailto:genomeodev at gmail.com>> wrote:
>      > Thanks.
>      >
>      > With regard to the --regulatory option: having run VEP with 1000G
>     variants using this option, I found that whenever a variant is
>     predicted to have Feature type RegulatoryFeature or MotifFeature,
>     the output has no entry in the GENE or SYMBOL columns and, unlike
>     for variants with Feature type Transcript, CELL_TYPE is populated.
>      >
>      > Is this a reflection of the fact that current data in the Ensembl
>     regulatory build are (1) cell-type specific and (2) derived from
>     genome-wide profiling experiments, which don't associate regulatory
>     regions with the actual genes whose expression is being regulated?
>      >
>      > Thanks,
>      >
>      > G.
>      >
>      >
>      >
>      >
>      >
>      > On 17 February 2014 10:18, Will McLaren <wm2 at ebi.ac.uk
>     <mailto:wm2 at ebi.ac.uk>> wrote:
>      > Hello,
>      >
>      > The VEP looks at +/- 5KB either side of each transcript's start
>     and end coordinates; these coordinates are inclusive of any UTR
>     regions. A gene is defined by the furthest-reaching 5' and 3'
>     coordinates of the transcripts in that gene. You might find the
>     diagram on this page useful:
>      >
>      >
>     http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences
>      >
>      > The VEP separately annotates regulatory regions (in human and
>     mouse at least) as determined by the Ensembl regulatory build; to
>     enable this just add --regulatory to your VEP command.
>      >
>      > http://www.ensembl.org/info/genome/funcgen/regulatory_build.html
>      >
>      >
>     http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_regulatory
>      >
>      > Regards
>      >
>      > Will McLaren
>      > Ensembl Variation
>      >
>      >
>      > On 17 February 2014 10:04, Genomeo Dev <genomeodev at gmail.com
>     <mailto:genomeodev at gmail.com>> wrote:
>      > Hi,
>      >
>      > Thanks for the response. For a given variant, I assume VEP looks
>     at the interval [-5KB, +5KB] and assigns as neighbours any genes
>     which overlap with that region. How are genes defined in this case?
>     Does VEP look only for overlapping TSS or the entire TSS<->TES region?
>      >
>      > How about variants which are documented in the literature to
>     occur in enhancers which are say 1 MB from the target gene? Do these
>     get taken into account on top of the 5KB rule?
>      >
>      > Thanks,
>      >
>      > G.
>      >
>      >
>      >
>      > On 5 February 2014 09:52, Genomeo Dev <genomeodev at gmail.com
>     <mailto:genomeodev at gmail.com>> wrote:
>      > Thanks very much.
>      >
>      > G.
>      >
>      >
>      > On 4 February 2014 22:02, Will McLaren <wm2 at ebi.ac.uk
>     <mailto:wm2 at ebi.ac.uk>> wrote:
>      > Hello,
>      >
>      > The default cutoff is 5000 bases.
>      >
>      > There is no parameter in the VEP itself, but there is a plugin
>     available that can be used to change the parameter.
>      >
>      >
>     https://github.com/ensembl-variation/VEP_plugins/blob/master/UpDownDistance.pm
>      >
>      > http://www.ensembl.org/info/docs/tools/vep/script/vep_plugins.html
>      >
>      > Regards
>      >
>      > Will McLaren
>      > Ensembl Variation
>      >
>      >
>      > On 4 February 2014 17:48, Genomeo Dev <genomeodev at gmail.com
>     <mailto:genomeodev at gmail.com>> wrote:
>      > Hi,
>      >
>      > Using VEP in Ensembl VM v74
>      >
>      > 1. I was wondering what distance cutoff VEP uses to assign
>     neighbouring genes to input variants.
>      > 2. Is there a parameter to handle that?
>      >
>      > Thanks,
>      >
>      > Genomeo
>      >
>      >
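
The 5 kb rule described in the thread above can be illustrated with a
short Python sketch. This is not VEP code: the Transcript class, the
function names and the example coordinates are all hypothetical, and
treating the window boundary as inclusive is an assumption. Strand is
ignored, so the sketch does not distinguish upstream from downstream.

```python
from dataclasses import dataclass

# Default cutoff quoted in the thread: 5000 bases either side.
FLANK = 5000


@dataclass(frozen=True)
class Transcript:
    """Hypothetical stand-in for a transcript; not an Ensembl API class."""
    gene: str
    start: int  # 1-based, inclusive, UTRs included
    end: int    # 1-based, inclusive, UTRs included


def gene_span(transcripts):
    """Gene extent: the furthest-reaching start and end of its transcripts."""
    return (min(t.start for t in transcripts),
            max(t.end for t in transcripts))


def transcripts_in_range(pos, transcripts, flank=FLANK):
    """Transcripts whose extent, padded by `flank` on both sides, contains pos.

    Assumption: the padded boundary itself counts as in range.
    """
    return [t for t in transcripts
            if t.start - flank <= pos <= t.end + flank]


# Made-up coordinates, for illustration only.
EXAMPLE = [
    Transcript("GENE_A", 10_000, 12_000),
    Transcript("GENE_A", 11_000, 15_000),
    Transcript("GENE_B", 40_000, 42_000),
]
```

With these made-up coordinates, a variant at position 17,500 falls within
5 kb of the second GENE_A transcript only (its padded extent is
6,000-20,000), and a variant at 30,000 is reported against nothing.
Passing a different flank value mimics what the UpDownDistance plugin
mentioned above is for.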






