[ensembl-dev] Retrieve sequence coverage data for strain (resequencing results)

Wed Oct 10 12:04:05 BST 2012

Hi Jason,

The seq() method on strain_slice should serve you fine - this is all
that the web site is using to generate those views (along with methods
relating to the variants to provide clickable links for them).

If you add the "with coverage" flag, i.e. call $strain_slice->seq(1),
it will also take read coverage into account.
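
As a rough sketch of the above (untested against release 65, so treat it
as a starting point rather than a definitive recipe), the script below
fetches the strain sequence with and without the coverage flag and counts
how many positions differ. The connection details (ensembldb.ensembl.org,
anonymous user) and the diff_positions() helper are assumptions made for
illustration; diff_positions() is not part of the Ensembl API. It also
assumes both calls return strings of the same length and stops if not.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): return the 0-based
# positions at which two equal-length sequence strings differ.
sub diff_positions {
    my ($seq_a, $seq_b) = @_;
    die "sequences differ in length\n" if length($seq_a) != length($seq_b);
    return grep { substr($seq_a, $_, 1) ne substr($seq_b, $_, 1) }
           0 .. length($seq_a) - 1;
}

# Query the database only when a region and strain are given, e.g.
#   perl strain_cov.pl 22 24322339 24326106 AK1
if (@ARGV == 4) {
    my ($chrom, $start, $end, $strain) = @ARGV;

    # Loaded at run time so the helper above works without the API installed.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';

    # Assumed connection details; point these at the database you use.
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $sa    = $registry->get_adaptor('human', 'core', 'slice');
    my $slice = $sa->fetch_by_region('chromosome', $chrom, $start, $end);

    # Use a fresh strain slice per call, as a precaution in case the
    # object keeps any state between seq() calls.
    my $plain   = $slice->get_by_strain($strain)->seq();   # variants only
    my $covered = $slice->get_by_strain($strain)->seq(1);  # with coverage

    my @changed = diff_positions($plain, $covered);
    printf "%d of %d positions change when coverage is considered\n",
        scalar @changed, length $plain;
}
```

The positions reported are those where the coverage-aware sequence departs
from the plain one, which should correspond to the regions the browser
shows as missing for that individual.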

Cheers

Will McLaren
Ensembl Variation

On 9 October 2012 20:48, Jason Caravas <jacaravas at gmail.com> wrote:
> Hello,
>
> I'm attempting to build alignments of resequencing results from the 1000
> genomes project.  The end result I desire should closely resemble this view
> from the genome browser
> http://browser.1000genomes.org/Homo_sapiens/Location/SequenceAlignment?db=core;g=ENSG00000099984;gene=ENSG00000099984;r=22:24322339-24326106
> I am using Perl API version 65 to match the database.
>
> How do I obtain information on whether sequence data exists for a particular
> strain?  Using code like this
>
> my $sa = $registry->get_adaptor("human", "core", "slice");
> my $slice = $sa->fetch_by_region('chromosome', $chrom, $start, $end);
> my $str_slice = $slice->get_by_strain("AK1");
> my $indiv_seq = $str_slice->seq();
>
> simply returns the reference sequence for the region even though the online
> genome browser shows missing data for that particular individual.  It also
> doesn't complement variations on the -1 strand, so I'm assuming the seq()
> method is just a bulk application of SeqEdit::apply_edit() or
> AlleleFeature::apply_edit() methods to the reference sequence rather than a
> function to actually retrieve strain sequence data.
>
> So my question is twofold.  First, is going through the variations API and
> adding allele features to the reference the correct approach for
> reconstructing strain data?  Or is there a pre-existing method to do this
> that I am not aware of?  I am relatively new to the API and the presence of
> a view identical to my desired result in the online genome browser suggests
> that this could be the case.  Second, if this is the correct approach and I
> have to remove missing sequence data from my alignment myself, how do I
> determine where the missing data is for a given strain?
>
> Thanks in advance,
> Jason Caravas