[ensembl-dev] Source of refseq data in otherfeatures

Rishi Nag rn6 at sanger.ac.uk
Mon Jun 10 11:24:15 BST 2013


Hi Reece

> On Tue, Jun 4, 2013 at 6:41 AM, Thibaut Hourlier <th3 at sanger.ac.uk>  
> wrote:
>> Just to add to Rishi's answer: only the sequences from RefSeq which
>> align to the genome of interest are stored in the otherfeatures database
>
> Do you have details about how refseq data are loaded?

We use the web front end to download the RefSeq EST and selected cDNA data
for a particular species. Once these are downloaded, the alignments are run,
and only the sequences that align are present in the Ensembl database.

Here is the procedure.

EST sequences

Go to the NCBI website at http://www.ncbi.nlm.nih.gov
In the search bar, select 'EST' as the database and specify the scientific
name of your species as the organism query term, e.g.
"Gallus gallus"[Organism]
Under the 'Send To' menu, choose File/FASTA/Default Ordering and save the  
generated file
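
For anyone who would rather script this step than click through it, here is a
rough sketch. It is not the procedure described above, which uses the web
front end. It assumes NCBI's E-utilities esearch endpoint as a stand-in, and
the function names and the database name "nucest" are illustrative
assumptions; check which database name NCBI currently accepts for ESTs before
relying on it. The code only builds the query term and the search URL, it
does not download anything.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed stand-in for the web search bar)
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def organism_term(species):
    """Build the organism query term, e.g. "Gallus gallus"[Organism]."""
    return '"{}"[Organism]'.format(species.strip())


def esearch_url(species, db):
    """Build an esearch URL for all records of one species in one database.

    `db` is whatever Entrez database name applies; "nucest" below is an
    assumption, not something taken from the procedure above.
    """
    params = {"db": db, "term": organism_term(species), "usehistory": "y"}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(organism_term("Gallus gallus"))
    print(esearch_url("Gallus gallus", "nucest"))
```

The records found this way would still need to be fetched as FASTA (the
equivalent of Send To / File / FASTA) before any alignment is run.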

cDNA sequences

Go to the NCBI website at http://www.ncbi.nlm.nih.gov
In the search bar, select 'Nucleotide' as the database and specify the
scientific name of your species as the organism query term, e.g.
"Gallus gallus"[Organism]
Under the search box at the top of the page click on the 'Limits' link
In the 'Exclude' section tick the boxes STSs, working draft, EPA and Patents
In the 'Molecule' section select mRNA
Click Search
Under the 'Send To' menu, choose File/FASTA/Default Ordering and save the  
generated file
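
The Limits selections above amount to extra terms on the search query. As a
hedged sketch only (again, not the procedure itself), the snippet below
composes such a term as a plain string. The function name is hypothetical,
"biomol_mrna[PROP]" is assumed to correspond to the mRNA molecule limit, and
the exclusion filters in the example are assumptions to be verified against
what the NCBI search details box actually shows after applying the limits by
hand.

```python
def cdna_term(species, exclude_filters=()):
    """Compose an Entrez-style query term for mRNA records of one species.

    `exclude_filters` holds filter expressions to exclude; the caller
    supplies them, since the exact filter names are not given in the
    procedure above.
    """
    # Organism restriction plus the (assumed) mRNA molecule-type filter
    term = '"{}"[Organism] AND biomol_mrna[PROP]'.format(species.strip())
    # Append one NOT clause per excluded category
    for flt in exclude_filters:
        term += " NOT " + flt
    return term


if __name__ == "__main__":
    # Example exclusion filters: assumed names for STSs and patents
    print(cdna_term("Gallus gallus", ["gbdiv_sts[PROP]", "gbdiv_pat[PROP]"]))
```

Comparing the printed term with the query that the web interface reports is
a quick way to confirm whether the assumed filter names match.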


Regards

Rishi


