An Online Informatics Resource for Dictyostelium

dictyBase Help: BLAST Searches


Description

BLAST stands for Basic Local Alignment Search Tool and was developed by Altschul et al. (1990). It is a very fast search algorithm that is used to search protein or DNA databases for sequence similarity. A fairly complete on-line guide to BLAST searching can be found at the NCBI BLAST Help Manual.

BLAST searches offered by dictyBase allow users to compare any query sequence to D. discoideum sequence data sets. To search any other (non-Dicty) data sets, NCBI BLAST can be used.

Using BLAST

dictyBase offers these five BLAST programs to accommodate different types of searches:

  1. BLASTN compares a nucleotide query sequence against a nucleotide sequence dataset.
  2. BLASTP compares an amino acid query sequence against a protein sequence dataset.
  3. BLASTX compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence dataset.
  4. TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence dataset.
  5. TBLASTN compares a protein query sequence against a nucleotide sequence dataset dynamically translated in all six reading frames (both strands).
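
To make "six-frame translation" concrete, here is a minimal Python sketch of what BLASTX, TBLASTX, and TBLASTN conceptually do to a nucleotide sequence before comparing it at the protein level. It assumes the standard genetic code and is not the code BLAST itself uses; the function names and the frame labels (`+1` to `-3`) are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands in all three reading frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCC")
print(frames["+1"])  # forward strand, frame 1
print(frames["-1"])  # reverse strand, frame 1
```

Each of the six resulting protein strings is then treated as a separate query (or dataset entry) in the comparison.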

Databases

    Chromosomal DNA

    The entries in this database are the full-length chromosomes in dictyBase. In addition to chromosomes 1, 2, 3, 4, 5, 6, and M (mitochondrial), this includes 'floating contigs', which are long stretches of DNA that have been sequenced but have not yet been fit into an assembly.

    Primary Features

    These databases aim to provide the best-quality sequence available for a given gene. A primary feature corresponds to the best available sequence for a gene, chosen in the following priority order:

    1. Curated Model
    2. Sequencing Center Gene Predictions
    3. GenBank Record
    In other words, if Gene A has a curated model, that is the sequence you get (you do not get a gene prediction for this gene, nor any corresponding GenBank records, in this database). If Gene B is a sequencing center gene prediction that is not linked to a known sequenced gene, you get the gene prediction sequence. If Gene C is a gene from GenBank that has not been mapped to a gene prediction, you get the GenBank record.
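
The priority rule above can be sketched in a few lines of Python. This is only an illustration of the selection logic as described here, not dictyBase's implementation; the source keys (`curated_model`, `gene_prediction`, `genbank_record`) and the function name are hypothetical.

```python
# Sources in decreasing priority, following the order listed above.
# The key names are hypothetical labels chosen for this sketch.
PRIORITY = ["curated_model", "gene_prediction", "genbank_record"]


def primary_feature(available):
    """Pick the highest-priority sequence available for one gene.

    `available` maps a source name to its sequence; sources that do not
    exist for the gene are simply absent (or map to an empty value).
    Returns a (source, sequence) tuple, or None if nothing is available.
    """
    for source in PRIORITY:
        sequence = available.get(source)
        if sequence:
            return source, sequence
    return None


# Gene A: a curated model exists, so it wins over everything else.
gene_a = {"curated_model": "ATGAAA", "gene_prediction": "ATGAAG"}
# Gene C: only a GenBank record exists.
gene_c = {"genbank_record": "ATGCCC"}

print(primary_feature(gene_a))
print(primary_feature(gene_c))
```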

    Sequencing Center Gene Predictions

    The entries in these databases are gene predictions sent to us by the Genome Consortium at the Wellcome Trust, Sanger Institute. In the case of the mitochondrial DNA, the models were obtained from GenBank. Sometimes these might be the result of predictions run by an individual center if there is an update which has not made it through the Sanger pipeline, but has made it into dictyBase.

    Curated Models

    The entries in these databases are manually curated by dictyBase. This database contains a relatively small subset of dicty genes, since only genes with certain types of evidence are ever upgraded to curated models.

    GenBank mRNA and Genomic Fragment Records

    These databases contain records in dictyBase corresponding to GenBank mRNA records or GenBank genomic DNA records. They do not include EST records (see below) or genome records for long stretches of genomic DNA (mitochondrial DNA, chromosome 2 contigs).

    EST Records

    This database contains all EST sequences obtained from GenBank.

In addition there is a button 'BLAST at NCBI' that links out directly to NCBI BLAST with the sequence pasted into the query window.

Sequences for a BLAST search can be submitted by typing or pasting a sequence into the Query Sequence window. When the BLAST page is accessed from the right menu on a locus page, the sequence of that locus is pasted automatically into the window.

Options

  • The E-Value sets the stringency of a BLAST search. A lower E-value increases the stringency (useful when short and/or very A/T-rich sequences are submitted); a higher E-value decreases it. The default is 0.1, which means that no alignment with an E-value higher than 0.1 is displayed.
  • The Number of alignments to show determines how many alignments are displayed.
  • The default Word Size is 11 nucleotides for DNA and 3 amino acids for Proteins. Increasing the Word Size increases the minimal length of an identical match required.
  • The Matrix is the substitution matrix used to score protein alignments. A BLOSUM matrix assigns a probability score for each position in an alignment, based on the frequency with which that substitution is known to occur among consensus blocks within related proteins. BLOSUM62, the default, is a general-purpose matrix and is among the best of the available matrices for detecting weak similarities. Other supported options are PAM30, PAM70, BLOSUM80, and BLOSUM45. Adjusting the matrix may be in order when searching for very distant relatives of the query.
  • Filtering is ON by default and filters the query sequence for low-complexity regions. In a protein search, low-complexity regions appear as X's in the alignment, while in a nucleotide search they appear as n's. The score and E-value of a match may be affected slightly by filtering, since it effectively shortens the query length. The DUST (nucleotide) and SEG (protein) algorithms are used. For A/T-rich or other repetitive Dictyostelium sequences, turning Filtering OFF might be desirable.
  • The default Gapped Alignment reports the best local alignments and is suitable for most applications. However, an ungapped search may be desirable when hits that align to the entire length of the query are most interesting. An ungapped search can be specified by checking the 'False' option.
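
As a rough illustration of how the E-Value and Number of alignments options interact, the following Python sketch filters a list of hits. The hit dictionaries, the function name, and the `max_alignments` default of 50 are hypothetical and not taken from dictyBase; only the 0.1 E-value default comes from the description above.

```python
def select_alignments(hits, evalue_cutoff=0.1, max_alignments=50):
    """Keep hits at or below the E-value cutoff, best (lowest) first.

    `hits` is a list of dicts with at least an 'evalue' key. The value
    50 for `max_alignments` is an arbitrary placeholder for this sketch.
    """
    kept = [hit for hit in hits if hit["evalue"] <= evalue_cutoff]
    kept.sort(key=lambda hit: hit["evalue"])
    return kept[:max_alignments]


# Hypothetical hits, for illustration only.
hits = [
    {"id": "hit_a", "evalue": 1e-20},
    {"id": "hit_b", "evalue": 0.5},   # above the 0.1 default: dropped
    {"id": "hit_c", "evalue": 0.01},
]

for hit in select_alignments(hits):
    print(hit["id"], hit["evalue"])
```

Lowering `evalue_cutoff` makes the search more stringent (fewer hits survive), while `max_alignments` only truncates the list that is displayed.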

Accessing the BLAST Search Page

BLAST can be accessed by selecting the BLAST link on the menu bar at the top of all dictyBase pages, or through the "Sequence Analysis Tools" on the Locus Page. If BLAST is accessed through a specific locus page, the sequence of that locus is automatically filled into the Query Sequence window.
