Nomenclature Guidelines

[Summary]

Dictyostelium discoideum has a genome of approximately 34 Mb, containing approximately 12,500 genes. Thousands of mutant strains have been obtained, many of which are available from the Dicty Stock Center. A uniform nomenclature is essential to compile all the available information and provide easy access for the research community. We therefore encourage researchers to conform to the following guidelines for naming Dictyostelium genes, proteins, mutant alleles, strains, phenotypes, genotypes, plasmids, molecular genetic constructs, and chromosomes.

See the Procedure for naming genes for more details.

Questions and comments should be addressed to dictybase@northwestern.edu.



Gene Nomenclature Guidelines

Dictyostelium gene names should be in lower-case, italicized letters, followed, when necessary, by a capital italicized letter or a number to distinguish genes sharing a prefix.

We strongly discourage using "D", "d", or "Dd" for Dictyostelium, and "g" and "p" for "gene" and "protein", as these abbreviations are not informative.

  1. Established nomenclature for gene families has precedence over other names.
    Examples: abcA1, abcA2, abcB1, atg1, atg4. This includes Dictyostelium homologs of human genes for which a nomenclature has been established by the HGNC, or for established gene names of non-mammalian homologs, usually if there are no mammalian counterparts or if the nomenclature is better established in non-mammalian organisms.
    • 1:1 human orthologs are named using the established human name, e.g. eif3f, nmd3, dgat1.
    • Many-to-one homologs: If Dictyostelium has only one gene that is similar to a group of human genes, this gene is named dropping the numbers or letters that distinguish the group members. For example, human has PCBD1 and PCBD2, and the single Dictyostelium gene is named pcbd.
    • One-to-many homologs: If Dictyostelium has an expanded group of genes compared to other organisms with established nomenclature, additional letters or numbers may be added to distinguish the members. For example, while human has two cytochrome b5 genes and yeast has one, Dictyostelium has 3 family members, named cyb5A-C.
    • Many-to-many homologs: It is possible to mix letter and number suffixes to distinguish cases where some members of a family are orthologs and others are Dictyostelium-specific paralogs; for example, rab21, rab7A, and rab7B are orthologs of mammalian RAB21 and RAB7, respectively, while rabH does not correspond to any mammalian RAB.
    • Exceptions where human nomenclature is discouraged are when the human name makes no sense for Dictyostelium, for example OPA3 (Optic Atrophy 3), or TEX2 (Testis Expressed 2). As more functional data becomes available, HGNC as well as UniProtKB are working to improve nomenclature.
    If it is not possible to identify an appropriate name, for the time being leave the DDB_G identifier as the gene name.

  2. New gene names, especially for non-conserved genes, should conform to the following convention derived from Demerec et al. (1966): a locus description consisting of three lower-case, italicized letters, followed by a capital italicized letter to distinguish genes with the same descriptor that are related in a significant way.
    Examples: rdeA, rdeB and rdeC; or tagA, tagB and tagC.
    For large gene families (> 26 members), we encourage the use of numbers rather than letters.

  3. Existing gene names remain unchanged.
    Examples: act15, mhcA, and pyr5-6.

  4. Original naming authors can change any gene name by describing the name change in their next publication containing the gene and informing dictyBase.
    Example: There were several publications about the spore coat protein SP70, and even after the gene had been cloned it was still referred to as SP70. However, in the early 1990s the gene name cotB was introduced by Bill Loomis.

  5. All names that are found in the literature will remain in dictyBase as synonyms.
    Example: pkaC has five synonyms: PKA, pkacat, DdPK3, DdPK2, PKA-C.
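
The convention in rule 2 can be checked mechanically. The following is a minimal sketch, not a dictyBase tool: the function name and the regular expression are assumptions made for illustration, and the check covers only the rule 2 pattern (three lower-case letters, optionally followed by one capital letter or a number). Established names kept under rules 1 and 3, such as eif3f or pyr5-6, are expected not to match.

```python
import re

# Assumed pattern for rule 2 only: three lower-case letters, optionally
# followed by a single capital letter or by a number (for large families).
NEW_GENE_NAME = re.compile(r"[a-z]{3}(?:[A-Z]|[0-9]+)?")


def follows_new_name_convention(name: str) -> bool:
    """Return True if `name` matches the rule 2 pattern for new gene names."""
    return NEW_GENE_NAME.fullmatch(name) is not None


for candidate in ["rdeA", "tagC", "act15", "pyr5-6", "eif3f"]:
    print(candidate, follows_new_name_convention(candidate))
```

A name that fails this check is not necessarily wrong; it may simply fall under the precedence given to established family names or existing names.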



Protein Nomenclature Guidelines

  1. A protein may be named after the gene encoding it by capitalizing the first letter and without the use of italics.
    Examples: RegA (encoded by regA), RegA(D212A) (encoded by regA(D212A)).

  2. A protein can also be referred to by its full name or a protein synonym.
    Examples: actin, ribonucleotide reductase small subunit, RNR, protein kinase C, PKC.

  3. Proteins that have been identified by physical properties are sometimes named following this property; however this practice is discouraged because many different proteins can have the same attribute.
    Examples: p34, 34kDa protein, actin-binding protein, calcium-binding protein.
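
Rule 1 amounts to a simple string transformation. As a small sketch (the function name is hypothetical), only the first letter is upper-cased and everything else, including an allele in parentheses, is left untouched. Italics are a matter of typesetting and are not modeled here.

```python
def protein_name(gene_name: str) -> str:
    """Derive a protein name by capitalizing only the first letter."""
    # str.capitalize() is deliberately avoided: it would lower-case the
    # rest of the string and turn "regA" into "Rega".
    return gene_name[:1].upper() + gene_name[1:]


print(protein_name("regA"))
print(protein_name("regA(D212A)"))
```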



Mutant Allele Nomenclature Guidelines

  1. Allele names should be italicized and placed in parentheses following the gene name without a space.
    Examples: yakA(AK235) and yakA(AK800): different insertion mutations in the yakA gene.

  2. Where the nature of the mutation is not known or only a single allele is relevant, a superscript minus sign can be used for brevity.
    Example: regA-.

  3. Superscripts may be used to describe the general nature of an allele but should be limited to two or three letters, e.g. ts (temperature sensitive), cs (cold sensitive), hc (high copy) or dn (dominant negative).
    Examples: regA(AK202ts), regAts.

  4. Amino acid substitutions should be given as the old residue in single-letter code with its codon location followed by the new (mutant) amino acid. The amino acid change is an interpretation of what the gene will produce and refers to a protein so it is not put in italics.
    Example: regA(D212A) has an alanine in place of the aspartate at position 212.

  5. NOTE: AX4 is the reference strain for the “wild-type” amino acid sequence within proteins since it is the first strain that has been sequenced.
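
The allele and substitution notations in rules 1 and 4 are regular enough to parse. The sketch below is illustrative only: the function names, the `Substitution` type, and the regular expressions are assumptions, and superscript forms such as regA- or regAts are not handled.

```python
import re
from typing import NamedTuple, Optional, Tuple

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard single-letter amino acid codes

# gene(allele): allele in parentheses directly after the gene name, no space
ALLELE = re.compile(r"(?P<gene>[^()\s]+)\((?P<allele>[^()\s]+)\)")
# old residue, codon position, new residue, e.g. D212A
SUBSTITUTION = re.compile(rf"(?P<old>[{AA}])(?P<pos>[0-9]+)(?P<new>[{AA}])")


class Substitution(NamedTuple):
    old: str
    position: int
    new: str


def parse_allele(text: str) -> Tuple[str, str]:
    """Split 'gene(allele)' into (gene, allele); raise ValueError otherwise."""
    match = ALLELE.fullmatch(text)
    if match is None:
        raise ValueError(f"not in gene(allele) form: {text!r}")
    return match["gene"], match["allele"]


def parse_substitution(allele: str) -> Optional[Substitution]:
    """Return the substitution described by an allele name, or None."""
    match = SUBSTITUTION.fullmatch(allele)
    if match is None:
        return None
    return Substitution(match["old"], int(match["pos"]), match["new"])


gene, allele = parse_allele("regA(D212A)")
print(gene, parse_substitution(allele))
```

An insertion allele such as AK235 is not mistaken for a substitution, because the pattern requires a single residue letter on each side of the position.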



Strain Nomenclature Guidelines

Strains are annotated with both a Systematic Strain Name and a Strain Descriptor.

  • Systematic Strain Name

  • Strains must have an unambiguous name, consisting of 2 or 3 capital letters plus a unique serial number (e.g. HM1 or HTY217). Labs or workers should always use the same capital prefixes or small group of prefixes. The prefixes are assigned by a clearinghouse upon request. See Appendix 1 for a current list of assigned strain prefixes. In the past all lab designations started with either H (for haploid) or D (for diploid), and many of these strain prefixes are still in use. In cases where no systematic name is provided, dictyBase will automatically assign a systematic strain name consisting of the letters DBS (dictyBase Strain) followed by seven digits.
  • Strain Descriptor

  • The strain descriptor is meant to provide a quick overview of the key genetic modifications that produced the strain, including the gene name, the promoter, the mutations, and tags or reporter genes. The strain descriptor appears on the Gene Page and on the Phenotype and Strain Details page and is meant to give a more descriptive account of strains compared to systematic strain names. The format of the strain descriptor is as follows:

    gene-/[promoter]:gene(substitution or truncation):marker

    where

    -    represents an endogenous mutant allele, typically a null or a functional null mutant
    /    represents a compound mutant
    [ ]  represents the promoter gene
    ( )  indicates a substitution and/or a truncation
    :    represents a fusion between two genes

    Additional annotations may appear in brackets as follows:

    [unk] represents a gene fusion to an unknown promoter
    [OE] designates an overexpressor under the control of an unspecified promoter
    [KD] designates a knockdown strain when the method is unknown
    [AS] designates an antisense strain
    [RNAi] designates an RNA interference strain
    [inviable] precedes strains described as inviable
  • Strain ID

  • All strains curated by dictyBase also have a strain ID consisting of the letters DBS (dictyBase Strain) followed by seven digits. This ID is created automatically and does not change. If there is no other Systematic Strain Name, both the Strain ID and the Systematic Strain Name will be the same.
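
The two name formats described above can be expressed as patterns. This is a sketch under stated assumptions: the function names and regular expressions are illustrative and follow only the wording of this section (2 or 3 capital letters plus a serial number; DBS plus seven digits). Some historical prefixes in Appendix 1 have a single letter and would not pass the first check.

```python
import re

# 2 or 3 capital letters followed by a serial number, e.g. HM1 or HTY217
SYSTEMATIC_NAME = re.compile(r"[A-Z]{2,3}[0-9]+")
# DBS (dictyBase Strain) followed by exactly seven digits
STRAIN_ID = re.compile(r"DBS[0-9]{7}")


def is_systematic_name(name: str) -> bool:
    """Check the '2 or 3 capitals plus serial number' form."""
    return SYSTEMATIC_NAME.fullmatch(name) is not None


def is_strain_id(name: str) -> bool:
    """Check the 'DBS plus seven digits' form of a dictyBase strain ID."""
    return STRAIN_ID.fullmatch(name) is not None


for name in ["HM1", "HTY217", "DBS0000001", "hm1"]:
    print(name, is_systematic_name(name), is_strain_id(name))
```

A strain ID also satisfies the systematic-name pattern, which is consistent with the case where the Strain ID doubles as the Systematic Strain Name.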



Phenotypes

dictyBase uses a vocabulary based on PATO (Phenotypic Quality Ontology) for phenotype annotations. Some documentation is available at http://www.bioontology.org/wiki/index.php/PATO:Main_Page. Each phenotype term is composed of two parts: an entity (the biological process or anatomical structure that is abnormal in the mutant) and a quality that describes the abnormality. For example, a mutant that exhibits a delay in the aggregation process will be annotated to "delayed aggregation." The complete list of terms used to annotate Dictyostelium phenotypes can be viewed here: http://dictybase.org/Downloads/dicty_phenotypes.html. We encourage researchers to use this vocabulary in publications to ensure that genes get annotated accurately.



Genotypes

Genotypes represent the genetic modifications present in a given strain. Genes listed in a genotype are considered to be mutant in some way. Every genetic element described in a genotype (gene, plasmid, DNA fragment) should be separated by a comma. To distinguish chromosomal loci from exogenous genes, every gene or construct contained on DNA that was introduced into cells by transformation should be listed within brackets regardless of whether that DNA is carried on a plasmid, integrated as a fragment or was amplified by replication once inside cells.

Constructs carried within cells should be described in the formal genotype of that strain and placed in brackets. Thus, a cotB promoter beta-galactosidase reporter construct in a mutant strain might have a formal genotype of “regA(AK202), [cotB/lacZ], neoR”, but could be shortened to “regA-[cotB/lacZ]” in the context of a sentence describing the strain's properties in the results section of a paper.

The following names should be used to indicate common drug resistance/sensitivity and nutrient auxotrophy/prototrophy. For clarity, relevant phenotypic markers should be included at the end of formal genotypes (no italics):

Drug resistance:

  • bleR: bleomycin resistance
  • bsR: indicates the blasticidin resistance gene bsr is present
  • bsS: for clarity, or when bsr has been introduced but is non-functional
  • foaR: most often this will be redundant with ura-
  • hygR: hygromycin resistance
  • neoR: indicates a neomycin phosphotransferase gene is functional
  • neoS: for clarity, or when G418 selection on a neoR strain is relaxed and a G418-sensitive strain is isolated

Auxotrophic markers:

  • thy-: indicates thy1 is mutant and the strain requires thymidine
  • thy+: used when a thy1 mutant has been complemented
  • ura+: uracil independent. Note that this can be different from pyr5-6+
  • ura-: most often used to indicate pyr5-6-
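
To make the genotype rules concrete, the following sketch splits a formal genotype on commas and labels each element. It is an illustration, not a dictyBase parser: the function name and the three labels are hypothetical, the marker set is copied from the lists above, and elements are assumed not to contain commas themselves.

```python
from typing import List, Tuple

# Phenotypic marker names taken from the two lists above
MARKERS = {
    "bleR", "bsR", "bsS", "foaR", "hygR", "neoR", "neoS",
    "thy-", "thy+", "ura+", "ura-",
}


def classify_genotype(genotype: str) -> List[Tuple[str, str]]:
    """Label each comma-separated element of a formal genotype."""
    labelled = []
    for element in genotype.split(","):
        element = element.strip()
        if not element:
            continue
        if element.startswith("[") and element.endswith("]"):
            kind = "introduced"   # DNA introduced by transformation
        elif element in MARKERS:
            kind = "marker"       # drug resistance or auxotrophy marker
        else:
            kind = "chromosomal"  # mutant chromosomal locus
        labelled.append((element, kind))
    return labelled


print(classify_genotype("regA(AK202), [cotB/lacZ], neoR"))
```

The shortened in-sentence form, such as regA-[cotB/lacZ], drops the commas and is outside the scope of this sketch.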



Plasmids

Naturally occurring plasmids are named by a prefix indicating the genus and species, as in Ddp1. Derivatives of the natural plasmids and other shuttle vectors should be indicated with a lowercase “p” prefix (pDXA-3C). For genes on a plasmid, and any other gene that is introduced into a strain by experiment (see below), use the same naming system as for chromosomal genes, but placed within square brackets.
Example: pDneo67[act6/regA] indicates that the regA coding sequence is fused to the promoter of the actin6 gene on plasmid pDneo67.
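
The plasmid[genes] notation can be split into its two parts as follows. This is a minimal sketch with a hypothetical function name and an assumed regular expression; it expects exactly one bracketed group after the plasmid name.

```python
import re
from typing import Tuple

# plasmid name followed by one bracketed description of the genes it carries
PLASMID = re.compile(r"(?P<plasmid>[^\[\]\s]+)\[(?P<genes>[^\[\]]+)\]")


def parse_plasmid(text: str) -> Tuple[str, str]:
    """Split 'plasmid[genes]' into (plasmid, genes); raise ValueError otherwise."""
    match = PLASMID.fullmatch(text)
    if match is None:
        raise ValueError(f"not in plasmid[genes] form: {text!r}")
    return match["plasmid"], match["genes"]


print(parse_plasmid("pDneo67[act6/regA]"))
```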



Molecular Genetic Constructs

Reporter genes and gene fusions should be named with the relevant components separated by slashes and dashes to indicate DNA fusion. Typically, the components will be a promoter and a coding sequence or an ORF encoding a reporter protein. Promoters (± a few codons) should be separated from the coding sequence by a slash (/), and coding sequences should be separated from each other by dashes (-).
Examples: cotB/talA-GFP (talin-GFP fusion under the transcriptional control of the cotB promoter), talA/talA-GFP or talA-GFP (talin gene under the transcriptional control of the native promoter), talA-GFP(S65T).
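
The slash and dash rules can be read back out of a construct name. The sketch below uses a hypothetical `Construct` type and function name. It treats a missing slash as "no promoter stated", which the examples above read as the native promoter, and it would split legacy gene names that contain a hyphen (such as pyr5-6) incorrectly.

```python
from typing import List, NamedTuple, Optional


class Construct(NamedTuple):
    promoter: Optional[str]       # None when no promoter is written
    coding_sequences: List[str]   # fused coding sequences, in order


def parse_construct(name: str) -> Construct:
    """Split 'promoter/cds1-cds2' into its promoter and coding sequences."""
    promoter, slash, rest = name.partition("/")
    if not slash:
        # No slash: the whole name is the coding part.
        return Construct(None, name.split("-"))
    return Construct(promoter, rest.split("-"))


print(parse_construct("cotB/talA-GFP"))
print(parse_construct("talA-GFP(S65T)"))
```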



Chromosomes

Chromosomes are designated by non-italic Arabic numbers. Example: Chromosome 1.



Management

dictyBase acts as the centralized clearinghouse for gene and strain names. Scientific curators at dictyBase will verify proposed gene and strain names to encourage the application of these guidelines and to ensure that names are not duplicated. Questions or naming suggestions can be addressed to: dictybase@northwestern.edu.



References

This document is based on the Nomenclature Proposal written in November 2000 by the Dictyostelium Community Organizing Committee: Adam Kuspa (Chairman), Robert R. Kay, Alan R. Kimmel, Hideko Urushihara, Richard Kessin (ex officio) Chairman.

General:

Demerec, M. et al. (1966). A proposal for a uniform nomenclature in bacterial genetics. Genetics 54:61-76.

Kay, Loomis, Devreotes (2001) TIG S.5-S.6

Gene Names List: Dictyostelium discoideum gene list at dictyBase



Appendix 1. List of Known Strain Prefixes by Laboratory

Updated March 27, 2007

AD, HAD Adrian Harwood
AK Adam Kuspa
ARK Alan Kimmel
BS Bubba Singleton
BW Bin Wang (Kuspa lab)
CT Chris Thompson
CW Tom Egelhoff
DG Bill Loomis (Developmental Gene)
DH Dale Herald (except DH100-199: Rich Kessin)
DR Doug Robinson
GS Gad Shaulsky
HAD Adrian Harwood
HC, DCB Barrie Coukell
HDK, DDK David Knecht
HDT Meg Titus
HG, DG Guenther Gerisch
HGR Michel Sartre
HH, DH100-199 Rich Kessin (Haploid Harvard)
HJW, JGW Jeff Williams
HK, DK Gene Katz
HKT Kei Inouye
HL, DL Bill Loomis
HM, DM Rob Kay
HMW Randy Dimond (Haploid Madison Wisconsin)
HO Terry O'Halloran
HP, HPX Pasteur Institute
HPF, DPF Paul Fisher
HPS, DPS Reg Deering (Haploid Penn State)
HR Herb Ennis and Rich Kessin (a few strains from Frank Rothman were labeled HR too)
HS, DS Jim Spudich
HSB Salvo Bozzaro
HT Adrian Tsang
HTU Taro Uyeda
HTY Kaichiro Yanagisawa
HU, DUK Keith Williams
HUD, DUD Dennis Welker
HW Chris West
IIB Instituto de Investigaciones Biomedicas
IR, DIR, RI (old) Rob Insall
JB Jane Borleis (Peter Devreotes' lab)
JGW, HJW Jeff Williams
JH Jeff Hadwiger
JM Jacqueline Milne (Peter Devreotes' lab)
JS Justin Stege (Bill Loomis' lab)
KS Karl Saxe
KY Kaichiro Yanagisawa (or T. Yamada?)
LW Lijun Wu (Peter Devreotes' lab)
NP, DP Peter Newell
PD Peter Devreotes
QS Queller-Strassmann lab
RI Rob Insall (old prefix)
SA, DSA Steve Alexander
SB Simone Blagg
TL Bill Loomis
V Adam Kuspa
W Adam Kuspa
WTC Wen-Tsan Chang
X, XP Peter Newell
XMC Hideko Urushihara (XP55-derived MaCrocyst defective)



Updated on September 4, 2009


Supported by NIH (NIGMS and NHGRI)