Genome Mining

Genome mining workflow

Discovering the Biosynthetic Potential of Bacteria

The ability to predict the biosynthetic potential of microorganisms from their genome sequences (“from genes to molecules”) and the ability to predict which genes encode the biosynthesis of a particular natural product (“from molecules to genes”) together have the potential to revolutionize drug discovery and development efforts:

1) With the explosion of available microbial genome sequences, it has become clear that the biosynthetic potential of microorganisms is much higher than what is observed through fermentation. A typical natural product-producing bacterium may be known to make only a few compounds, yet ~30 may be predicted from its genome sequence. Why the discrepancy? Most biosynthetic genes are silent or poorly expressed under laboratory growth conditions. Knowing which genes are present (that is, knowing the potential) gives you the chance to activate those genes and obtain the corresponding compounds.

2) Imagine you have a collection of thousands to hundreds of thousands of strains. The ability to predict through genome mining which types of natural products they can potentially make gives you the opportunity to select potentially “talented” strains for inclusion in the discovery pipeline.

3) Identifying the genes that encode the biosynthesis of a natural product of interest gives you the opportunity to engineer the biosynthesis for yield improvement or structure diversification.
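
The strain selection idea in point 2 can be illustrated with a minimal sketch. It assumes that a genome mining tool has already produced one predicted biosynthetic class per gene cluster for each strain. The strain names, class labels, and the `rank_strains` helper are hypothetical and chosen only for illustration; ranking by raw cluster count is one simple heuristic, not an established selection method.

```python
from collections import Counter

# Hypothetical input: one (strain, predicted_class) pair per predicted
# biosynthetic gene cluster. These values are made up for illustration.
predicted_clusters = [
    ("strain_A", "polyketide"),
    ("strain_A", "nonribosomal peptide"),
    ("strain_A", "terpene"),
    ("strain_B", "terpene"),
    ("strain_C", "polyketide"),
    ("strain_C", "polyketide"),
]


def rank_strains(clusters, classes_of_interest=None):
    """Rank strains by their number of predicted clusters, highest first.

    If classes_of_interest is given, only clusters of those classes count.
    """
    counts = Counter(
        strain
        for strain, bgc_class in clusters
        if classes_of_interest is None or bgc_class in classes_of_interest
    )
    # Sort by count (descending), then by strain name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Rank by all predicted clusters, then by polyketide clusters only.
overall_ranking = rank_strains(predicted_clusters)
polyketide_ranking = rank_strains(predicted_clusters, {"polyketide"})
```

In practice, a ranking like this would only be a first filter; the predictions are approximate, so a shortlisted strain still has to be confirmed experimentally.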

Summary:

The concept here is that one can go from genes to molecules and vice versa. In other words, you can “read” the DNA sequence of an organism using bioinformatics and predict which natural products it can make. The image will be fuzzy: you won’t know the exact structure, but you can get a rough idea or at least identify the biosynthetic class. The predictions can then guide you in deciding which strains or environments to focus on. You can also go the other way: if you have a natural product structure, you can predict which types of genes would encode the biosynthesis of that molecule, which in turn helps you identify those genes. Once the biosynthetic genes are identified, the respective enzymes can be studied, and the strains can be engineered to generate structural modifications or to increase production yields.
