Day 11 part 3 micropollutant

Environmental Biotechnology

We start with a fundamental problem in environmental genomics:

We can’t understand what an organism can do if we don’t know what its genes actually do.

When scientists find new DNA sequences, they need to figure out:

  • What genes are present 🧬
  • What proteins they produce ⚙️
  • What functions those proteins perform 🔍

But this process is iterative — we must keep checking, predicting, and testing as new genes are discovered.


🧩 The Challenge of Microdiversity

In microbial communities:

  • Many organisms are low-abundance “tourists”, meaning they exist in very small numbers.
  • These rare species may have unique abilities, but we can barely detect them.
  • Most databases lack good protein annotations for such unknown species.
  • This leads to uncertainty in linking genetic identity → functional activity.

So microdiversity makes it very hard to map out what’s really happening in an ecosystem.


💥 The “Transitive Catastrophe”

Databases like GenBank and UniProt are open — anyone can upload gene sequences. But here’s the problem:

  • About 76% of microbial genome annotations are predicted — not experimentally verified.
  • Only 0.3% of all known genes have confirmed experimental evidence for their function!

If one early annotation is wrong, the mistake spreads — because new annotations are based on the old ones. This is called the transitive catastrophe ⚠️:

A false assumption gets copied again and again, creating an expanding chain of error.
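
The chain of copied errors can be illustrated with a toy model. This is only a sketch under stated assumptions: the gene names and function labels are hypothetical, every gene is assumed to truly share one function, and each new gene simply copies the label of the database entry it matched.

```python
# Toy model of annotation transfer. All gene names and labels are hypothetical.
# Assumption: every gene here truly has the same function (TRUE_FUNCTION).

TRUE_FUNCTION = "dehydrogenase"
WRONG_FUNCTION = "hydrolase"

def propagate(annotations, transfers):
    """Copy labels along (source, new_gene) pairs, in submission order."""
    result = dict(annotations)
    for source, new_gene in transfers:
        # The new gene inherits whatever label its source currently carries.
        result[new_gene] = result[source]
    return result

def count_wrong(database):
    """Number of entries whose label differs from the true function."""
    return sum(1 for label in database.values() if label != TRUE_FUNCTION)

# Each source must already be in the database when it is used.
transfers = [
    ("geneA", "geneB"),
    ("geneB", "geneC"),
    ("geneB", "geneD"),
    ("geneD", "geneE"),
]

good = propagate({"geneA": TRUE_FUNCTION}, transfers)   # correct seed annotation
bad = propagate({"geneA": WRONG_FUNCTION}, transfers)   # wrong seed annotation

print(count_wrong(good))  # 0
print(count_wrong(bad))   # 5
```

One wrong seed entry is enough to mislabel every gene annotated from it, directly or indirectly.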


🧬 The “One Gene–One Function” Idea (and Why It’s Wrong)

In the 1940s, Beadle and Tatum proposed the one gene–one enzyme hypothesis (often summarized as one gene → one protein → one function), which earned them a share of the 1958 Nobel Prize. 🏅 But modern biology shows it’s much more complex:

  • One gene can make multiple proteins through processes like alternative splicing or post-translational modifications.
  • One protein can perform different enzymatic functions, depending on the cell’s conditions.

This flexibility is related to what we now call epigenetics: how the same genome can produce very different outcomes depending on which genes are switched on.

🦋 Example: A caterpillar and a butterfly share the same DNA, yet look and act completely different because gene expression changes drastically between life stages.

Even bacteria can do this! They may look different or form filaments under certain conditions but still be the same species. So: morphology ≠ function.


🔬 The Omics Connection: DNA, RNA, Proteins, and Metabolites

To understand function, we combine multiple “omics” data sets:

Omics Type                 | What It Tells Us                | Limitation
---------------------------|---------------------------------|---------------------------
Genomics (DNA)             | What could happen               | No info on activity
Transcriptomics (RNA)      | Which genes are being expressed | Fluctuates quickly
Proteomics (Proteins)      | What workhorses are active      | Difficult to detect all
Metabolomics (Metabolites) | What results from activity      | Hard to trace back to genes

💡 Idea: Combine DNA (potential) + proteins (activity) to connect who’s there and what they’re doing.


⚗️ How to Identify Active Genes in Pollution Degradation

Scientists test this by:

  1. Setting up two systems – one with a micropollutant and one without.
  2. Extracting and digesting proteins into peptides.
  3. Running them through mass spectrometry.
  4. Comparing the intensity of peptide peaks.

If a protein’s signal is stronger with the pollutant → it’s upregulated. If it is weaker → it’s downregulated.
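
The comparison in step 4 can be sketched as a log₂ fold change between the two systems. The protein names and intensity values below are made up for illustration, and averaging replicate intensities is an assumption about how the data are summarized, not a description of a specific proteomics pipeline.

```python
import math
from statistics import mean

# Hypothetical peptide intensities per protein, three replicates per system.
control = {   # system without the micropollutant
    "protX": [100.0, 110.0, 90.0],
    "protY": [200.0, 210.0, 190.0],
    "protZ": [80.0, 80.0, 80.0],
}
treated = {   # system with the micropollutant
    "protX": [400.0, 390.0, 410.0],
    "protY": [50.0, 55.0, 45.0],
    "protZ": [82.0, 78.0, 80.0],
}

def log2_fold_change(treated_values, control_values):
    """log2 of (mean treated intensity / mean control intensity)."""
    return math.log2(mean(treated_values) / mean(control_values))

def direction(log2_fc):
    """Positive means stronger with the pollutant, negative means weaker."""
    if log2_fc > 0:
        return "upregulated"
    if log2_fc < 0:
        return "downregulated"
    return "unchanged"

results = {
    protein: log2_fold_change(treated[protein], control[protein])
    for protein in control
}

for protein, lfc in sorted(results.items()):
    print(f"{protein}: log2FC = {lfc:+.2f} ({direction(lfc)})")
```

A log₂ value of +2 means the signal is four times stronger with the pollutant, and −2 means it is four times weaker.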


🌋 The Volcano Plot

Data visualization tool for proteomics results:

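  • X-axis: Fold change, usually on a log₂ scale (how much protein level increased or decreased)
  • Y-axis: Statistical significance, usually plotted as −log₁₀ of the p-value, so more significant proteins sit higher

Only proteins that change >2-fold (log₂ scale) and are statistically significant are interesting.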
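
The selection rule can be sketched as a simple filter. The cutoffs are assumptions: a 2-fold change is read here as |log₂ fold change| > 1, and p < 0.05 is used as a common (not universal) significance threshold. Both are parameters and can be changed if the notes intend something different.

```python
import math

def volcano_point(log2_fc, p_value):
    """Coordinates of one protein on a volcano plot: (x, y)."""
    return (log2_fc, -math.log10(p_value))

def classify(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a protein as 'up', 'down', or 'not significant'.

    fc_cutoff is on the log2 scale (1.0 = 2-fold); both cutoffs are assumptions.
    """
    if p_value >= p_cutoff or abs(log2_fc) <= fc_cutoff:
        return "not significant"
    return "up" if log2_fc > 0 else "down"

# Hypothetical proteins: (name, log2 fold change, p-value).
proteins = [
    ("protX", 2.0, 0.001),
    ("protY", -2.0, 0.001),
    ("protZ", 2.0, 0.5),     # large change, but not significant
    ("protW", 0.5, 0.001),   # significant, but small change
]

labels = {name: classify(lfc, p) for name, lfc, p in proteins}
print(labels)
```

Only `protX` and `protY` land in the upper-left or upper-right corners of the plot, which is where the interesting proteins are found.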

These proteins are then matched to the genome to figure out:

  • Which enzymes they are 🧫
  • What pathways they might belong to 🔗


🧮 Using Databases (KEGG and Others)

After identifying candidate proteins, scientists use databases like KEGG (Kyoto Encyclopedia of Genes and Genomes):

  • Maps each protein to metabolic pathways
  • Shows which enzymes are upregulated
  • Helps predict which reactions are active
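
The mapping step can be sketched with a small local lookup table. This does not call KEGG: the protein names, pathway names, and the dictionary itself are hypothetical stand-ins for annotations that would normally come from a database export.

```python
from collections import Counter

# Hypothetical protein -> pathway annotations (stand-in for a database export).
protein_to_pathways = {
    "protX": ["pathway_1", "pathway_2"],
    "protY": ["pathway_2"],
    "protW": ["pathway_3"],
}

# Hypothetical list of upregulated proteins; protQ has no annotation.
upregulated = ["protX", "protY", "protQ"]

def pathway_hits(proteins, mapping):
    """Count upregulated proteins per pathway and collect unannotated ones."""
    hits = Counter()
    unmapped = []
    for protein in proteins:
        pathways = mapping.get(protein)
        if not pathways:
            unmapped.append(protein)
            continue
        hits.update(pathways)
    return hits, unmapped

hits, unmapped = pathway_hits(upregulated, protein_to_pathways)
print(hits.most_common())  # pathways with the most upregulated enzymes first
print(unmapped)            # proteins the database could not place
```

Pathways with several upregulated enzymes are candidates for active reactions, while the unmapped list shows how much is lost to missing annotations.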

🧪 Example: When studying gemfibrozil (a cholesterol-lowering drug), researchers saw certain amino acid synthesis enzymes upregulated. This hinted that microbes might be degrading the compound or using it to make new proteins.


🧭 Predicting Pathways

Databases can simulate degradation routes:

  • Show possible breakdown products
  • Rate each pathway’s likelihood (color-coded: likely → unlikely)
  • Identify which enzymes could perform each step

If a specific metabolite is also found experimentally (via metabolomics), that pathway is confirmed ✅.
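
Checking predicted routes against measured metabolites can be sketched as a set comparison. The route names, likelihood labels, and metabolite names are hypothetical, and the ranking rule (experimental support first, then predicted likelihood) is one reasonable choice, not the scoring used by any particular prediction tool.

```python
# Lower rank = more likely. The three labels are an assumed scale.
LIKELIHOOD_RANK = {"likely": 0, "neutral": 1, "unlikely": 2}

# Hypothetical predicted degradation routes and their breakdown products.
candidate_routes = [
    {"name": "route_1", "likelihood": "likely",
     "intermediates": {"metabolite_A", "metabolite_B"}},
    {"name": "route_2", "likelihood": "unlikely",
     "intermediates": {"metabolite_C"}},
    {"name": "route_3", "likelihood": "neutral",
     "intermediates": {"metabolite_D"}},
]

# Hypothetical metabolites detected by metabolomics.
observed = {"metabolite_B", "metabolite_C"}

def rank_routes(routes, observed_metabolites):
    """Return (name, likelihood, detected intermediates), best-supported first."""
    scored = []
    for route in routes:
        found = sorted(route["intermediates"] & observed_metabolites)
        scored.append((route["name"], route["likelihood"], found))
    # Routes with at least one detected intermediate come first,
    # then routes are ordered by predicted likelihood.
    scored.sort(key=lambda r: (len(r[2]) == 0, LIKELIHOOD_RANK[r[1]]))
    return scored

ranking = rank_routes(candidate_routes, observed)
for name, likelihood, found in ranking:
    print(name, likelihood, found)
```

Here `route_2` outranks `route_3` even though it was predicted as unlikely, because one of its intermediates was actually detected.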

Result: We can build a predicted gene pathway for how microbes degrade pollutants!


🧩 In Summary

Concept                           | Meaning
----------------------------------|----------------------------------------------------------
Microdiversity problem            | Rare microbes are hard to analyze
Transitive catastrophe            | Wrong annotations get copied and amplified
One gene–one function is outdated | Genes and proteins are multifunctional
Epigenetics                       | Same DNA, different expression outcomes
Omics integration                 | DNA = potential, Proteins = activity, Metabolites = result
Volcano plot                      | Visualizes up/downregulated proteins
KEGG mapping                      | Links proteins to metabolic functions
Pathway prediction                | Reveals how pollutants are degraded
