Lecture 1 Chap 1 PN Book

Protein chemistry

Introductory note + Aims of the chapter 🎯

  • The chapter is a chemistry refresher specifically targeted at understanding proteins—not a full chemistry course.
  • Goal: help you understand proteins as 3D molecules and the molecular forces that shape:
    • protein folding (structure)
    • protein interactions (binding)
    • protein function
  • It starts from amino acids (building blocks), builds through levels of protein structure, then explains forces stabilizing structures/interactions, and ends with a compendium of key chemical structures to learn.

1.1 The different aspects of protein structure 🧩

Proteins are too complex to describe only in 2D. So we define structure at four levels:

  1. Primary structure: the amino acid sequence (covalent structure), including post-translational modifications (PTMs) like phosphorylation or glycosylation.
  2. Secondary structure: local folding patterns—α-helices, β-sheets, turns.
  3. Tertiary structure: overall 3D fold of one polypeptide chain (brings far-apart residues close in space).
  4. Quaternary structure: assembly of multiple subunits into one functional protein.

Key idea: function depends on structure at all these levels—and they’re linked.


1.2 The constituents of proteins: amino acids 🧪💧

General amino acid structure

  • Standard amino acid: NH₂–CαH(R)–CO₂H (Fig. 1.1).
  • In water near neutral pH, amino acids are mostly zwitterions:
    • NH₃⁺ and COO⁻ at the same time (Fig. 1.2).
  • Side-chain carbon atoms are labeled β, γ, δ, ε moving away from Cα (useful for naming groups like “γ-carboxyl of Glu” or “ε-amino of Lys”).

20 common amino acids + classification by polarity 🌊🛢️

A practical way to predict protein behavior is classifying side chains by polarity (water-loving vs water-avoiding):

  • Non-polar (hydrophobic): Ala, Gly, Ile, Leu, Met, Phe, Pro, Trp, Val
  • Polar, uncharged: Asn, Cys, Gln, Ser, Thr, Tyr
  • Polar, charged: Arg, Asp, Glu, His, Lys

A more quantitative approach uses a hydrophobicity scale (Table 1.2), based on free energy of transfer (membrane interior → water) for residues in an α-helix:

  • Strongly hydrophobic examples: Phe, Met, Ile, Leu, Val
  • Strongly hydrophilic examples: Arg, Asp, Lys, Glu

1.2.3 General properties of amino acids

(a) Stereochemistry 🔄

  • Most amino acids (except Gly) are chiral (Cα has 4 different substituents).
  • Proteins use L-amino acids.
  • Two stereochemistry systems:
    • D/L system based on glyceraldehyde (Fig. 1.3)
    • R/S system: absolute configuration rules (Fig. 1.4)
  • “Almost all L amino acids are S at Cα”; the exception is L-Cys, which is R because its sulfur raises the priority of the side chain. Ile and Thr each carry a second chiral center (at Cβ).
  • D-amino acids exist in nature sometimes (e.g., bacterial cell walls; special peptide synthesis systems).

(b) Ionization + pI ⚡

  • Typical pKa values:
    • α-COOH ~ 2.2
    • α-NH₃⁺ ~ 9.5
  • Side-chain ionization depends on residue (Table 1.1).
  • Isoelectric point (pI) = pH where net charge = 0.
    • No ionizable side chain: pI ≈ average of α-COOH and α-NH₃⁺ pKas (example given for Gly).
    • With ionizable side chain: pI is the average of the two pKas that surround the zwitterion (example given for Asp).
  • Important protein reality check: side-chain pKas can shift a lot inside proteins depending on the local environment (examples given: Glu in lysozyme, Tyr in dehydroquinase, Ser in chymotrypsin).
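
The two pI rules above reduce to one piece of arithmetic. A minimal sketch, assuming the typical pKa values quoted in these notes (α-COOH ≈ 2.2, α-NH₃⁺ ≈ 9.5, Asp side chain ≈ 4, Lys side chain ≈ 10.5) rather than the residue-specific values in Table 1.1; the function name is illustrative:

```python
def isoelectric_point(pka_values, charge_fully_protonated):
    """Estimate the pI of a free amino acid from its pKa values.

    pka_values: pKas of all ionizable groups.
    charge_fully_protonated: net charge with every group protonated
        (+1 for amino acids with neutral or acidic side chains,
         +2 for Lys, Arg and His).
    """
    pkas = sorted(pka_values)
    # Each deprotonation lowers the net charge by 1, so the net-zero species
    # appears after `charge_fully_protonated` deprotonations. It is bracketed
    # by the pKas just below and just above it.
    k = charge_fully_protonated
    return (pkas[k - 1] + pkas[k]) / 2


# Illustrative values taken from the typical pKas in these notes:
gly = isoelectric_point([2.2, 9.5], 1)        # ≈ 5.85
asp = isoelectric_point([2.2, 4.0, 9.5], 1)   # ≈ 3.1
lys = isoelectric_point([2.2, 9.5, 10.5], 2)  # ≈ 10.0
```

The same bracketing rule covers both cases: with no ionizable side chain there are only two pKas, so the bracket is the α-COOH/α-NH₃⁺ pair.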

1.2.4 Chemical characteristics of amino acids (grouped by side chain chemistry)

This section explains why particular residues behave the way they do in proteins.

  1. Aliphatic (Ala, Gly, Ile, Leu, Val)
    • Chemically unreactive; key in hydrophobic interactions.
    • Gly: tiny → high flexibility, important for tight turns/packing.
  2. Aromatic (Phe, Tyr, Trp) 🌑
    • Absorb UV:
      • Phe ~ 258 nm, Tyr ~ 275 nm, Trp ~ 280 nm (used in protein quantification).
    • Phe: mostly hydrophobic/unreactive.
    • Tyr: hydroxyl makes it more reactive, can H-bond, can be phosphorylated, can undergo electrophilic substitution (iodination → thyroxine).
    • Trp: limited reactivity; N–H can H-bond.
  3. Basic (Arg, Lys)
    • Usually positively charged at neutral pH → ionic interactions with Asp/Glu or phosphates.
    • Lys: pKa ~10.5; neutral form (–NH₂) is a strong nucleophile.
    • Arg: pKa ~12.5; guanidinium stays protonated in physiology.
  4. Acidic (Asp, Glu)
    • pKa ~4 → usually negatively charged at neutral pH.
    • Bind positive groups and metal ions (e.g., Ca²⁺, Zn²⁺).
  5. Hydroxyl (Ser, Thr)
    • H-bonding; very high pKa (~15) so weak nucleophiles normally.
    • Can become powerful nucleophiles in special catalytic setups (e.g., serine proteases).
    • Often phosphorylated (regulation).
  6. Amides (Asn, Gln)
    • Always uncharged; excellent H-bond donors/acceptors.
  7. Sulfur-containing (Cys, Met) 🧷
    • Cys: nucleophilic sulfur; pKa ~8.4 so some exists as thiolate (S⁻) near neutral pH.
      • Can form disulfide bonds (cystine) under oxidizing conditions (common in secreted/extracellular proteins).
      • Can coordinate metals (Zn²⁺/Fe²⁺) and is a target for heavy metals (Hg, Cd, Pb).
      • Can be oxidized to sulfenic/sulfinic/sulfonic forms (oxidative stress relevance).
    • Met: less reactive sulfur, but forms S-adenosylmethionine (SAM/adoMet) → major methyl donor (e.g., DNA methylation); can oxidize to Met sulfoxide.
  8. Proline 🧱
    • Cyclic side chain → strong geometric constraints.
    • Special behavior in peptide bonds (important later).
    • Hydroxylation → 4-hydroxyproline in collagen (stability; vitamin C link via prolyl hydroxylase).
  9. Histidine 🎚️
    • Imidazole pKa ~6 → near-neutral pH can switch between protonated and neutral forms.
    • Frequently in enzyme active sites: acid–base catalysis, nucleophilicity, metal binding.
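
Several statements in this list (Cys partly thiolate, His switching near neutral pH, Lys mostly protonated) follow from the Henderson–Hasselbalch relation. A minimal sketch, assuming the approximate side-chain pKa values quoted above and an illustrative pH; real pKas shift with the protein environment:

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group in its deprotonated form."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


# Approximate side-chain pKas from the notes above (free amino acid values).
SIDE_CHAIN_PKA = {"Cys": 8.4, "His": 6.0, "Lys": 10.5, "Arg": 12.5}

# pH 7.4 is an assumed "physiological" value, not taken from the notes.
cys_thiolate = fraction_deprotonated(SIDE_CHAIN_PKA["Cys"], 7.4)        # ≈ 0.09
his_protonated = 1 - fraction_deprotonated(SIDE_CHAIN_PKA["His"], 7.0)  # ≈ 0.09
lys_neutral = fraction_deprotonated(SIDE_CHAIN_PKA["Lys"], 7.0)         # ≈ 0.0003
```

A group is half-ionized when pH = pKa, and each pH unit away from the pKa changes the ratio of the two forms tenfold. This is why His (pKa ~6) is the only side chain that titrates appreciably around neutral pH.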

1.2.5 Structure of water 💧

  • Water geometry: H–O–H angle ~104.5°, O–H bond length ~0.096 nm.
  • Partial charges create strong polarity and enable hydrogen bonding.
  • Each water can form up to 4 H-bonds (two donors, two acceptors), tetrahedral arrangement (Fig. 1.5).
  • Ice has an open network → lower density (floats).
  • Liquid water is dynamic (clusters; ~3.5 H-bonded neighbors on average).
  • Water is a good solvent for charged/polar molecules but a poor one for non-polar molecules → sets up the hydrophobic effect later.

1.3 Primary structure of proteins 🧬

Primary structure = the amino-acid sequence of the mature protein, which reflects every step between gene and final product:

  • gene → mRNA (introns spliced out)
  • translation
  • PTMs (add/remove groups, proteolysis, etc.)

1.3.1 The peptide bond 🔗

  • Formed by condensation: carboxyl + amino → amide (peptide) bond + H₂O (Fig. 1.6/1.7).
  • Chains are written N-terminus → C-terminus.
  • Peptide bond has partial double-bond character due to resonance (Fig. 1.8):
    • makes the peptide unit planar
    • creates a permanent dipole (partial charges on O/C/N/H)
  • Cis/trans isomerism (Fig. 1.9):
    • trans usually much more stable than cis (steric clash in cis).
    • Exception: bonds involving proline have a much smaller trans preference → ~5% of Xaa–Pro bonds can be cis.
  • Rotation around the peptide bond has a high energy barrier → proline cis/trans isomerization can be a slow step in folding, catalyzed by peptidyl–prolyl isomerases.

1.3.2 What you can learn from sequence information 📏🔍

  1. Exact molecular mass
    • Sum amino-acid masses (Table 1.1) + H₂O; compare to mass spectrometry for identity/integrity.
  2. Isoelectric point (pI)
    • Determined by counts of charged groups (N-terminus, C-terminus, Asp/Glu vs Lys/Arg/His).
    • Predicted pI is approximate because pKas shift in proteins, but still useful (e.g., predicting behavior in electrophoresis/ion-exchange).
  3. Absorption coefficient at 280 nm (extinction coefficient)
    • Mostly from Tyr and Trp (plus small disulfide contribution).
    • Used to estimate protein concentration.
  4. Hydrophobicity + aliphatic index
    • Overall polar/non-polar balance can hint at solubility or membrane association.
    • Aliphatic index tracks Ala/Ile/Leu/Val abundance.
  5. Post-translational modifications (Table 1.3)
    Key PTMs and functional consequences:
    • Proteolysis: targeting sequence removal, hormone generation, enzyme activation
    • Disulfides: stabilize secreted proteins
    • Hydroxylation: stabilizes collagen triple helix
    • Glycosylation: cell recognition, anchoring via GPI, increased polarity
    • Phosphorylation (Ser/Thr/Tyr): regulation/signaling
    • N-terminal acylation: myristoylation/palmitoylation → membrane association
    Mass spectrometry is highlighted as a way to detect PTMs by mass differences (example: lysozyme disulfides causing a predictable mass shift).
  6. Structural and functional motifs 🧷
    Sequence patterns can predict:
    • Transmembrane helices via hydropathy plots (threshold example: +84 kJ/mol for a 20-aa stretch).
    • Targeting/retention sequences (peroxisome, ER retention signals).
    • Metal-binding motifs (example consensus for 2Fe–2S clusters).
    • Kinase phosphorylation consensus sequences.
    • N-glycosylation motif: Asn–X–Ser/Thr (X rarely Pro or Asp).
  7. Sequence relationships 🌳
    • Comparing homologs across species reveals conserved residues (often functional/structural essentials).
    • Gene duplication + divergence explains related protein families.
    • Domain architecture can be inferred (example: steroid hormone receptors with distinct domains).
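
Three of the sequence-derived quantities above (mass, absorption coefficient at 280 nm, N-glycosylation motifs) are easy to sketch in code. This is an illustration under stated assumptions, not the book's procedure: the residue masses are standard average values rather than Table 1.1, the 280 nm coefficients (5500 for Trp, 1490 for Tyr, 125 per disulfide, in M⁻¹ cm⁻¹) are commonly used estimates, and the motif pattern excludes only X = Pro, whereas the notes say Pro or Asp are rare at X. All function names are hypothetical.

```python
import re

# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02        # added once for the free N- and C-termini
TWO_HYDROGENS = 2.016  # lost for each disulfide bond formed


def average_mass(seq, n_disulfides=0):
    """Sum of residue masses + H2O, minus 2 H per disulfide (Da)."""
    total = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    return total - TWO_HYDROGENS * n_disulfides


def epsilon_280(seq, n_disulfides=0):
    """Estimated molar absorption coefficient at 280 nm (M^-1 cm^-1)."""
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_disulfides


def n_glycosylation_sites(seq):
    """0-based positions of Asn in Asn-X-Ser/Thr motifs (X != Pro here)."""
    # The lookahead lets overlapping motifs (e.g. N-N-S-T) all be reported.
    return [m.start() for m in re.finditer(r"(?=N[^P][ST])", seq)]
```

For example, `average_mass("GG")` gives about 132.12 Da, the mass of glycylglycine, and `n_glycosylation_sites("ANGTNPSNNST")` returns `[1, 7, 8]`, skipping the Asn–Pro–Ser at position 4. The disulfide term shows why the mass shift mentioned for lysozyme is predictable: each disulfide lowers the mass by about 2 Da.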

1.4 Secondary structure of proteins 🌀📄

Because the peptide bond is planar, chain flexibility mainly comes from rotation around the bonds at Cα, defined by dihedral angles:

  • φ (phi): N–Cα bond rotation
  • ψ (psi): Cα–C bond rotation
  • Ramachandran plot shows allowed combinations due to steric constraints (Fig. 1.12).
  • Glycine has more allowed space (tiny side chain); bulky residues have less.
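
To make the dihedral definition concrete: φ is the dihedral defined by the atoms C(i−1)–N–Cα–C, and ψ by N–Cα–C–N(i+1). A minimal pure-Python sketch of computing a signed dihedral from four atomic coordinates is shown below; the helper names are illustrative and the coordinates in the comments are toy values, not taken from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    For phi pass (C of residue i-1, N, CA, C); for psi pass (N, CA, C, N of i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometry: central bond along z; the outer atoms define the angle.
# dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))  -> 0 (cis, eclipsed)
# dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))  -> +90
```

Computing (φ, ψ) for every residue of a structure and plotting the pairs gives the Ramachandran plot described above.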

1.4.1 α-helix 🌀

  • Right-handed helix for L-amino acids.
  • Hydrogen bonds: carbonyl of residue n bonds to N–H of residue n+4 (shown in Fig. 1.13).
  • Ideal geometry:
    • 3.6 residues/turn
    • pitch 0.54 nm
    • φ ≈ −57°, ψ ≈ −47°
  • Has a helix dipole (N-terminus +, C-terminus −) that can influence interactions.
  • Proline disrupts helices (can break/kink) because of its rigid ring geometry.
  • Helices in proteins are often short (~12 residues) and slightly distorted from ideal.
  • Helical wheel projection (Fig. 1.14) helps visualize amphipathicity:
    • One face polar, one face non-polar → helix can sit at membranes.
    • A ~21–22 aa helix spans a lipid bilayer thickness (~3.2 nm).
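
The bilayer-spanning length and the helical wheel both follow from the ideal geometry listed above. A short sketch using only the numbers in these notes (3.6 residues/turn, 0.54 nm pitch, ~3.2 nm bilayer); the function names are illustrative:

```python
RESIDUES_PER_TURN = 3.6
PITCH_NM = 0.54

# Rise along the helix axis per residue: 0.54 / 3.6 = 0.15 nm.
RISE_PER_RESIDUE_NM = PITCH_NM / RESIDUES_PER_TURN

# Rotation about the helix axis per residue: 360 / 3.6 = 100 degrees.
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN


def residues_to_span(thickness_nm):
    """Number of ideal alpha-helical residues needed to span a given length."""
    return thickness_nm / RISE_PER_RESIDUE_NM


def wheel_angles(n_residues):
    """Angular position (degrees, 0-360) of each residue on a helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(n_residues)]


# residues_to_span(3.2) ≈ 21.3, consistent with the ~21-22 residues quoted above.
```

On the wheel, residues i, i+3, i+4 and i+7 land within about 60° of each other (0°, 300°, 40°, 340°), which is why hydrophobic residues spaced this way form one non-polar face of an amphipathic helix.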

Other helices:

  • 3₁₀ helix: tighter; H-bond n → n+3; often near helix ends.
  • π helix: looser; H-bond n → n+5; rare.

1.4.2 β-sheets 📄

  • Built from extended β-strands; side chains alternate above/below sheet.
  • Two arrangements:
    • Parallel
    • Anti-parallel (more common; can form from fewer strands; H-bonds less distorted) (Fig. 1.15)
  • Strand length: ~6 residues typical in proteins.
  • β-sheets have a right-handed twist (electrostatics between carbonyls contributes).
  • Membrane proteins can also be β-structured (porins example: 16 β-strands forming a pore).

1.4.3 Other structural features: β-turns + polyproline 🔁

  • β-turns allow compact globular folding (chain reverses ~180°).
  • Many types; common ones are 4 residues long (Fig. 1.16).
  • Proline often found in turns or at helix/sheet ends.
  • Polyproline structures:
    • Poly-Pro I: all cis; right-handed helix
    • Poly-Pro II: all trans; extended left-handed helix
    • Poly-Pro II segments show up in real proteins more than you might expect.

1.4.4 Amino acid preferences for secondary structure 📊

From real protein structures and model polymers:

  • Helix-friendly: Met, Glu, Leu, Ala
  • Helix-unfriendly: Pro, Gly, Tyr
  • Sheet-friendly: Val, Ile, Phe
  • Sheet-unfriendly: Pro, Asp
  • Turn-friendly: Pro, Gly, Asp
  • Turn-unfriendly: Met, Val, Ile

Secondary structure prediction accuracy is useful but limited (~70%); helices are easiest to predict.

1.5 Tertiary structure of proteins 🏗️

Tertiary structure = long-range folding of one polypeptide chain.

1.5.1 General principles (common themes across many proteins)

  • Close packing: interiors are tightly packed with few cavities; cavities often contain water.
  • Secondary structure elements persist, but with distortions (e.g., α-helices may shift toward 3₁₀-like geometry).
  • Side-chain distribution:
    • Hydrophobic residues mostly buried
    • Polar residues mostly surface-exposed
    • Exception: membrane-spanning regions invert this logic.
  • Internal polar groups must be satisfied:
    • hydrogen bonds to other groups or internal waters.
  • Domains:
    • Large proteins often consist of domains (~100–150 aa), independently folded modules linked to functions (binding/catalysis).
    • Domains can sometimes be isolated by proteolysis or expressing part of the gene.

1.5.2 Classification of protein structures 🗂️

Major structural classes:

  • all-α
  • all-β
  • αβ (sometimes split into α/β vs α+β depending on mixing pattern along the chain)

SCOP database is highlighted as a classification tool:

  • folds (overall arrangement)
  • superfamilies (probable relationships)
  • families (strong evidence; often >30% sequence identity)

Estimates and big idea:

  • ~1000 folds exist.
  • A small set of “superfolds” account for a large chunk of structures (>30%).
  • Examples shown (Fig. 1.17):
    • TIM barrel
    • α/β doubly wound
    • Greek key

1.5.3 What stabilizes tertiary structure?

Mostly weak non-covalent forces, discussed in 1.7 (disulfides are a special covalent stabilizer in secreted proteins).


1.6 Quaternary structure of proteins 🤝

Quaternary structure = assembly of multiple polypeptide chains (subunits).

General tendencies:

  • < ~30 kDa: often monomeric
  • > ~50 kDa: often oligomeric (rule of thumb, not absolute)

Types:

  • Homooligomer: same subunit repeated (e.g., homotetramer α₄)
  • Heterooligomer: different subunits (e.g., αβγ)

Why oligomerize?

  1. Regulation/communication between subunits (classic example: cooperative O₂ binding).
  2. Increased stability (interfaces stabilize folding).
  3. New structures with cavities:
    • chaperones (folding chambers)
    • proteasomes (degradation chambers)

Some assemblies can include disulfide links (e.g., some immunoglobulins).

1.7 Forces contributing to protein structure and interactions ⚖️✨

Proteins are covalently linked by peptide bonds, but folding and binding are dominated by weak, non-covalent forces.

1.7.1 Ionic (electrostatic) interactions ➕➖

  • Between opposite charges (Asp/Glu vs Lys/Arg/His).
  • Energy depends on distance r and dielectric constant D:
    • E = q₁q₂ / (D·r), where q₁ and q₂ are the two charges
  • Water has high D (~78.5) → surface ionic interactions are often weak (~5 kJ/mol).
  • Buried ionic interactions in low-D environments can be much stronger (up to ~20 kJ/mol).
  • Protein interiors have variable effective D depending on water penetration (values discussed).
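
To put numbers on the Coulomb expression, the sketch below evaluates it in kJ/mol for a pair of unit charges. The conversion factor (≈138.9 kJ·nm/mol for elementary charges, i.e. e²N_A/4πε₀) is a standard constant that the shorthand formula above leaves out. The separation of 0.3 nm and the low-dielectric value D = 20 are illustrative assumptions, not figures from the notes.

```python
# e^2 * N_A / (4 * pi * eps0), expressed in kJ * nm / mol for charges in units of e.
COULOMB_KJ_NM_PER_MOL = 138.935


def coulomb_energy(q1, q2, r_nm, dielectric):
    """Coulomb interaction energy in kJ/mol (negative = attractive)."""
    return COULOMB_KJ_NM_PER_MOL * q1 * q2 / (dielectric * r_nm)


# Opposite unit charges 0.3 nm apart (assumed separation):
in_water = coulomb_energy(+1, -1, 0.3, 78.5)  # ≈ -5.9 kJ/mol
low_d = coulomb_energy(+1, -1, 0.3, 20.0)     # ≈ -23 kJ/mol (D = 20 is hypothetical)
```

The water value is in line with the ~5 kJ/mol quoted above for surface interactions. The strong dependence on D is the point: the same pair of charges interacts several times more strongly once water is excluded.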

1.7.2 Hydrogen bonds 🧷

  • Interaction of H attached to electronegative atom (donor) with another electronegative atom (acceptor).
  • Typical strength ~5–10 kJ/mol.
  • Key feature: directionality/geometry → provides specificity.
  • Ideal geometry: A–H–B nearly linear; A–B distance ~0.30 ± 0.05 nm.
  • Important note: H-bonds can be “competed away” by water—so net stabilization depends on context.

1.7.3 van der Waals interactions 🌫️

  • Includes dipole–dipole, dipole–induced dipole, induced–induced dipole.
  • Individually weak (~5 kJ/mol), but numerous → collectively significant.
  • Reflects a balance of attraction at longer range and repulsion at short range.

1.7.4 Hydrophobic interactions (hydrophobic effect) 🛢️➡️💧

  • Water forms ordered cages around non-polar groups → entropy penalty.
  • Non-polar groups tend to cluster to reduce exposed surface → drives folding.
  • These interactions contribute the largest energy component stabilizing folded proteins in water.
  • Temperature effect: can weaken at low temperature → explains cold denaturation in some proteins.
  • Agents affecting it:
    • Urea / guanidinium chloride weaken hydrophobic interactions (disrupt water H-bond network).
    • Some salts strengthen hydrophobic interactions (Hofmeister series listed for ions).

1.7.5 Balance of energy contributions 🎭

Protein stability is a delicate thermodynamic balance:

  • Folding:
    • big unfavorable entropy decrease (chain becomes ordered)
    • big favorable enthalpy decrease (many internal interactions form)
  • Net stability is surprisingly small: typically 20–60 kJ/mol, even though the underlying terms are each hundreds of kJ/mol.
  • Proteins can be unfolded by:
    • denaturants (urea, GdmCl)
    • extreme pH (changes ionization)
    • temperature shifts (changes interaction balance)
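
A net stability of 20–60 kJ/mol can be translated into the fraction of molecules that are unfolded at equilibrium. A minimal sketch, assuming a simple two-state (folded ⇌ unfolded) model and a temperature of 298.15 K; both are assumptions made for illustration, not statements from the notes:

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def fraction_unfolded(stability_kj_per_mol, temperature_k=298.15):
    """Two-state model: fraction unfolded for a given net stability.

    stability_kj_per_mol: free energy of unfolding (positive = folded favored).
    """
    # Equilibrium constant [folded]/[unfolded] = exp(dG_unfold / RT).
    k_fold = math.exp(stability_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k))
    return 1.0 / (1.0 + k_fold)


# fraction_unfolded(20.0) is roughly 3e-4: about 3 molecules in 10,000 unfolded.
# fraction_unfolded(0.0) is 0.5: the midpoint of an unfolding transition.
```

Because the dependence is exponential, a perturbation worth only a few hydrogen bonds (tens of kJ/mol) is enough to move a protein from almost fully folded to half unfolded, which is what "delicate balance" means in practice.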

1.7.6 Range of binding strengths (Kd) 🎯

Different biological functions require different binding tightness. A table gives typical Kd ranges and corresponding free energies (examples):

  • Very tight: avidin–biotin (~10⁻¹⁵ M)
  • Protein–protein (~10⁻¹⁰ M)
  • Antibody–antigen (~10⁻⁹ M)
  • Receptor–hormone (~10⁻⁷ M)
  • Enzyme–substrate (~10⁻⁵ M)

Important interpretation:

  • Binding must be strong enough for the biological role, but tighter is not always better.
  • Enzymes especially need interactions that allow turnover, so substrate/product binding can’t be irreversibly tight.
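
The link between a dissociation constant and its free energy is ΔG° = RT ln Kd (for a 1 M standard state). The sketch below applies it to the Kd values listed above; the temperature of 298.15 K is an assumption, and the computed energies are not copied from the book's table.

```python
import math

R_KJ_PER_MOL_K = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard free energy of binding in kJ/mol (more negative = tighter)."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# Kd values from the list above.
EXAMPLES = {
    "avidin-biotin": 1e-15,      # ≈ -86 kJ/mol
    "protein-protein": 1e-10,
    "antibody-antigen": 1e-9,
    "receptor-hormone": 1e-7,
    "enzyme-substrate": 1e-5,    # ≈ -29 kJ/mol
}

energies = {name: binding_free_energy(kd) for name, kd in EXAMPLES.items()}
```

Each tenfold change in Kd corresponds to about 5.7 kJ/mol at this temperature, roughly the strength of a single hydrogen bond, so a handful of weak interactions is enough to separate "enzyme–substrate" from "antibody–antigen" affinity.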

1.8 Compendium of chemical structures 🧠🧾

This section is a “learn these structures” toolkit. The point is: if you recognize structure, you can predict behavior.

You’re encouraged to identify from structures:

  • polar vs non-polar regions
  • likely hydrogen bond donors/acceptors
  • ionic groups and pH-dependent charge states
  • nucleophiles, electrophiles, and metal-binding sites

What’s included (high level):

  • Amino acids (full side-chain structures) + general amino acid structure
  • Peptide bond structure
  • Bases and nucleotides: adenine, cytosine, guanine, thymine, uracil; ribose and 2-deoxyribose; AMP/ADP/ATP; cyclic AMP; phosphodiester linkage
  • Glycolysis intermediates: glucose, glucose-6-phosphate, fructose phosphates, triose phosphates, phosphoglycerates, PEP, pyruvate, lactate
  • Common reagents (lab relevance): EDTA, acrylamide, bis-acrylamide, SDS, Tris, urea, guanidinium chloride, DTT, 2-mercaptoethanol
  • Redox cofactors: FAD/FADH₂, NAD(P)⁺/NAD(P)H (with phosphate difference between NAD and NADP)
  • TCA cycle intermediates: citrate, isocitrate, α-ketoglutarate, succinate, fumarate, malate, oxaloacetate

Practical benefit: when you later see reagents like SDS, DTT, urea, GdmCl, you’ll immediately connect them to why they denature proteins, reduce disulfides, or enable electrophoresis.
