Lecture 1 Chap 1 PN Book

Protein chemistry

Introductory note + Aims of the chapter 🎯

The chapter is a chemistry refresher specifically targeted at understanding proteins—not a full chemistry course.
Goal: help you understand proteins as 3D molecules and the molecular forces that shape:
- protein folding (structure)
- protein interactions (binding)
- protein function
It starts from amino acids (building blocks), builds through levels of protein structure, then explains forces stabilizing structures/interactions, and ends with a compendium of key chemical structures to learn.

1.1 The different aspects of protein structure 🧩

Proteins are too complex to describe only in 2D. So we define structure at four levels:

Primary structure: the amino acid sequence (covalent structure), including post-translational modifications (PTMs) like phosphorylation or glycosylation.
Secondary structure: local folding patterns—α-helices, β-sheets, turns.
Tertiary structure: overall 3D fold of one polypeptide chain (brings far-apart residues close in space).
Quaternary structure: assembly of multiple subunits into one functional protein.

Key idea: function depends on structure at all these levels—and they’re linked.

1.2 The constituents of proteins: amino acids 🧪💧

General amino acid structure

Standard amino acid: NH₂–CαH(R)–CO₂H (Fig. 1.1).
In water near neutral pH, amino acids are mostly zwitterions:
- NH₃⁺ and COO⁻ at the same time (Fig. 1.2).
Side-chain carbon atoms are labeled β, γ, δ, ε moving away from Cα (useful for naming groups like “γ-carboxyl of Glu” or “ε-amino of Lys”).

20 common amino acids + classification by polarity 🌊🛢️

A practical way to predict protein behavior is classifying side chains by polarity (water-loving vs water-avoiding):

Non-polar (hydrophobic): Ala, Gly, Ile, Leu, Met, Phe, Pro, Trp, Val
Polar, uncharged: Asn, Cys, Gln, Ser, Thr, Tyr
Polar, charged: Arg, Asp, Glu, His, Lys

A more quantitative approach uses a hydrophobicity scale (Table 1.2), based on free energy of transfer (membrane interior → water) for residues in an α-helix:

Strongly hydrophobic examples: Phe, Met, Ile, Leu, Val
Strongly hydrophilic examples: Arg, Asp, Lys, Glu

1.2.3 General properties of amino acids

(a) Stereochemistry 🔄

Most amino acids (except Gly) are chiral (Cα has 4 different substituents).
Proteins use L-amino acids.
Two stereochemistry systems:
- D/L system based on glyceraldehyde (Fig. 1.3)
- R/S system: absolute configuration rules (Fig. 1.4)
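“Almost all L amino acids are S at Cα” (the exception is Cys, which is R because the sulfur raises the side chain's priority; there are also specific notes about the extra chiral centers in Ile and Thr).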
D-amino acids exist in nature sometimes (e.g., bacterial cell walls; special peptide synthesis systems).

(b) Ionization + pI ⚡

Typical pKa values:
- α-COOH ~ 2.2
- α-NH₃⁺ ~ 9.5
Side-chain ionization depends on residue (Table 1.1).
Isoelectric point (pI) = pH where net charge = 0.
- No ionizable side chain: pI ≈ average of α-COOH and α-NH₃⁺ pKas (example given for Gly).
- With ionizable side chain: pI is the average of the two pKas that surround the zwitterion (example given for Asp).
Important protein reality check: side-chain pKas can shift a lot inside proteins depending on the local environment (examples given: Glu in lysozyme, Tyr in dehydroquinase, Ser in chymotrypsin).
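
The averaging rule for pI can be checked with a short sketch. The pKa values used here are illustrative round numbers chosen for the example (they are not copied from Table 1.1), and `isoelectric_point` is a hypothetical helper name.

```python
def isoelectric_point(pkas, charge_fully_protonated):
    """Estimate the pI of a free amino acid from its pKa values.

    pkas: pKa of every ionizable group in the molecule.
    charge_fully_protonated: net charge with all groups protonated
        (+1 for Gly or Asp, +2 for Lys or Arg).
    Each deprotonation lowers the charge by one, so the neutral species
    sits between the pKa at index (charge - 1) and the pKa at index charge.
    """
    ordered = sorted(pkas)
    k = charge_fully_protonated
    return (ordered[k - 1] + ordered[k]) / 2


# Assumed example pKa values: alpha-COOH, [side chain], alpha-NH3+
print("Gly", round(isoelectric_point([2.3, 9.6], 1), 2))
print("Asp", round(isoelectric_point([2.0, 3.9, 9.9], 1), 2))
print("Lys", round(isoelectric_point([2.2, 9.1, 10.5], 2), 2))
```

For Asp the two pKas that flank the neutral form are the two carboxyl values, so the pI is acidic; for Lys they are the two amino values, so the pI is basic.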

1.2.4 Chemical characteristics of amino acids (grouped by side chain chemistry)

This section explains why particular residues behave the way they do in proteins.

Aliphatic (Ala, Gly, Ile, Leu, Val)
- Chemically unreactive; key in hydrophobic interactions.
- Gly: tiny → high flexibility, important for tight turns/packing.
Aromatic (Phe, Tyr, Trp) 🌑
- Absorb UV:
  - Phe ~ 258 nm, Tyr ~ 275 nm, Trp ~ 280 nm (used in protein quantification).
- Phe: mostly hydrophobic/unreactive.
- Tyr: hydroxyl makes it more reactive, can H-bond, can be phosphorylated, can undergo electrophilic substitution (iodination → thyroxine).
- Trp: limited reactivity; N–H can H-bond.
Basic (Arg, Lys) ➕
- Usually positively charged at neutral pH → ionic interactions with Asp/Glu or phosphates.
- Lys: pKa ~10.5; neutral form (–NH₂) is a strong nucleophile.
- Arg: pKa ~12.5; guanidinium stays protonated in physiology.
Acidic (Asp, Glu) ➖
- pKa ~4 → usually negatively charged at neutral pH.
- Bind positive groups and metal ions (e.g., Ca²⁺, Zn²⁺).
Hydroxyl (Ser, Thr)
- H-bonding; very high pKa (~15) so weak nucleophiles normally.
- Can become powerful nucleophiles in special catalytic setups (e.g., serine proteases).
- Often phosphorylated (regulation).
Amides (Asn, Gln)
- Always uncharged; excellent H-bond donors/acceptors.
Sulfur-containing (Cys, Met) 🧷
- Cys: nucleophilic sulfur; pKa ~8.4 so some exists as thiolate (S⁻) near neutral pH.
  - Can form disulfide bonds (cystine) under oxidizing conditions (common in secreted/extracellular proteins).
  - Can coordinate metals (Zn²⁺/Fe²⁺) and is a target for heavy metals (Hg, Cd, Pb).
  - Can be oxidized to sulfenic/sulfinic/sulfonic forms (oxidative stress relevance).
- Met: less reactive sulfur, but forms S-adenosylmethionine (SAM/adoMet) → major methyl donor (e.g., DNA methylation); can oxidize to Met sulfoxide.
Proline 🧱
- Cyclic side chain → strong geometric constraints.
- Special behavior in peptide bonds (important later).
- Hydroxylation → 4-hydroxyproline in collagen (stability; vitamin C link via prolyl hydroxylase).
Histidine 🎚️
- Imidazole pKa ~6 → near-neutral pH can switch between protonated and neutral forms.
- Frequently in enzyme active sites: acid–base catalysis, nucleophilicity, metal binding.
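
To see why a pKa near 6 makes His a "switchable" residue while Lys and Arg stay charged, the Henderson–Hasselbalch relation can be evaluated for the approximate side-chain pKas quoted in this section. This is a minimal sketch; the pH of 7.0 and the function name are assumptions, and real pKas shift inside proteins.

```python
def fraction_protonated(pka, ph):
    """Fraction of a group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


PH = 7.0  # assumed "neutral" pH for the comparison

# Approximate side-chain pKa values as quoted in the notes above
side_chains = [("Asp/Glu", 4.0), ("His", 6.0), ("Cys", 8.4),
               ("Lys", 10.5), ("Arg", 12.5)]

for name, pka in side_chains:
    frac = fraction_protonated(pka, PH)
    print(f"{name:8s} pKa {pka:4.1f}  protonated fraction {frac:.3f}")
```

With these numbers His is about 9% protonated at pH 7 and 50% at pH 6, whereas Cys is only a few percent thiolate and Lys/Arg are almost fully protonated.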

1.2.5 Structure of water 💧

Water geometry: H–O–H angle ~104.5°, O–H bond length ~0.096 nm.
Partial charges create strong polarity and enable hydrogen bonding.
Each water can form up to 4 H-bonds (two donors, two acceptors), tetrahedral arrangement (Fig. 1.5).
Ice has an open network → lower density (floats).
Liquid water is dynamic (clusters; ~3.5 H-bonded neighbors on average).
Water is great for charged/polar molecules but poor for non-polar ones → sets up the hydrophobic effect later.

1.3 Primary structure of proteins 🧬

Primary structure = the amino-acid sequence of the mature protein, which reflects every step from gene to finished chain:

gene → mRNA (introns spliced out)
translation
PTMs (add/remove groups, proteolysis, etc.)

1.3.1 The peptide bond 🔗

Formed by condensation: carboxyl + amino → amide (peptide) bond + H₂O (Fig. 1.6/1.7).
Chains are written N-terminus → C-terminus.
Peptide bond has partial double-bond character due to resonance (Fig. 1.8):
- makes the peptide unit planar
- creates a permanent dipole (partial charges on O/C/N/H)
Cis/trans isomerism (Fig. 1.9):
- trans usually much more stable than cis (steric clash in cis).
- Exception: bonds involving proline have a much smaller trans preference → ~5% of Xaa–Pro bonds can be cis.
Rotation around peptide bond has a high barrier → proline isomerization can be a slow step in folding, helped by peptidyl–prolyl isomerase.

1.3.2 What you can learn from sequence information 📏🔍

Exact molecular mass
- Sum amino-acid masses (Table 1.1) + H₂O; compare to mass spectrometry for identity/integrity.
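
A minimal sketch of the mass calculation, assuming standard average residue masses (the values below are common reference numbers, not copied from Table 1.1) and a hypothetical `protein_mass` helper. The optional disulfide correction removes two hydrogens per bond.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528     # one water for the free N- and C-termini
HYDROGEN = 1.00794


def protein_mass(sequence, n_disulfides=0):
    """Average mass (Da) of an otherwise unmodified polypeptide chain."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER
    # each disulfide bond forms with loss of two hydrogen atoms
    return total - n_disulfides * 2 * HYDROGEN


print(round(protein_mass("G"), 2))        # free glycine
print(round(protein_mass("ACDEFGHIK"), 2))
```

Comparing such a predicted mass with the measured one is how a missing residue, an extra modification, or a set of disulfides shows up in mass spectrometry.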
Isoelectric point (pI)
- Determined by counts of charged groups (N-terminus, C-terminus, Asp/Glu vs Lys/Arg/His).
- Predicted pI is approximate because pKas shift in proteins, but still useful (e.g., predicting behavior in electrophoresis/ion-exchange).
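
One way to turn "counts of charged groups" into a predicted pI is to sum Henderson–Hasselbalch charges over the sequence and search for the pH where the total is zero. The sketch below does this by bisection; the pKa table is an assumed set of model values (the terminal pKas in particular are illustrative), Cys and Tyr are ignored for simplicity, and the function names are hypothetical.

```python
# Assumed model pKa values; real values shift with the local environment.
PKA_POSITIVE = {"Nterm": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 3.5, "D": 4.0, "E": 4.0}


def net_charge(sequence, ph):
    """Net charge of a chain at a given pH from independent group titrations."""
    seq = sequence.upper()
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "Nterm" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))   # protonated = +1
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "Cterm" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))   # deprotonated = -1
    return charge


def predicted_pi(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection for the pH of zero net charge (charge falls as pH rises)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(predicted_pi("MKKRLEDAHK"), 2))
```

A sequence rich in Lys/Arg gives a high predicted pI and one rich in Asp/Glu a low one, which is the information needed to choose an ion-exchange resin or buffer pH.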
Absorption coefficient at 280 nm (extinction coefficient)
- Mostly from Tyr and Trp (plus small disulfide contribution).
- Used to estimate protein concentration.
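
As an illustration, the 280 nm coefficient can be estimated from residue counts and then used in the Beer–Lambert law. The per-residue values below (5500 for Trp, 1490 for Tyr, 125 per disulfide, in M⁻¹ cm⁻¹) are widely used empirical numbers and are an assumption here, not values taken from the chapter; the function names are hypothetical.

```python
def epsilon_280(sequence, n_disulfides=0):
    """Estimated molar absorption coefficient at 280 nm (M^-1 cm^-1)."""
    seq = sequence.upper()
    return (5500 * seq.count("W")      # tryptophan dominates
            + 1490 * seq.count("Y")    # tyrosine
            + 125 * n_disulfides)      # small cystine contribution


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l, solved for c (mol/L)."""
    return a280 / (epsilon * path_cm)


eps = epsilon_280("MWYYAC", n_disulfides=0)
print(eps, molar_concentration(0.848, eps))
```

A protein with no Trp or Tyr gives an estimate of zero, which is a reminder that A280 is not a usable concentration measure for such proteins.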
Hydrophobicity + aliphatic index
- Overall polar/non-polar balance can hint at solubility or membrane association.
- Aliphatic index tracks Ala/Ile/Leu/Val abundance.
Post-translational modifications (Table 1.3)
Key PTMs and functional consequences:
- Proteolysis: targeting sequence removal, hormone generation, enzyme activation
- Disulfides: stabilize secreted proteins
- Hydroxylation: stabilizes collagen triple helix
- Glycosylation: cell recognition, anchoring via GPI, increased polarity
- Phosphorylation (Ser/Thr/Tyr): regulation/signaling
- Acylation: N-terminal myristoylation, or palmitoylation (typically on Cys side chains) → membrane association
Mass spectrometry is highlighted as a way to detect PTMs by mass differences (example: lysozyme disulfides causing a predictable mass shift).
Structural and functional motifs 🧷
Sequence patterns can predict:
- Transmembrane helices via hydropathy plots (threshold example: +84 kJ/mol for a 20-aa stretch).
- Targeting/retention sequences (peroxisome, ER retention signals).
- Metal-binding motifs (example consensus for 2Fe–2S clusters).
- Kinase phosphorylation consensus sequences.
- N-glycosylation motif: Asn–X–Ser/Thr (X rarely Pro or Asp).
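
A hydropathy plot is a sliding-window average of per-residue hydrophobicity. The sketch below swaps in the Kyte–Doolittle scale (dimensionless values, not the free-energy scale of Table 1.2), so the +84 kJ/mol threshold does not apply; the 19-residue window and the cut-off of 1.6 are conventional heuristics for that scale and are assumptions here, as are the function names.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of the given length along the chain."""
    scores = [KD[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start indices (0-based) of windows hydrophobic enough to flag."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


# Toy sequence: polar flanks around a 25-residue hydrophobic stretch
toy = "DK" * 10 + "L" * 25 + "DK" * 10
print(candidate_tm_windows(toy))
```

Flagged windows are only candidates: a buried hydrophobic strand or helix in a soluble protein can score highly too, so predictions are normally checked against other evidence.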
Sequence relationships 🌳
- Comparing homologs across species reveals conserved residues (often functional/structural essentials).
- Gene duplication + divergence explains related protein families.
- Domain architecture can be inferred (example: steroid hormone receptors with distinct domains).

1.4 Secondary structure of proteins 🌀📄

Because the peptide bond is planar, chain flexibility mainly comes from rotation around the bonds at Cα, defined by dihedral angles:

φ (phi): N–Cα bond rotation
ψ (psi): Cα–C bond rotation
Ramachandran plot shows allowed combinations due to steric constraints (Fig. 1.12).
Glycine has more allowed space (tiny side chain); bulky residues have less.
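
For concreteness, φ is the dihedral defined by the atoms C(i−1)–N–Cα–C and ψ by N–Cα–C–N(i+1). The sketch below computes a signed dihedral from four 3D coordinates using only the standard library; the helper names and the toy coordinates are made up for illustration and are not taken from any real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    0 = cis (eclipsed), +/-180 = trans; positive when the front bond must
    turn clockwise to eclipse the rear bond, viewed along p1 -> p2.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)            # normal to the plane p0, p1, p2
    n2 = _cross(b2, b3)            # normal to the plane p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometry: four points giving a +90 degree torsion
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying the same function to successive backbone atoms of every residue yields the (φ, ψ) pairs that are plotted on a Ramachandran diagram.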

1.4.1 α-helix 🌀

Right-handed helix for L-amino acids.
Hydrogen bonds: carbonyl of residue n bonds to N–H of residue n+4 (shown in Fig. 1.13).
Ideal geometry:
- 3.6 residues/turn
- pitch 0.54 nm
- φ ≈ −57°, ψ ≈ −47°
Has a helix dipole (N-terminus +, C-terminus −) that can influence interactions.
Proline disrupts helices (can break/kink) because of its rigid ring geometry.
Helices in proteins are often short (~12 residues) and slightly distorted from ideal.
Helical wheel projection (Fig. 1.14) helps visualize amphipathicity:
- One face polar, one face non-polar → helix can sit at membranes.
- A ~21–22 aa helix spans a lipid bilayer thickness (~3.2 nm).
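
The ideal-helix numbers above are enough to work out the rise per residue, the angular step used in a helical wheel, and how many residues are needed to cross a bilayer. A small sketch, with hypothetical helper names and a made-up 9-residue peptide:

```python
import math

RESIDUES_PER_TURN = 3.6
PITCH_NM = 0.54
RISE_NM = PITCH_NM / RESIDUES_PER_TURN     # 0.15 nm along the axis per residue
ANGLE_DEG = 360.0 / RESIDUES_PER_TURN      # 100 degrees around the axis per residue


def helix_length_nm(n_residues):
    """Axial length of an ideal alpha-helix of n residues."""
    return n_residues * RISE_NM


def residues_to_span(thickness_nm):
    """Smallest whole number of residues whose helix is at least this long."""
    return math.ceil(thickness_nm / RISE_NM)


def wheel_angles(sequence):
    """(residue, angle in degrees) positions for a helical wheel projection."""
    return [(aa, (i * ANGLE_DEG) % 360.0) for i, aa in enumerate(sequence)]


print(residues_to_span(3.2))          # bilayer thickness quoted above
for aa, angle in wheel_angles("LKELLKKLL"):   # made-up peptide
    print(aa, round(angle))
```

Because the step is 100°, residues i, i+3/i+4 and i+7 land on roughly the same face, which is why hydrophobic residues spaced that way produce an amphipathic helix.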

Other helices:

3₁₀ helix: tighter; H-bond n → n+3; often near helix ends.
π helix: looser; H-bond n → n+5; rare.

1.4.2 β-sheets 📄

Built from extended β-strands; side chains alternate above/below sheet.
Two arrangements:
- Parallel
- Anti-parallel (more common; needs fewer strands; H-bonds less distorted) (Fig. 1.15)
Strand length: ~6 residues typical in proteins.
β-sheets have a right-handed twist (electrostatics between carbonyls contributes).
Membrane proteins can also be β-structured (porins example: 16 β-strands forming a pore).

1.4.3 Other structural features: β-turns + polyproline 🔁

β-turns allow compact globular folding (chain reverses ~180°).
Many types; common ones are 4 residues long (Fig. 1.16).
Proline often found in turns or at helix/sheet ends.
Polyproline structures:
- Poly-Pro I: all cis; right-handed helix
- Poly-Pro II: all trans; extended left-handed helix
- Poly-Pro II segments show up in real proteins more than you might expect.

1.4.4 Amino acid preferences for secondary structure 📊

From real protein structures and model polymers:

Helix-friendly: Met, Glu, Leu, Ala
Helix-unfriendly: Pro, Gly, Tyr
Sheet-friendly: Val, Ile, Phe
Sheet-unfriendly: Pro, Asp
Turn-friendly: Pro, Gly, Asp
Turn-unfriendly: Met, Val, Ile

Secondary structure prediction accuracy is useful but limited (~70%); helices are easiest to predict.

1.5 Tertiary structure of proteins 🏗️

Tertiary structure = long-range folding of one polypeptide chain.

1.5.1 General principles (common themes across many proteins)

Close packing: interiors are tightly packed with few cavities; cavities often contain water.
Secondary structure elements persist, but with distortions (e.g., α-helices may shift toward 3₁₀-like geometry).
Side-chain distribution:
- Hydrophobic residues mostly buried
- Polar residues mostly surface-exposed
- Exception: membrane-spanning regions invert this logic.
Internal polar groups must be satisfied:
- hydrogen bonds to other groups or internal waters.
Domains:
- Large proteins often consist of domains (~100–150 aa), independently folded modules linked to functions (binding/catalysis).
- Domains can sometimes be isolated by proteolysis or expressing part of the gene.

1.5.2 Classification of protein structures 🗂️

Major structural classes:

all-α
all-β
αβ (sometimes split into α/β vs α+β depending on mixing pattern along the chain)

SCOP database is highlighted as a classification tool:

folds (overall arrangement)
superfamilies (probable relationships)
families (strong evidence; often >30% sequence identity)

Estimates and big idea:

~1000 folds exist.
A small set of “superfolds” account for a large chunk of structures (>30%).
Examples shown (Fig. 1.17):
- TIM barrel
- α/β doubly wound
- Greek key

1.5.3 What stabilizes tertiary structure?

Mostly weak non-covalent forces, discussed in 1.7 (disulfides are a special covalent stabilizer in secreted proteins).

1.6 Quaternary structure of proteins 🤝

Quaternary structure = assembly of multiple polypeptide chains (subunits).

General tendencies:

< ~30 kDa: often monomeric
> ~50 kDa: often oligomeric (rule-of-thumb, not absolute)

Types:

Homooligomer: same subunit repeated (e.g., homotetramer α₄)
Heterooligomer: different subunits (e.g., αβγ)

Why oligomerize?

Regulation/communication between subunits (classic example: cooperative O₂ binding).
Increased stability (interfaces stabilize folding).
New structures with cavities:
- chaperones (folding chambers)
- proteasomes (degradation chambers)
Some assemblies can include disulfide links (e.g., some immunoglobulins).

1.7 Forces contributing to protein structure and interactions ⚖️✨

Proteins are covalently linked by peptide bonds, but folding and binding are dominated by weak, non-covalent forces.

1.7.1 Ionic (electrostatic) interactions ➕➖

Between opposite charges (Asp/Glu vs Lys/Arg/His).
Energy depends on distance r and dielectric constant D:
- E = q₁q₂ / (D r)
Water has high D (~78.5) → surface ionic interactions are often weak (~5 kJ/mol).
Buried ionic interactions in low-D environments can be much stronger (up to ~20 kJ/mol).
Protein interiors have variable effective D depending on water penetration (values discussed).
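
To get a feel for the numbers, Coulomb's law can be evaluated in practical units. This sketch assumes charges in units of the elementary charge, distance in nm, and the conversion factor 138.935 kJ mol⁻¹ nm (the Coulomb constant expressed per mole); the 0.3 nm separation and the intermediate dielectric values are illustrative choices, not values from the chapter.

```python
COULOMB_KJ_NM = 138.935  # e^2 / (4*pi*eps0), in kJ mol^-1 nm


def coulomb_energy(q1, q2, r_nm, dielectric):
    """Interaction energy (kJ/mol) of two point charges in a uniform dielectric."""
    return COULOMB_KJ_NM * q1 * q2 / (dielectric * r_nm)


R_NM = 0.3  # assumed separation of a salt bridge
for d in (78.5, 20.0, 4.0):   # bulk water, then two assumed interior values
    energy = coulomb_energy(+1, -1, R_NM, d)
    print(f"D = {d:5.1f}  E = {energy:7.1f} kJ/mol")
```

In bulk water the result is about −6 kJ/mol, in line with the weak surface interactions noted above, and the magnitude grows in proportion as D falls. The very large value at D = 4 overstates real buried salt bridges, because burying a charge also costs desolvation energy that this simple model ignores.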

1.7.2 Hydrogen bonds 🧷

Interaction of H attached to electronegative atom (donor) with another electronegative atom (acceptor).
Typical strength ~5–10 kJ/mol.
Key feature: directionality/geometry → provides specificity.
Ideal geometry: A–H–B nearly linear; A–B distance ~0.30 ± 0.05 nm.
Important note: H-bonds can be “competed away” by water—so net stabilization depends on context.

1.7.3 van der Waals interactions 🌫️

Includes dipole–dipole, dipole–induced dipole, induced–induced dipole.
Individually weak (~5 kJ/mol), but numerous → collectively significant.
Reflects a balance of attraction at longer range and repulsion at short range.

1.7.4 Hydrophobic interactions (hydrophobic effect) 🛢️➡️💧

Water forms ordered cages around non-polar groups → entropy penalty.
Non-polar groups tend to cluster to reduce exposed surface → drives folding.
These interactions contribute the largest energy component stabilizing folded proteins in water.
Temperature effect: the hydrophobic effect weakens at low temperature → explains cold denaturation in some proteins.
Agents affecting it:
- Urea / guanidinium chloride weaken hydrophobic interactions (disrupt water H-bond network).
- Some salts strengthen hydrophobic interactions (Hofmeister series listed for ions).

1.7.5 Balance of energy contributions 🎭

Protein stability is a delicate thermodynamic balance:

Folding:
- big unfavorable entropy decrease (chain becomes ordered)
- big favorable enthalpy decrease (many internal interactions form)
Net stability is surprisingly small: typically 20–60 kJ/mol, even though the underlying terms are each hundreds of kJ/mol.
Proteins can be unfolded by:
- denaturants (urea, GdmCl)
- extreme pH (changes ionization)
- temperature shifts (changes interaction balance)
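
The cancellation described above can be made concrete with ΔG = ΔH − TΔS and the two-state relation between stability and the fraction of unfolded molecules. The enthalpy and entropy values in the example are hypothetical round numbers chosen only to land inside the 20–60 kJ/mol range; the function names are also assumptions.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def delta_g(delta_h, delta_s, temp_k):
    """Gibbs free energy change (kJ/mol) from enthalpy and entropy terms."""
    return delta_h - temp_k * delta_s


def fraction_unfolded(stability_kj, temp_k=298.15):
    """Two-state model; stability_kj is the free energy of unfolding."""
    return 1.0 / (1.0 + math.exp(stability_kj / (R * temp_k)))


# Hypothetical folding terms: large favorable dH, large unfavorable -T*dS
dg_fold = delta_g(delta_h=-500.0, delta_s=-1.55, temp_k=298.15)
print(round(dg_fold, 1), "kJ/mol for folding")
print(fraction_unfolded(-dg_fold))
```

Because the net value is a small difference between large terms, a modest change in either term (a few hydrogen bonds, a shift in temperature) is enough to tip the balance toward the unfolded state.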

1.7.6 Range of binding strengths (Kd) 🎯

Different biological functions require different binding tightness. A table gives typical Kd ranges and corresponding free energies (examples):

Very tight: avidin–biotin (~10⁻¹⁵ M)
Protein–protein (~10⁻¹⁰ M)
Antibody–antigen (~10⁻⁹ M)
Receptor–hormone (~10⁻⁷ M)
Enzyme–substrate (~10⁻⁵ M)

Important interpretation:

Binding must be strong enough for the biological task, but no tighter than the function allows.
Enzymes especially need interactions that allow turnover, so substrate/product binding can’t be irreversibly tight.
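
The link between a dissociation constant and a binding free energy is ΔG° = RT ln Kd (relative to a 1 M standard state). The sketch below applies it to the order-of-magnitude Kd values listed above, assuming 298.15 K; the function name is hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard free energy of association (kJ/mol) from Kd in mol/L."""
    return R * temp_k * math.log(kd_molar)


examples = [("avidin-biotin", 1e-15), ("protein-protein", 1e-10),
            ("antibody-antigen", 1e-9), ("receptor-hormone", 1e-7),
            ("enzyme-substrate", 1e-5)]

for name, kd in examples:
    print(f"{name:18s} Kd = {kd:.0e} M  dG = {binding_free_energy(kd):6.1f} kJ/mol")
```

Each tenfold change in Kd is worth about 5.7 kJ/mol at this temperature, roughly the energy of a single hydrogen bond or surface ionic interaction, so small changes at an interface can shift affinity substantially.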

1.8 Compendium of chemical structures 🧠🧾

This section is a “learn these structures” toolkit. The point is: if you recognize structure, you can predict behavior.

You’re encouraged to identify from structures:

polar vs non-polar regions
likely hydrogen bond donors/acceptors
ionic groups and pH-dependent charge states
nucleophiles, electrophiles, and metal-binding sites

What’s included (high level):

Amino acids (full side-chain structures) + general amino acid structure
Peptide bond structure
Bases and nucleotides: adenine, cytosine, guanine, thymine, uracil; ribose and 2-deoxyribose; AMP/ADP/ATP; cyclic AMP; phosphodiester linkage
Glycolysis intermediates: glucose, glucose-6-phosphate, fructose phosphates, triose phosphates, phosphoglycerates, PEP, pyruvate, lactate
Common reagents (lab relevance): EDTA, acrylamide, bis-acrylamide, SDS, Tris, urea, guanidinium chloride, DTT, 2-mercaptoethanol
Redox cofactors: FAD/FADH₂, NAD(P)⁺/NAD(P)H (with phosphate difference between NAD and NADP)
TCA cycle intermediates: citrate, isocitrate, α-ketoglutarate, succinate, fumarate, malate, oxaloacetate

Practical benefit: when you later see reagents like SDS, DTT, urea, GdmCl, you’ll immediately connect them to why they denature proteins, reduce disulfides, or enable electrophoresis.

Quiz

Q0. Which level of protein structure refers specifically to the amino acid sequence, including post-translational modifications?

Secondary structure

Primary structure

Tertiary structure

Quaternary structure

Q1. What property of the peptide bond makes it largely planar?

Hydrogen bonding

Steric hindrance

Partial double-bond character

Electrostatic attraction

Q2. Which amino acid is achiral and, because it lacks a side chain, has greater conformational flexibility in proteins?

Alanine

Glycine

Valine

Proline

Q3. Which interaction is the dominant driving force for protein folding in aqueous environments?

Ionic interactions

Hydrogen bonds

van der Waals forces

Hydrophobic effect

Q4. What is the approximate number of residues per turn in an ideal α-helix?

2.0

3.0

3.6

4.2

Q5. Which amino acid most commonly disrupts α-helical structure due to its rigid cyclic side chain?

Glycine

Alanine

Proline

Leucine

Q6. Why are anti-parallel β-sheets generally more stable than parallel β-sheets?

They contain more hydrophobic residues

Their hydrogen bonds are more linear

They have fewer strands

They are always buried in protein cores

Q7. What does the isoelectric point (pI) of a protein represent?

The pH at which the protein is most soluble

The pH at which the protein has no net charge

The pH at which the protein denatures

The pH at which all side chains are neutral

Q8. Which amino acid side chain has a pKa close to physiological pH, making it ideal for acid–base catalysis?

Aspartate

Glutamate

Histidine

Lysine

Q9. What structural feature distinguishes quaternary structure from tertiary structure?

Presence of α-helices

Overall 3D fold

Association of multiple polypeptide chains

Hydrophobic core formation

Q10. Which non-covalent interaction is most sensitive to the dielectric constant of its environment?

Hydrogen bonding

Hydrophobic interactions

van der Waals forces

Ionic interactions

Q11. Why can internal hydrogen bonds in proteins contribute to stability despite water competing for hydrogen bonding?

They are always stronger than water hydrogen bonds

They reduce unsatisfied polar groups in the protein interior

They increase protein entropy

They eliminate electrostatic interactions

Q12. Which post-translational modification commonly stabilizes extracellular proteins?

Phosphorylation

Acetylation

Disulfide bond formation

Methylation

Q13. What information can hydropathy plots provide about a protein sequence?

Exact folding pathway

Locations of catalytic residues

Potential transmembrane segments

Subunit stoichiometry

Q14. Why is the overall free energy change for protein folding relatively small compared to individual interaction energies?

Proteins are unstable molecules

Enthalpy and entropy changes largely cancel

Only covalent bonds contribute significantly

Folding occurs without energy barriers

Q15. All amino acids found in proteins are chiral.

True

False

Q16. The peptide bond has free rotation similar to a single bond.

True

False

Q17. Hydrophobic amino acid side chains are typically found buried in the interior of globular proteins.

True

False

Q18. Anti-parallel β-sheets are less common in proteins than parallel β-sheets.

True

False

Q19. Proline residues can exist in both cis and trans peptide bond conformations in proteins.

True

False

Q20. The hydrophobic effect is primarily enthalpy-driven.

True

False

Q21. Hydrogen bonds in proteins are strongest when donor, hydrogen, and acceptor atoms are arranged linearly.

True

False

Q22. All proteins larger than 50 kDa must have quaternary structure.

True

False

Q23. Histidine is frequently found in enzyme active sites due to its ability to donate and accept protons.

True

False

Q24. van der Waals interactions are individually weak but collectively significant in protein stability.

True

False

Q25. Disulfide bonds commonly form in the reducing environment of the cytosol.

True

False

Q26. Protein folding usually results in a large negative change in entropy for the polypeptide chain.

True

False

Q27. Hydrogen bonds alone are sufficient to drive protein folding in water.

True

False

Q28. The extinction coefficient at 280 nm mainly depends on the presence of tryptophan and tyrosine residues.

True

False

Q29. Domains within a protein often correspond to distinct functional or structural units.

True

False