Today, we usually determine protein sequence from the DNA sequence of its gene.
But this does not reveal post-translational modifications (PTMs).
When DNA info is missing or incomplete → we must determine the sequence experimentally.
Large proteins are too big to sequence directly. So we cut them into smaller peptides with proteases and sequence each peptide.
Each protease cuts at specific residues:
| Enzyme | Cleaves |
|---|---|
| Trypsin | After Lys (K) & Arg (R) |
| Chymotrypsin | After Phe (F), Trp (W), Tyr (Y) |
| V8 protease | After Asp (D), Glu (E) |
| Asp-N | Before Asp |
| Thermolysin | Before large hydrophobic residues |
Also chemical cleavage:
These different cleavage patterns are crucial for determining sequence order.
You asked:
Edman only sequences few AA because labeling is not 100% degraded?
Yes, correct, and here is the full explanation:
Each cycle is ~99% efficient (example).
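Cycle 1: ~99% of the chains are still in step.
Cycle 2: ~98% (0.99 × 0.99).
Cycle 3: ~97% (0.99³).
After many rounds: the in-step fraction keeps falling as 0.99ⁿ, while signal from lagging chains builds up as background.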
So we can only reliably sequence ~10 residues.
This is cumulative inefficiency.
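The compounding can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the same example efficiency of 99% per cycle; the function name `in_step_fraction` is only illustrative.

```python
def in_step_fraction(efficiency: float, cycles: int) -> float:
    """Fraction of chains still in step after `cycles` Edman rounds.

    Assumes every cycle has the same efficiency (a simplification).
    """
    return efficiency ** cycles


# Print how the usable signal shrinks as the number of cycles grows.
for n in (1, 2, 3, 10, 30, 50):
    print(f"cycle {n:2d}: {in_step_fraction(0.99, n):.1%} still in step")
```

Changing `0.99` to a lower value shows how quickly a less efficient cycle shortens the readable stretch.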
You asked:
To know the order, do we use trypsin & chymotrypsin?
Yes, this is the essential logic.
If we digest only with trypsin:
Example fragments:
A-B-C-K
D-E-F-R
G-H-I-K
We don't know their order in the full protein.
Now digest same protein with chymotrypsin:
Different fragments:
C-K-D
F-R-G
I-K-J
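Now we find overlaps:
C-K-D bridges A-B-C-K and D-E-F-R.
F-R-G bridges D-E-F-R and G-H-I-K.
So the order is A-B-C-K, then D-E-F-R, then G-H-I-K.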
This is called overlapping peptide mapping.
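The overlap search can be sketched in code. This is a simplified illustration, not a real assembly tool: fragments are plain strings of the placeholder letters from the example, and `bridges` and `order_fragments` are hypothetical helper names.

```python
def bridges(left: str, right: str, bridge: str) -> bool:
    """True if `bridge` spans the junction between `left` and `right`."""
    for i in range(1, len(bridge)):
        # Split the bridging peptide into a tail of `left` and a head of `right`.
        if left.endswith(bridge[:i]) and right.startswith(bridge[i:]):
            return True
    return False


def order_fragments(primary: list[str], secondary: list[str]) -> list[str]:
    """Order the `primary` digest using overlaps found in the `secondary` digest."""
    successor = {}
    for a in primary:
        for b in primary:
            if a != b and any(bridges(a, b, c) for c in secondary):
                successor[a] = b
    # The first fragment is the one that follows no other fragment.
    starts = [f for f in primary if f not in successor.values()]
    if len(starts) != 1:
        raise ValueError("overlaps do not define a unique order")
    chain = [starts[0]]
    while chain[-1] in successor and successor[chain[-1]] not in chain:
        chain.append(successor[chain[-1]])
    return chain


tryptic = ["GHIK", "ABCK", "DEFR"]      # order unknown
chymotryptic = ["CKD", "FRG", "IKJ"]    # second digest
print(order_fragments(tryptic, chymotryptic))
```

With only the tryptic list there is nothing to link one fragment to the next, so the function raises an error instead of guessing.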
Without two different digestions → cannot reconstruct the full order.
This applies to BOTH:
Mass spectrometry: the modern method of choice.
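Two soft ionization methods: MALDI and ESI.
MALDI (Matrix-Assisted Laser Desorption/Ionization) usually produces mainly singly charged ions.
ESI (Electrospray Ionization) sprays the protein into charged droplets.
It produces multiple charge states for the same molecule.
Because: protonation in the droplets adds a variable number of protons to each molecule.
So ESI gives a charge distribution spectrum.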
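To see why one protein gives a series of peaks, the m/z of each charge state can be computed as (M + z × proton mass) / z. The sketch below assumes positive-mode protonation only; the 17,000 Da mass is a made-up example value, and the function names are illustrative.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(mass: float, charge: int) -> float:
    """m/z of a molecule of neutral `mass` carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def mass_from_adjacent_peaks(low_mz: float, high_mz: float) -> float:
    """Recover the neutral mass from two neighbouring peaks.

    Assumes `low_mz` carries exactly one more proton than `high_mz`.
    """
    charge = round((low_mz - PROTON) / (high_mz - low_mz))  # charge of high_mz
    return charge * (high_mz - PROTON)


protein_mass = 17000.0  # hypothetical protein, in Da
for z in range(8, 13):
    print(f"z = {z:2d}  m/z = {mz(protein_mass, z):.2f}")

print(mass_from_adjacent_peaks(mz(protein_mass, 11), mz(protein_mass, 10)))
```

Because neighbouring peaks differ by exactly one charge, any adjacent pair is enough to work back to the neutral mass.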
Tandem MS (MS/MS) works in three steps:
MS1: measure the mass of the intact peptide.
Fragmentation: the peptide is fragmented.
MS2: measure the masses of the fragments.
Fragmentation happens at peptide backbone.
Main ions:
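| Fragment Type | Which part of the peptide | Bond broken |
|---|---|---|
| b-ion | N-terminal fragment | Peptide (amide) bond |
| y-ion | C-terminal fragment | Peptide (amide) bond |
| c-ion | N-terminal fragment | A different backbone bond (N–Cα) |
| z+1-ion | C-terminal fragment | A different backbone bond (N–Cα) |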
Most common in CID: b-ions and y-ions.
If peptide is:
A–B–C–D–E
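You get a ladder of fragments, for example the b-ions A, A–B, A–B–C, A–B–C–D and the matching y-ions E, D–E, C–D–E, B–C–D–E.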
Differences between peaks correspond to one amino acid mass.
So the sequence is deduced from the mass difference between consecutive fragments.
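Reading a sequence off a fragment ladder can be sketched as follows. This is an idealized illustration: it assumes a complete, noise-free series of singly charged b-ions and uses only a small subset of standard monoisotopic residue masses; `b_ions` and `read_sequence` are hypothetical names.

```python
PROTON = 1.00728  # mass of a proton in Da

# Monoisotopic residue masses in Da (subset of the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "F": 147.06841,
}


def b_ions(peptide: str) -> list[float]:
    """Singly charged b-ion m/z values for `peptide` (b1, b2, ...)."""
    total = PROTON
    ladder = []
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        ladder.append(total)
    return ladder


def read_sequence(ladder: list[float], tol: float = 0.01) -> str:
    """Turn the gaps between consecutive b-ion peaks back into residues."""
    sequence = []
    previous = PROTON
    for peak in ladder:
        delta = peak - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        sequence.append(matches[0] if len(matches) == 1 else "?")
        previous = peak
    return "".join(sequence)


print(read_sequence(b_ions("AVLKE")))
```

Real spectra have missing peaks, noise, and mixed ion types, so actual software scores many candidate sequences rather than reading one clean ladder.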
You asked:
MS can't determine lysine and isoleucine because masses are same?
Correction:
It is Leucine (L) and Isoleucine (I) that have identical mass.
Both: residue mass ≈ 113.08 Da (they are isomers with the same formula, C6H13NO2).
Mass spectrometry cannot distinguish them by mass alone.
However:
But classical MS → cannot distinguish L and I.
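The identical masses follow directly from the elemental formulas. A short check, using standard monoisotopic atomic masses; lysine is included only for comparison with the question above, and the function name is illustrative.

```python
# Monoisotopic atomic masses in Da.
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of the free amino acids.
FORMULA = {
    "Leu": {"C": 6, "H": 13, "N": 1, "O": 2},
    "Ile": {"C": 6, "H": 13, "N": 1, "O": 2},
    "Lys": {"C": 6, "H": 14, "N": 2, "O": 2},
}


def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum the atomic masses for an elemental composition."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


for name, formula in FORMULA.items():
    print(f"{name}: {monoisotopic_mass(formula):.5f} Da")
```

Leu and Ile have the same atoms arranged differently, so no mass measurement alone can separate them, while Lys has a different formula and a clearly different mass.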
SDS-PAGE: smaller proteins → move faster through the gel.
Used for 1–200 kDa.
Mass spectrometry: the most accurate molecular weight method.
Can detect:
You asked:
Smaller molecules have longer travel time? Why?
Yes, this is correct.
Column contains porous beads.
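Large proteins: cannot enter the pores, so they pass around the beads and elute first.
Small proteins: enter the pores, travel a longer path, and elute later.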
So:
| Size | Elution Time |
|---|---|
| Large | Short |
| Small | Long |
Separation based on hydrodynamic radius.
| Method | Accuracy | Size Range | Notes |
|---|---|---|---|
| SDS-PAGE | Moderate | 1–200 kDa | Very common |
| SEC | Lower accuracy | Up to 10,000 kDa | Good for complexes |
| Ultracentrifugation | Good | Large complexes | Rarely used today |
| Mass Spectrometry | Very high | Depends | Best for exact MW |
Mass spectrometry can detect post-translational modifications (PTMs).
Because: fragment masses shift according to the modification.
DNA sequencing cannot detect PTMs.
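A mass shift can be matched against known modification masses. The sketch below uses three common monoisotopic shifts; the peptide masses in the usage line are made-up example values, and `identify_ptm` is a hypothetical helper.

```python
# Monoisotopic mass added by some common modifications, in Da.
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
}


def identify_ptm(unmodified_mass: float, observed_mass: float,
                 tol: float = 0.01) -> list[str]:
    """Return the modifications whose mass shift matches the observed difference."""
    delta = observed_mass - unmodified_mass
    return [name for name, shift in PTM_SHIFTS.items() if abs(delta - shift) <= tol]


# Hypothetical peptide: expected 1000.0 Da, observed about 80 Da heavier.
print(identify_ptm(1000.0, 1079.96633))
```

Applying the same comparison to individual fragment ions narrows down which residue carries the modification.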
To determine protein sequence experimentally:
- Edman limited due to cumulative inefficiency
- Two proteases needed for overlap mapping
- Tandem MS determines sequence via fragment mass differences
- MS cannot distinguish Leu/Ile by mass
- b, y, c, z+1 ions result from backbone cleavage
- ESI produces multiple charge states due to droplet protonation
- SEC: small molecules take longer because they enter pores