(Fun, detailed, and beginner-friendly walkthrough)
This lecture moves from sequence bioinformatics into the world of structure bioinformatics β and thatβs where things get very interesting. Instead of comparing strings of amino acids, we compare 3D shapes of proteins.
Iβll go through everything step-by-step and explain the logic behind it clearly.
You may hear these terms used interchangeably β but they are not the same.
Importantly:
So:
In PyMOL, two commands are commonly used:
Example from the lecture: Two proteins (CPZ and CC8) both had:
After alignment:
RMSD = Root Mean Square Deviation It measures the average distance between aligned atoms.
Important:
PyMOL did NOT align sequences β it aligned atomic coordinates
This is structural comparison, not sequence comparison.
The lecture emphasizes something crucial:
BUT NOT THE OTHER WAY AROUND.
Why?
DNA variability > Protein sequence variability > Protein structure variability
This is extremely important for understanding why structure classification works.
Two important evolutionary insights:
Rare but possible
Example: serine proteases Different folds, same catalytic activity Likely convergent evolution
This tells us: Structure and function are often linked β but not guaranteed.
Because:
Two major databases attempt this:
CATH stands for:
Four main classes:
Describes overall arrangement of secondary structures.
Examples in Ξ±/Ξ² class:
Architecture = overall 3D arrangement (Not sequence order yet)
Topology = π The path the polypeptide chain takes through the structure
It depends on:
Two proteins can:
Because the order in sequence differs
Topology β spatial arrangement only Topology = connectivity pattern
Groups proteins with:
SCOP = Structural Classification of Proteins
More complex hierarchy than CATH.
Levels include:
Example: In class Ξ± and Ξ²:
Difference between Ξ±/Ξ² and Ξ±+Ξ² is not always obvious
Important: Both CATH and SCOP classify domains, not entire proteins
A structural and functional unit within a protein.
Even more confusing:
This makes domain definition:
CATH:
SCOP:
That introduces operator bias.
Suppose you:
How do you know if similar structures exist?
π Use the DALI server
Process:
Example: Uploaded small copper-binding protein (COPSET)
Returned hits:
Conclusion: Proteins with similar structure often share similar function
You can lose sequence similarity and still retain fold.
Atomic coordinates are compared.
It is partly subjective and complex.
It finds structural neighbors.