PURPOSE: The principal research interest of the laboratory is protein folding -- how amino acid sequence information encodes three-dimensional structure. A combined experimental and computational approach is being taken to this longstanding puzzle of fundamental biochemistry. Several small proteins are being used as simple systems for characterizing the structure that persists in the unfolded (denatured) state, the starting point for folding both in the cell and in the test tube. In addition, the laboratory is working to predict protein structure from sequence in ways that make the underlying physical chemistry transparent and the relative contributions of different interactions quantifiable.
After more than 15 years of characterization of the denatured state of staphylococcal nuclease, we now know that it retains a native-like topology; i.e., its ensemble-averaged structure at low resolution resembles a swollen form of the native structure. Application of a new NMR parameter, the residual dipolar coupling (RDC), to nuclease and eglin C suggests that most and perhaps all proteins unfold to a dynamic state that, on average, retains a spatial positioning of residues very similar to that observed in the folded state. In collaboration with the laboratory of Dr. Joel Tolman in the Chemistry Department, we are working to define this persistent long-range structure more precisely in denatured nuclease, eglin C, and the repeat protein N-ankyrin (5 domains).
To predict protein structure, new conformations are generated to fit a protein sequence using a hierarchical strategy: it begins with fragments of 5 to 8 residues, builds intermediate fragments of 30 to 50 residues, and then assembles these into a full-length polypeptide chain. Compact conformations are then selected that are free of overlaps and that place hydrophobic residues on the interior and charged residues on the outside, after which all side-chain atoms are added. In the final step, a simple genetic algorithm efficiently searches the compact conformations for the one with the lowest atomic energy and lowest solvation energy.
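The final search step can be illustrated with a minimal genetic algorithm sketch. This is not the laboratory's implementation: the conformation encoding (a list of discrete backbone states), the function names, the parameter values, and above all `toy_energy` are hypothetical stand-ins chosen only to make the example runnable. A real search would score conformations with atomic and solvation energy terms.

```python
import random

def toy_energy(conf):
    # Hypothetical stand-in for atomic + solvation energy (lower is better).
    # It simply penalizes deviation from an arbitrary target state of 2.
    return sum((state - 2) ** 2 for state in conf)

def crossover(a, b, rng):
    # One-point crossover: prefix from parent a, suffix from parent b.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(conf, n_states, rate, rng):
    # Each position is replaced by a random state with probability `rate`.
    return [rng.randrange(n_states) if rng.random() < rate else s for s in conf]

def genetic_search(length=20, n_states=5, pop_size=40,
                   generations=60, rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(n_states) for _ in range(length)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=toy_energy)
        best_history.append(toy_energy(pop[0]))
        # Elitism: the better half survives unchanged and acts as parents.
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), n_states, rate, rng))
        pop = parents + children
    pop.sort(key=toy_energy)
    best_history.append(toy_energy(pop[0]))
    return pop[0], best_history

if __name__ == "__main__":
    best, history = genetic_search()
    print("lowest toy energy found:", history[-1])
```

Because the better half of each generation is carried over unchanged, the best energy in the population can never get worse from one generation to the next. This is one simple design choice among many for such a search.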
A rather long list of statistical potentials is used at different steps to optimize all of the interactions that appear optimized in high-resolution crystal structures. To assess our progress, every two years the laboratory enters the CASP international competition (http://predictioncenter.org/casp7/Casp7.html) to predict protein structure blindly. In addition to ab initio or new fold prediction, we are working on improving the energy functions and sampling methods used to refine homology models. The CASP7 prediction season closed in August 2006, and the results will become available in November. The laboratory entered models for approximately 35 target proteins considered to be “new folds” and 50 target proteins that have homologues of known structure, all constructed using our cluster of 60 computer processors.
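As a rough illustration of what a statistical potential is, the sketch below derives a residue contact score from observed counts using the common log-odds (inverse Boltzmann) form, score = -ln(observed / expected). The text above does not specify the functional form of the laboratory's potentials, so this form, the function name `contact_potential`, the pseudocount, and the toy contact list are all assumptions made for the example.

```python
import math
from collections import Counter

def contact_potential(contacts, pseudocount=1.0):
    """Log-odds score for each unordered pair of residue types.

    contacts: list of (type_a, type_b) tuples, one per observed contact.
    Negative scores mean the pair occurs more often than expected by chance.
    """
    pair_counts = Counter(tuple(sorted(p)) for p in contacts)
    type_counts = Counter(t for p in contacts for t in p)
    total_pairs = len(contacts)
    total_types = 2 * total_pairs
    types = sorted(type_counts)
    potential = {}
    for i, a in enumerate(types):
        for b in types[i:]:
            fa = type_counts[a] / total_types
            fb = type_counts[b] / total_types
            # Unlike pairs can form in two orders, hence the factor of 2.
            expected = total_pairs * fa * fb * (1 if a == b else 2)
            observed = pair_counts[(a, b)]
            potential[(a, b)] = -math.log(
                (observed + pseudocount) / (expected + pseudocount)
            )
    return potential

if __name__ == "__main__":
    # Hypothetical toy data: leucine (L) and lysine (K) contacts.
    toy = [("L", "L")] * 8 + [("K", "K")] * 8 + [("K", "L")] * 4
    for pair, score in contact_potential(toy).items():
        print(pair, round(score, 3))
```

In this toy data set the like pairs are over-represented and receive favorable (negative) scores, while the mixed pair is under-represented and receives an unfavorable (positive) score.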
Selected Publications
Fang, Q. and Shortle, D. (2006) Protein refolding in silico with atom-based statistical potentials and conformational search using a simple genetic algorithm. J. Mol. Biol. 359: 1456-1467.
Gebel, E., Ruan, K., Tolman, J.R. and Shortle, D. (2006) Multiple alignment tensors from a denatured protein. J. Am. Chem. Soc. 128: 9310-9311.
Fang, Q. and Shortle, D. (2005) Enhanced sampling near the native conformation using statistical potentials for local side-chain and backbone interactions. Proteins 60: 97-102.
Fang, Q. and Shortle, D. (2005) A consistent set of statistical potentials for quantifying local side-chain and backbone interactions. Proteins 60: 90-96.
Ohnishi, S., Lee, A.L., Edgell, M.H. and Shortle, D. (2004) Direct demonstration of structural similarity between native and denatured eglin C. Biochemistry 43: 4064-4070.