Minimalist representations and the importance of nearest neighbor effects in protein folding simulations

To investigate the level of representation required to simulate folding and predict structure, we test the ability of a variety of reduced representations to identify native states in decoy libraries and to recover the native structure given advance knowledge of the native, very broad Ramachandran basin assignments. Simplifications include the removal of the entire side-chain or the retention of only the Cβ atoms. Scoring functions are derived from an all-atom statistical potential that distinguishes between atoms and different residue types. Structures are obtained by minimizing the scoring function with a computationally rapid simulated annealing algorithm. Results are compared for simulations in which backbone conformations are sampled from a Protein Data Bank-based backbone rotamer library generated by either ignoring or including a dependence on the identity and conformation of the neighboring residues. Only when the Cβ atoms and nearest neighbor effects are included do the lowest energy structures generally fall within 4 Å backbone root-mean-square deviation (RMSD) of the native structure, despite the initial configuration being highly expanded with an average RMSD ≥ 10 Å. The side-chains are reinserted into the Cβ models with minimal steric clash. Therefore, the detailed, all-atom information lost in descending to a Cβ-level representation is recaptured in large measure by using backbone dihedral angle sampling that includes nearest neighbor effects together with an appropriate scoring function.
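
The general shape of such a search, simulated annealing over backbone dihedral angles drawn from a discrete library, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `anneal`, `toy_score`, and `angular_diff`, the geometric cooling schedule, the Metropolis acceptance rule, and the quadratic toy score are all assumptions made for illustration. A real run would replace `toy_score` with a statistical potential and `library` with PDB-derived (φ, ψ) choices, optionally conditioned on neighboring residues.

```python
import math
import random


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def toy_score(state, target):
    """Hypothetical stand-in for a scoring function (NOT a statistical potential):
    scaled sum of squared angular distances from a target set of (phi, psi) pairs."""
    total = 0.0
    for (phi, psi), (t_phi, t_psi) in zip(state, target):
        total += angular_diff(phi, t_phi) ** 2 + angular_diff(psi, t_psi) ** 2
    return total / 1000.0


def anneal(library, score, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Minimize score(state) where state[i] is chosen from library[i].

    library: one list of allowed (phi, psi) pairs per residue (assumed layout).
    Uses Metropolis acceptance with a geometric cooling schedule (an assumption).
    Returns the lowest-scoring state visited and its score.
    """
    rng = random.Random(seed)
    state = [rng.choice(options) for options in library]
    energy = score(state)
    best_state, best_energy = list(state), energy
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(state))
        old = state[i]
        state[i] = rng.choice(library[i])  # propose a new dihedral pair
        new_energy = score(state)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            energy = new_energy  # accept the move
            if energy < best_energy:
                best_state, best_energy = list(state), energy
        else:
            state[i] = old  # reject the move and restore the previous pair
    return best_state, best_energy


if __name__ == "__main__":
    # Hypothetical three-option library for a 10-residue chain.
    options = [(-60.0, -45.0), (-120.0, 130.0), (60.0, 45.0)]
    library = [list(options) for _ in range(10)]
    target = [(-60.0, -45.0)] * 10
    state, energy = anneal(library, lambda s: toy_score(s, target))
    print(energy, state)
```

In this sketch, making the library neighbor-dependent would amount to building `library[i]` from the identity and conformation of residues `i - 1` and `i + 1` rather than using one shared list of options.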

DOI: 10.1016/j.jmb.2006.08.035