Replica-exchange molecular dynamics (REMD) and their variants have been widely used in simulations of the biomolecular structure and dynamics. Replica exchange with solute tempering (REST) is one of the methods where temperature of a pre-defined solute molecule is exchanged between replicas, while solvent temperatures in all the replicas are kept constant. REST greatly reduces the number of replicas compared to the temperature REMD, while replicas at low temperatures are often trapped under their conditions, interfering with the conformational sampling. Here, we introduce a new scheme of REST, referred to as generalized REST (gREST), where the solute region is defined as a part of a molecule or a part of the potential energy terms, such as the dihedral-angle energy term or Lennard-Jones energy term. We applied this new method to folding simulations of a β-hairpin (16 residues) and a Trp-cage (20 residues) in explicit water. The protein dihedral-angle energy term is chosen as the solute region in the simulations. gREST reduces the number of replicas necessary for good random walks in the solute-temperature space and covers a wider conformational space compared to the conventional REST2. Considering the general applicability, gREST should become a promising tool for the simulations of protein folding, conformational dynamics, and an in silico drug design.
Single-molecule experiments and molecular dynamics (MD) simulations are indispensable tools for investigating protein conformational dynamics. The former provide timeseries data, such as donor-acceptor distances, whereas the latter give atomistic information, although this information is often biased by model parameters. Here, we devise a machine-learning method to combine the complementary information from the two approaches and construct a consistent model of conformational dynamics. It is applied to the folding dynamics of the formin-binding protein WW domain. MD simulations over 400 μs led to an initial Markov state model (MSM), which was then "refined" using single-molecule Förster resonance energy transfer (FRET) data through hidden Markov modeling. The refined or data-assimilated MSM reproduces the FRET data and features hairpin one in the transition-state ensemble, consistent with mutation experiments. The folding pathway in the data-assimilated MSM suggests interplay between hydrophobic contacts and turn formation. Our method provides a general framework for investigating conformational transitions in other proteins.
A data-driven modeling scheme is proposed for conformational dynamics of biomolecules based on molecular dynamics (MD) simulations and experimental measurements. In this scheme, an initial Markov State Model (MSM) is constructed from MD simulation trajectories, and then, the MSM parameters are refined using experimental measurements through machine learning techniques. The second step can reduce the bias of MD simulation results due to inaccurate force-field parameters. Either time-series trajectories or ensemble-averaged data are available as a training data set in the scheme. Using a coarse-grained model of a dye-labeled polyproline-20, we compare the performance of machine learning estimations from the two types of training data sets. Machine learning from time-series data could provide the equilibrium populations of conformational states as well as their transition probabilities. It estimates hidden conformational states in more robustways compared to that from ensemble-averaged data although there are limitations in estimating the transition probabilities between minor states. We discuss how to use the machine learning scheme for various experimental measurements including single-molecule time-series trajectories.
In molecular dynamics (MD) simulations, a proper definition of kinetic energy is essential for controlling pressure as well as temperature in the isothermal-isobaric condition. The virial theorem provides an equation that connects the average kinetic energy with the product of particle coordinate and force. In this paper, we show that the theorem is satisfied in MD simulations with a larger time step and holonomic constraints of bonds, only when a proper definition of kinetic energy is used. We provide a novel definition of kinetic energy, which is calculated from velocities at the half-time steps (t − Δt/2 and t + Δt/2) in the velocity Verlet integration method. MD simulations of a 1,2-dispalmitoyl-sn-phosphatidylcholine (DPPC) lipid bilayer and a water box using the kinetic energy definition could reproduce the physical properties in the isothermal-isobaric condition properly. We also develop a multiple time step (MTS) integration scheme with the kinetic energy definition. MD simulations with the MTS integration for the DPPC and water box systems provided the same quantities as the velocity Verlet integration method, even when the thermostat and barostat are updated less frequently.
The multidrug transporter AcrB transports a broad range of drugs out of the cell by means of the proton-motive force. The asymmetric crystal structure of trimeric AcrB suggests a functionally rotating mechanism for drug transport. Despite various supportive forms of evidence from biochemical and simulation studies for this mechanism, the link between the functional rotation and proton translocation across the membrane remains elusive. Here, calculating the minimum free energy pathway of the functional rotation for the complete AcrB trimer, we describe the structural and energetic basis behind the coupling between the functional rotation and the proton translocation at atomic resolution. Free energy calculations show that protonation of Asp408 in the transmembrane portion of the drug-bound protomer drives the functional rotation. The conformational pathway identifies vertical shear motions among several transmembrane helices, which regulate alternate access of water in the transmembrane as well as peristaltic motions that pump drugs in the periplasm.
GENeralized-Ensemble SImulation System (GENESIS) is a software package for molecular dynamics (MD) simulation of biological systems. It is designed to extend limitations in system size and accessible time scale by adopting highly parallelized schemes and enhanced conformational sampling algorithms. In this new version, GENESIS 1.1, new functions and advanced algorithms have been added. The all-atom and coarse-grained potential energy functions used in AMBER and GROMACS packages now become available in addition to CHARMM energy functions. The performance of MD simulations has been greatly improved by further optimization, multiple time-step integration, and hybrid (CPU + GPU) computing. The string method and replica-exchange umbrella sampling with flexible collective variable choice are used for finding the minimum free-energy pathway and obtaining free-energy profiles for conformational changes of a macromolecule. These new features increase the usefulness and power of GENESIS for modeling and simulation in biological research. © 2017 Wiley Periodicals, Inc..
Flexible fitting is a computational algorithm to derive a new conformational model that conforms to low-resolution experimental data by transforming a known structure. A common application is against data from cryo-electron microscopy to obtain conformational models in new functional states. The conventional flexible fitting algorithms cannot derive correct structures in some cases due to the complexity of conformational transitions. In this study, we show the importance of conformational ensemble in the refinement process by performing multiple fittings trials using a variety of different force constants. Application to simulated maps of Ca2+ ATPase and diphtheria toxin as well as experimental data of release factor 2 revealed that for these systems, multiple conformations with similar agreement with the density map exist and a large number of fitting trials are necessary to generate good models. Clustering analysis can be an effective approach to avoid over-fitting models. In addition, we show that an automatic adjustment of the biasing force constants during the fitting process, implemented as replica-exchange scheme, can improve the success rate.
Parallelization of molecular dynamics (MD) simulation is essential for investigating conformational dynamics of large biological systems, such as ribosomes, viruses, and multiple proteins in cellular environments. To improve efficiency in the parallel computation, we have to reduce the amount of data transfer between processors by introducing domain decomposition schemes. Also, it is important to optimize the computational balance between real-space non-bonded interactions and reciprocal-space interactions for long-range electrostatic interactions. Here, we introduce a novel parallelization scheme for large-scale MD simulations on massively parallel supercomputers consisting of only CPUs. We make use of a multiple program/multiple data (MPMD) approach for separating the real-space and reciprocal-space computations on different processors. We also utilize the r-RESPA multiple time step integrator on the framework of the MPMD approach in an efficient way: when the reciprocal-space computations are skipped in r-RESPA, processors assigned for them are utilized for half of the real-space computations. The new scheme allows us to use twice as many as processors that are available in the conventional single program approach. The best performances of all-atom MD simulations for 1 million (STMV), 8.5 million (8_STMV), and 28.8 million (27_STMV) atom systems on K computer are 65, 36, and 24 ns/day, respectively. The MPMD scheme can accelerate 23.4, 10.2, and 9.2 ns/day from the maximum performance of single-program approach for STMV, 8_STMV, and 27_STMV systems, respectively, which correspond to 57%, 39%, and 60% speed up. This suggests significant speedups by increasing the number of processors without losing parallel computational efficiency.
Biological macromolecules function in highly crowded cellular environments. The structure and dynamics of proteins and nucleic acids are well characterized in vitro, but in vivo crowding effects remain unclear. Using molecular dynamics simulations of a comprehensive atomistic model cytoplasm we found that protein-protein interactions may destabilize native protein structures, whereas metabolite interactions may induce more compact states due to electrostatic screening. Protein-protein interactions also resulted in significant variations in reduced macromolecular diffusion under crowded conditions, while metabolites exhibited significant two-dimensional surface diffusion and altered protein-ligand binding that may reduce the effective concentration of metabolites and ligands in vivo. Metabolic enzymes showed weak non-specific association in cellular environments attributed to solvation and entropic effects. These effects are expected to have broad implications for the in vivo functioning of biomolecules. This work is a first step towards physically realistic in silico whole-cell models that connect molecular with cellular biology.
The graphics processing unit (GPU) has become a popular computational platform for molecular dynamics (MD) simulations of biomolecules. A significant speedup in the simulations of small- or medium-size systems using only a few computer nodes with a single or multiple GPUs has been reported. Because of GPU memory limitation and slow communication between GPUs on different computer nodes, it is not straightforward to accelerate MD simulations of large biological systems that contain a few million or more atoms on massively parallel supercomputers with GPUs. In this study, we develop a new scheme in our MD software, GENESIS, to reduce the total computational time on such computers. Computationally intensive real-space nonbonded interactions are computed mainly on GPUs in the scheme, while less intensive bonded interactions and communication-intensive reciprocal-space interactions are performed on CPUs. On the basis of the midpoint cell method as a domain decomposition scheme, we invent the single particle interaction list for reducing the GPU memory usage. Since total computational time is limited by the reciprocal-space computation, we utilize the RESPA multiple time-step integration and reduce the CPU resting time by assigning a subset of nonbonded interactions on CPUs as well as on GPUs when the reciprocal-space computation is skipped. We validated our GPU implementations in GENESIS on BPTI and a membrane protein, porin, by MD simulations and an alanine-tripeptide by REMD simulations. Benchmark calculations on TSUBAME supercomputer showed that an MD simulation of a million atoms system was scalable up to 256 computer nodes with GPUs.
Collective variables (CVs) are often used in molecular dynamics simulations based on enhanced sampling algorithms to investigate large conformational changes of a protein. The choice of CVs in these simulations is essential because it affects simulation results and impacts the free-energy profile, the minimum free-energy pathway (MFEP), and the transition-state structure. Here we examine how many CVs are required to capture the correct transition-state structure during the open-to-close motion of adenylate kinase using a coarse-grained model in the mean forces string method to search the MFEP. Various numbers of large amplitude principal components are tested as CVs in the simulations. The incorporation of local coordinates into CVs, which is possible in higher dimensional CV spaces, is important for capturing a reliable MFEP. The Bayesian measure proposed by Best and Hummer is sensitive to the choice of CVs, showing sharp peaks when the transition-state structure is captured. We thus evaluate the required number of CVs needed in enhanced sampling simulations for describing protein conformational changes.
Three-dimensional Fast Fourier Transform (3D FFT) plays an important role in a wide variety of computer simulations and data analyses, including molecular dynamics (MD) simulations. In this study, we develop hybrid (MPI+OpenMP) parallelization schemes of 3D FFT based on two new volumetric decompositions, mainly for the particle mesh Ewald (PME) calculation in MD simulations. In one scheme, (1d_Alltoall), five all-to-all communications in one dimension are carried out, and in the other, (2d_Alltoall), one two-dimensional all-to-all communication is combined with two all-to-all communications in one dimension. 2d_Alltoall is similar to the conventional volumetric decomposition scheme. We performed benchmark tests of 3D FFT for the systems with different grid sizes using a large number of processors on the K computer in RIKEN AICS. The two schemes show comparable performances, and are better than existing 3D FFTs. The performances of 1d_Alltoall and 2d_Alltoall depend on the supercomputer network system and number of processors in each dimension. There is enough leeway for users to optimize performance for their conditions. In the PME method, short-range real-space interactions as well as long-range reciprocal-space interactions are calculated. Our volumetric decomposition schemes are particularly useful when used in conjunction with the recently developed midpoint cell method for short-range interactions, due to the same decompositions of real and reciprocal spaces. The 1d_Alltoall scheme of 3D FFT takes 4.7 ms to simulate one MD cycle for a virus system containing more than 1 million atoms using 32,768 cores on the K computer.
Large conformational changes of multidomain proteins are difficult to simulate using all-atom molecular dynamics (MD) due to the slow time scale. We show that a simple modification of the structure-based coarse-grained (CG) model enables a stable and efficient MD simulation of those proteins. “Motion Tree”, a tree diagram that describes conformational changes between two structures in a protein, provides information on rigid structural units (domains) and the magnitudes of domain motions. In our new CG model, which we call the DoME (domain motion enhanced) model, interdomain interactions are defined as being inversely proportional to the magnitude of the domain motions in the diagram, whereas intradomain interactions are kept constant. We applied the DoME model in combination with the Go model to simulations of adenylate kinase (AdK). The results of the DoME–Go simulation are consistent with an all-atom MD simulation for 10 μs as well as known experimental data. Unlike the conventional Go model, the DoME–Go model yields stable simulation trajectories against temperature changes and conformational transitions are easily sampled despite domain rigidity. Evidently, identification of domains and their interfaces is useful approach for CG modeling of multidomain proteins.
Data assimilation is a statistical method designed to improve the quality of numerical simulations in combination with real observations. Here, we develop a sequential data assimilation method that incorporates one-dimensional time-series data of smFRET (single-molecule Förster resonance energy transfer) photon-counting into conformational ensembles of biomolecules derived from “replicated” molecular dynamics (MD) simulations. A particle filter using a large number of “replicated” MD simulations with a likelihood function for smFRET photon-counting data is employed to screen the conformational ensembles that match the experimental data. We examine the performance of the method using emulated smFRET data and coarse-grained (CG) MD simulations of a dye-labeled polyproline-20. The method estimates the dynamics of the end-to-end distance from smFRET data as well as revealing that of latent conformational variables. The particle filter is also able to correct model parameter dependence in CG MD simulations. We discuss the applicability of the method to real experimental data for conformational dynamics of biomolecules.
Metadynamics (MTD) is a powerful enhanced sampling method for systems with rugged energy landscapes. It constructs a bias potential in a predefined collective variable (CV) space to overcome barriers between metastable states. In bias-exchange MTD (BE-MTD), multiple replicas approximate the CV space by exchanging bias potentials (replica conditions) with the Metropolis–Hastings (MH) algorithm. We demonstrate that the replica-exchange rates and the convergence of free energy estimates of BE-MTD are improved by introducing the infinite swapping (IS) or the Suwa-Todo (ST) algorithms. Conceptually, IS and ST perform transitions in a replica state space rather than exchanges in a replica condition space. To emphasize this, the proposed scheme is called the replica state exchange MTD (RSE-MTD). Benchmarks were performed with alanine polypeptides in vacuum and water. For the systems tested in this work, there is no significant performance difference between IS and ST.
GENESIS (Generalized‐Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all‐atom force‐field models as well as coarse‐grained Go‐like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large‐scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T‐REMD, REUS, multi‐dimensional REMD for both all‐atom and Go‐like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three‐dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real‐space and reciprocal‐space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models.
Sarco(endo)plasmic reticulum Ca2+-ATPase transports two Ca2+ per ATP-hydrolyzed across biological membranes against a large concentration gradient by undergoing large conformational changes. Structural studies with X-ray crystallography revealed functional roles of coupled motions between the cytoplasmic domains and the transmembrane helices in individual reaction steps. Here, we employed “Motion Tree (MT),” a tree diagram that describes a conformational change between two structures, and applied it to representative Ca2+-ATPase structures. MT provides information of coupled rigid-body motions of the ATPase in individual reaction steps. Fourteen rigid structural units, “common rigid domains (CRDs)” are identified from seven MTs throughout the whole enzymatic reaction cycle. CRDs likely act as not only the structural units, but also the functional units. Some of the functional importance has been newly revealed by the analysis. Stability of each CRD is examined on the morphing trajectories that cover seven conformational transitions. We confirmed that the large conformational changes are realized by the motions only in the flexible regions that connect CRDs. The Ca2+-ATPase efficiently utilizes its intrinsic flexibility and rigidity to response different switches like ligand binding/dissociation or ATP hydrolysis. The analysis detects functional motions without extensive biological knowledge of experts, suggesting its general applicability to domain movements in other membrane proteins to deepen the understanding of protein structure and function.
Adenosine triphosphate (ATP) is an indispensable energy source in cells. In a wide variety of biological phenomena like glycolysis, muscle contraction/relaxation, and active ion transport, chemical energy released from ATP hydrolysis is converted to mechanical forces to bring about large-scale conformational changes in proteins. Investigation of structure–function relationships in these proteins by molecular dynamics (MD) simulations requires modeling of ATP in solution and ATP bound to proteins with accurate force-field parameters. In this study, we derived new force-field parameters for the triphosphate moiety of ATP based on the high-precision quantum calculations of methyl triphosphate. We tested our new parameters on membrane-embedded sarcoplasmic reticulum Ca2+-ATPase and four soluble proteins. The ATP-bound structure of Ca2+-ATPase remains stable during MD simulations, contrary to the outcome in shorter simulations using original parameters. Similar results were obtained with the four ATP-bound soluble proteins. The new force-field parameters were also tested by investigating the range of conformations sampled during replica-exchange MD simulations of ATP in explicit water. Modified parameters allowed a much wider range of conformational sampling compared with the bias toward extended forms with original parameters. A diverse range of structures agrees with the broad distribution of ATP conformations in proteins deposited in the Protein Data Bank. These simulations suggest that the modified parameters will be useful in studies of ATP in solution and of the many ATP-utilizing proteins.
We have developed a new hybrid (MPI+OpenMP) parallelization scheme for molecular dynamics (MD) simulations by combining a cell-wise version of the midpoint method with pair-wise Verlet lists. In this scheme, which we call the midpoint cell method, simulation space is divided into subdomains, each of which is assigned to a MPI processor. Each subdomain is further divided into small cells. The interaction between two particles existing in different cells is computed in the subdomain containing the midpoint cell of the two cells where the particles reside. In each MPI processor, cell pairs are distributed over OpenMP threads for shared memory parallelization. The midpoint cell method keeps the advantages of the original midpoint method, while filtering out unnecessary calculations of midpoint checking for all the particle pairs by single midpoint cell determination prior to MD simulations. Distributing cell pairs over OpenMP threads allows for more efficient shared memory parallelization compared with distributing atom indices over threads. Furthermore, cell grouping of particle data makes better memory access, reducing the number of cache misses. The parallel performance of the midpoint cell method on the K computer showed scalability up to 512 and 32,768 cores for systems of 20,000 and 1 million atoms, respectively. One MD time step for long-range interactions could be calculated within 4.5 ms even for a 1 million atoms system with particle-mesh Ewald electrostatics.
Conformational sampling is fundamentally important for simulating complex bio-molecular systems. The generalized-ensemble algorithm, especially the temperature replica-exchange molecular dynamics method (T-REMD), is one of the most powerful methods to explore structures of bio-molecules such as proteins, nucleic acids, carbohydrates, and also of lipid membranes. T-REMD simulations have focused on soluble proteins rather than membrane proteins or lipid bilayers, because explicit membranes do not keep their structural integrity at high temperature. Here, we propose a new generalized-ensemble algorithm for membrane systems, which we call the surface-tension REMD method. Each replica is simulated in the NPγT ensemble, and surface tensions in a pair of replicas are exchanged at certain intervals to enhance conformational sampling of the target membrane system. We test the method on two biological membrane systems: a fully hydrated DPPC (1,2-dipalmitoyl-sn-glycero-3-phosphatidylcholine) lipid bilayer and a WALP23–POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) membrane system. During these simulations, a random walk in surface tension space is realized. Large-scale lateral deformation (shrinking and stretching) of the membranes takes place in all of the replicas without collapse of the lipid bilayer structure. There is accelerated lateral diffusion of DPPC lipid molecules compared with conventional MD simulation, and a much wider range of tilt angle of the WALP23 peptide is sampled due to large deformation of the POPC lipid bilayer and through peptide-lipid interactions. Our method could be applicable to a wide variety of biological membrane systems.
The major bottleneck in molecular dynamics (MD) simulations of biomolecules exist in the calculation of pairwise nonbonded interactions like Lennard-Jones and long-range electrostatic interactions. Particle-mesh Ewald (PME) method is able to evaluate long-range electrostatic interactions accurately and quickly during MD simulation. However, the evaluation of energy and gradient includes time-consuming inverse square roots and complementary error functions. To avoid such time-consuming operations while keeping accuracy, we propose a new lookup table for short-range interaction in PME by defining energy and gradient as a linear function of inverse distance squared. In our lookup table approach, densities of table points are inversely proportional to squared pair distances, enabling accurate evaluation of energy and gradient at small pair distances. Regardless of the inverse operation here, the new lookup table scheme allows fast pairwise nonbonded calculations owing to efficient usage of cache memory.
The effect of cellular crowding environments on protein structure and stability is a key issue in molecular and cellular biology. The classical view of crowding emphasizes the volume exclusion effect that generally favors compact, native states. Here, results from molecular dynamics simulations and NMR experiments show that protein crowders may destabilize native states via protein–protein interactions. In the model system considered here, mixtures of villin head piece and protein G at high concentrations, villin structures become increasingly destabilized upon increasing crowder concentrations. The denatured states observed in the simulation involve partial unfolding as well as more subtle conformational shifts. The unfolded states remain overall compact and only partially overlap with unfolded ensembles at high temperature and in the presence of urea. NMR measurements on the same systems confirm structural changes upon crowding based on changes of chemical shifts relative to dilute conditions. An analysis of protein–protein interactions and energetic aspects suggests the importance of enthalpic and solvation contributions to the crowding free energies that challenge an entropic-centered view of crowding effects.
The nudged elastic band (NEB) and string methods are widely used to obtain the reaction path of chemical reactions and phase transitions. In these methods, however, it is difficult to define an accurate Lagrangian to generate the conservative forces. On the other hand, the constrained optimization with locally updated planes (CO-LUP) scheme defines target function properly and suitable for micro-iteration optimizations in quantum mechanical/molecular mechanical (QM/MM) systems, which uses the efficient second order QM optimization. However, the method does have problems of inaccurate estimation of reactions and inappropriate accumulation of images around the energy minimum. We introduce three modifications into the CO-LUP scheme to overcome these problems: (1) An improved tangent estimation of the reaction path, which is used in the NEB method, (2) redistribution of images using an energy-weighted interpolation before updating local tangents, and (3) reduction of the number of constraints, in particular translation/rotation constraints, for improved convergence. First, we test the method on the isomerization of alanine dipeptide without QM/MM calculation, showing that the method is comparable to the string method both in accuracy and efficiency. Next, we apply the method for defining the reaction paths of the rearrangement reaction catalyzed by chorismate mutase (CM) and of the phosphoryl transfer reaction catalyzed by cAMP-dependent protein kinase (PKA) using generalized hybrid orbital QM/MM calculations. The reaction energy barrier of CM is in high agreement with the experimental value. The path of PKA reveals that the enzyme reaction is associative and there is a late transfer of the substrate proton to Asp 166, which is in agreement with the recently published result using the NEB method.
Large-scale conformational changes in proteins involve barrier-crossing transitions on the complex free energy surfaces of high-dimensional space. Such rare events cannot be efficiently captured by conventional molecular dynamics simulations. Here we show that, by combining the on-the-fly string method and the multi-state Bennett acceptance ratio (MBAR) method, the free energy profile of a conformational transition pathway in Escherichia coli adenylate kinase can be characterized in a high-dimensional space. The minimum free energy paths of the conformational transitions in adenylate kinase were explored by the on-the-fly string method in 20-dimensional space spanned by the 20 largest-amplitude principal modes, and the free energy and various kinds of average physical quantities along the pathways were successfully evaluated by the MBAR method. The influence of ligand binding on the pathways was characterized in terms of rigid-body motions of the lid-shaped ATP-binding domain (LID) and the AMP-binding (AMPbd) domains. It was found that the LID domain was able to partially close without the ligand, while the closure of the AMPbd domain required the ligand binding. The transition state ensemble of the ligand bound form was identified as those structures characterized by highly specific binding of the ligand to the AMPbd domain, and was validated by unrestrained MD simulations. It was also found that complete closure of the LID domain required the dehydration of solvents around the P-loop. These findings suggest that the interplay of the two different types of domain motion is an essential feature in the conformational transition of the enzyme.
The effect of protein crowding on the structure and dynamics of water was examined from explicit solvent molecular dynamics simulations of a series of protein G and protein G/villin systems at different protein concentrations. Hydration structure was analyzed in terms of radial distribution functions, three-dimensional hydration sites, and preservation of tetrahedral coordination. Analysis of hydration dynamics focused on self-diffusion rates and dielectric constants as a function of crowding. The results show significant changes in both structure and dynamics of water under highly crowded conditions. The structure of water is altered mostly beyond the first solvation shell. Diffusion rates and dielectric constants are significantly reduced following linear trends as a function of crowding reflecting highly constrained water in crowded environments. The reduced dynamics of diffusion is expected to be strongly related to hydrodynamic properties of crowded cellular environments while the reduced dielectric constant under crowded conditions has implications for the stability of biomolecules in crowded environments. The results from this study suggest a prescription for modeling solvation in simulations of cellular environments.
Reversible phosphorylation of proteins is a post-translational modification that regulates diverse biological processes. The molecular mechanism underlying phosphoryl transfer catalyzed by enzymes remains a subject of active debate. In particular, the nature of transition state (TS), whether it has an associative or dissociative character, has been one of the most controversial issues. Structural evidence supports an associative TS, whereas physical organic studies point to a dissociative character. Here we perform hybrid quantum mechanics/molecular mechanics simulations for the reversible phosphorylation of phosphoserine phosphatase (PSP) to study the nature of the TS. Both phosphorylation and dephosphorylation reactions are investigated based on the two-dimensional energy surfaces along phosphoryl and proton transfer coordinates. The structures of the active site at TS in both reactions reveal compact geometries, consistent with crystal structures of PSP with phosphate analogues. On the other hand, the electron density of the phosphoryl group in both TS structures slightly decreases compared with that in the reactant states. These findings suggest that the TS of PSP has a geometrically associative yet electronically dissociative character and strongly depends on proton transfer being coupled with phosphoryl transfer. Structure and literature database searches on phosphotransferases suggest that such a hybrid TS is consistent with many structures and physical organic studies and likely holds for most enzymes catalyzing phosphoryl transfer.
An analytic gradient expression is formulated and implemented for the second-order Møller–Plesset perturbation theory (MP2) based on the generalized hybrid orbital QM/MM method. The method enables us to obtain an accurate geometry at a reasonable computational cost. The performance of the method is assessed for various isomers of alanine dipepetide. We also compare the optimized structures of fumaramide-derived [2]rotaxane and cAMP-dependent protein kinase with experiment.