QM/MM tutorial

QM/MM calculations on a Diels-Alder antibody catalyst.

IV. Optimization of product, reactant and transition state geometries in the fully solvated protein, using Linear Transit in gromacs

We have seen that in water the energy of the transition state is lower that in vacuo, with respect to the reactants. Now we will calculate the energy curve in the protein to see the effect of the protein environment on the reaction.

QM/MM subdivision

The fully solvated protein system is too large for even a semi-empirical QM calculation. Therefore, we resort to a QM/MM description of the system. The way we split up the system in a small QM part and a much bigger MM part, is shown in figure 7. The QM part consists of the same atoms as before and is again described at the semi-empirical PM3 level of theory. The remainder of the system, consisting of the tail part (-R, figure 1) of the reactants, the protein, the water molecules and the chloride ions, is modeled with the GROMOS96 forcefield.

Figure 7. Division of the system in a QM subsystem and an MM subsystem. The QM subsystem is described at the semi-empirical QM level,
while the remainder of the system, consisting of the reactants-aliphatic tail, protein, water molecules and ions is modeled with the GROMOS96
forcefield.

Finding product, reactant and transition state geometries in the fully solvated protein

The starting structure for this calculation is the x-ray model of the 1CE catalytic antibody by Xu et al.. Remember that in the x-ray model, the -R group of the analogue was not resolved. So we need to add it ourselves. We take the modified transition-state analogue of part II of this tutorial and fit it onto the analogue in the x-ray model. After the fit, we minimize the tail part, keeping the rest of the protein fixed.

Then we place this modified protein model in a periodic box, fill that box with water and equilibrate the water. Subsequently, we add 6 Cl- ions to compensate the overall net charge of -6 on the protein and equilibrate again. The procedure of preparing the system for the QM/MM geometry optimization is straightforward, but time-consuming. Therefore, we skip fitting and equilibrating and use the result (confin.gro) instead. An outline of the preparation is avalaible here.

• We will work in a new subdirectory, we call LTprotein.

• create the LTprotein subdirectory and go into that directory:

localhost:~>mkdir LTprotein

localhost:~>cd LTprotein

• In addition to a starting structure, we need to create an index file, an mdp file and a topology file. The topology files we use here, were generate automatically with the pdb2gmx programm of gromacs. This is described in the outline of the preparation procedure, which we decided to skip. We will the prepared files instead:

Note we will not use the topol_A.itp, but instead create a new one with a different constraint length for the reaction coordinate for every point of the Linear Transit.

• download all these files and the scripts into the LTprotein subdirectory.

• Now that we have the input files and scripts, we will perform the Linear Transit calculation. We let the reaction coordinate vary in two stages again. First, we go from 0.15 to 0.4 nm in 200 steps. Then we do an additional calculation in which we go down from 0.15 to 0.12 in 50 steps.

• create two additional subdirectories in the LTprotein subdirectory, one for increasing and one for the decreasing the reaction coordinate:

localhost:~>mkdir up

localhost:~>mkdir down

• go into the up subdir and execute the create_tops.scr scripts to create 201 subdirectories (called step_0, ..., step_200) and create a topol_A.itp with different constraints lengths in the range 0.12 to 0.5 nm.

localhost:~>cd up

localhost:~>../create_tops.scr 0.15 0.4 200

• execute the run.scr scripts to run the minimizations in the different subdirectories (called step_0, ..., step_200). To speed up the convergence, the script takes the output coordinates of the previous Linear Transit point as input in the current minimization

localhost:~>../run.scr 0.15 0.4 200

This will take a while, depending on the speed of your computer system.

• When the run script is done, we execute the get_ener.scr script to retrieve the energies from the individual linear transit points. Redirect the output to a file, let's call it eqmmm.xvg.

localhost:~>../get_ener.scr 0.12 0.4 200 > eqmmm.xvg

• Next we go back one subdir back and enter the down subdir to do the other linear transit.

localhost:~>cd ../down

and execute the create_tops.scr scripts again to create 51 subdirectories (called step_0, ..., step_50) and create a topol_A.itp with different constraints lengths in the range 0.15 to 0.12 nm.

localhost:~>../create_tops.scr 0.15 0.12 50

• execute the run.scr scripts to run the minimizations in the different subdirectories (called step_0, ..., step_50). To speed up the convergence, the script takes the output coordinates of the previous Linear Transit point as input in the current minimization

localhost:~>../run.scr 0.15 0.12 50

This will take a while again.

• afterwards, we execute the get_ener.scr script again, to retrieve the energies from the individual linear transit points. Redirect the output to a file, let's call it eqmmm.xvg.

localhost:~>../get_ener.scr 0.15 0.12 50 > eqmmm.xvg

• When both stages of the Linear Transit are done, we can see the results.

• Go back one subdir and import up/eqmmm.xvg and down/eqmmm.xvg in xmgr to see the energy as a function of the reaction coordinate.

localhost:~>xmgr up/eqmmm.xvg down/eqmmm.xvg

The maximum is at 0.21 nm (-1102.9 kJ/mol). This corresponds to point 48 in the up subdirectory. Let's have a look at that structure. We simply go into the directory up/step_48 and use editconf to convert the confout.gro into confout.pdb

localhost:~>editconf -f confout.gro -o confout.pdb

We then use rasmol to visualize the structure:

localhost:~>rasmol confout.pdb

We zoom a bit in on the active site, showing interacting residues and water molecules:

• RasMol>restrict within(5.0,dat) AND water
• RasMol>wireframe 30
• RasMol>select dat
• RasMol>wireframe 50
• RasMol>center dat
• RasMol>zoom 600
• RasMol>select backbone AND NOT *.O
• RasMol>cartoons
• RasMol>colour structure
• RasMol>select 266,271
• RasMol>wireframe 40

The minima are at 0.306250 nm (-1216.4 kJ/mol) and 0.15 nm (-1360.8 kJ/mol) and correspond to the reactant and product states respectively. Figure 8 shows the reactant, transition and product states of the system.

Figure 8. Reactant (a), Transition State (b) and Product (c) geometries in the active site of the catalytic antibody, found with the Linear
Transit method. The energies of these structures are listed in table 4.

Now, we plot the energy curves in vacuo, water and te protein all in the same figure. We manipulate the offsets of the curves to make the comparison easier.

The protein is is stabilizing the transition state relative to reactants even more that the water. The reaction rate therefore should be highest in the protein. Note however that alse here the entropic contribution is not included. The reactants need to be bound by the protein first to form the reactive protein-substrate complex. These steps can be studied with different simulation techniques.

Conclusions

Table 4. Energies of the reactant,
transition state and product geom-
etries in the solvated protein at the
PM3/GROMOS96 QM/MM level.
The last column lists the energy
differences with respect to the
reactant state.

 E (kJ/mol) ΔE (kJ/mol) Reactant -1216.4 0.0 Trans. St. -1102.9 113.5 Product -1360.8 -144.4

updated 07/09/04