Innovatives Supercomputing in Deutschland
inSiDE • Vol. 11 No. 2 • Autumn 2013

A scalable hybrid DFT/PMM-MD Approach for accurately simulating Biomolecules on SuperMUC

The life sciences, from fundamental research to medical drug design, depend on detailed knowledge of the structural and dynamical properties of biomolecules. Ligand-receptor interactions, for instance, involve a complex interplay between molecules, which needs to be understood in order to influence it selectively, e.g. by medical drugs. Experiments can of course help to answer such questions, but access to the molecular world is still difficult, and theoretical modeling is therefore indispensable.

Infrared (IR) spectroscopy is one of the few techniques that can monitor the functional dynamics of biomolecules such as peptides and proteins. To reveal the structural changes of molecules encoded in such spectra, molecular dynamics (MD) simulations are a valuable tool that complements experimental results and helps to understand them. Molecular mechanics (MM) force fields enable MD simulations of large systems, such as a protein in solution comprising several tens of thousands of atoms, up to the microsecond time scale. However, such MM-MD simulations are far from accurate enough for tasks like calculating IR spectra. In contrast, high-level quantum mechanical (QM) methods like density functional theory (DFT) provide the required accuracy, but are computationally limited to much smaller length and time scales. Thus, a sufficiently long simulation of even a small peptide in an extended aqueous environment is still beyond the scope of such QM methods.

Figure 1: A typical QM/MM simulation setup consisting of an alanine dipeptide molecule (described by QM) solvated in about 2,000 MM water molecules. The close-up shows the MM solvation shell of the solute molecule together with a sketch of its electron density, which is accessible only through the use of a QM method.

These issues are resolved by hybrid MD approaches, which combine a QM treatment of a small subsystem with an MM description of its environment (cf. Fig. 1). This combination of an accurate but rather slow QM description of, e.g., a single (bio-)molecule with a coarser but efficient MM treatment of its surroundings enables MD simulations with good overall accuracy on greater length and time scales than pure QM methods allow. QM/MM hybrid methods have become a standard tool in the life sciences for applications such as studying the structure and energetics of enzyme reactions, excited-state properties, or charge-transfer processes [1].

QM/MM approaches typically combine different programs for the MM and QM calculations, which need to be properly interfaced. In particular, the efficient evaluation of the long-range electrostatic interactions between many thousands of MM partial charges and the electrons treated explicitly in the QM part poses a major computational challenge.
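To make the cost of this electrostatics step concrete, the following minimal sketch evaluates the potential of MM point charges at a set of QM grid points by a direct double loop. It is an illustration only, not the interface used in the article: the function names, the data layout, and the use of atomic units are assumptions made here. The work grows with the product of grid points and MM sites, which is the scaling that fast multipole methods are designed to avoid.

```python
import math


def direct_potential(grid_points, mm_sites):
    """Electrostatic potential at each grid point from MM point charges.

    Atomic units are assumed (charge in e, distance in bohr).
    grid_points: list of (x, y, z) tuples (hypothetical QM grid).
    mm_sites:    list of (charge, (x, y, z)) tuples.
    """
    potentials = []
    for point in grid_points:
        phi = 0.0
        for charge, position in mm_sites:
            # Every grid point "sees" every MM charge: O(N_grid * N_MM).
            phi += charge / math.dist(point, position)
        potentials.append(phi)
    return potentials


def pair_count(n_grid, n_mm):
    """Number of charge-grid pairs the direct sum has to visit."""
    return n_grid * n_mm
```

For example, `direct_potential([(2.0, 0.0, 0.0)], [(1.0, (0.0, 0.0, 0.0))])` evaluates a unit charge at a distance of 2 bohr. Doubling either the grid or the number of MM sites doubles `pair_count`, which is why a direct sum becomes the bottleneck for extended solvent environments.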

Many current approaches are hampered by the neglect of important polarization effects in the MM force field and therefore do not reach the accuracy needed, for example, to calculate spectroscopic properties of biomolecules. Recently, our group developed a combination of a highly accurate DFT description with a polarizable MM (PMM) description of the condensed-phase environment [2]. This approach greatly enhances the accuracy compared to conventional non-polarizable hybrid MD approaches. It treats the DFT/PMM electrostatics, which additionally cover the interactions of the inducible PMM dipoles with the DFT electron density, by a linearly scaling fast multipole method (SAMM4 [3]). The drastically reduced computational complexity now makes studies of large DFT molecules solvated in an accurately modeled, extended PMM condensed phase feasible.

Figure 2: Scaling of IPHIGENIE/CPMD on SuperMUC for pure MPI (black) and MPI/OpenMP parallelization (colored).

We tested the strong scaling of our implementation on SuperMUC with a comparatively small setup comprising an alanine dipeptide molecule (22 atoms) as DFT fragment in a periodic box filled with 2,112 polarizable water molecules (PMM model TL4P), resulting in a total atom count of 8,470. Fig. 2 demonstrates that even for this small system almost perfect scaling is achieved up to 512 cores, with a still reasonable performance gain up to 2,048 cores. In this setup, the IPHIGENIE part scales up to 128 MPI processes, corresponding to as few as 66 atoms per MPI process. Less than 8% of the total computation time is spent on the PMM part, and the well-known excellent scaling of CPMD [4] is not hampered by the interface to IPHIGENIE. Several tens of picoseconds of DFT/PMM trajectory per day are now feasible (cf. Fig. 3), and we are currently setting up large-scale DFT/PMM simulations of relevant biomolecules such as DNA bases on SuperMUC.

Figure 3: Absolute speed of DFT/PMM calculation involving 22 DFT atoms and 2,112 PMM waters (integration time step 0.5 fs).
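The system size and throughput figures of the scaling test can be cross-checked with a few lines of arithmetic. The sketch below is a bookkeeping aid, not part of the published workflow. It assumes four interaction sites per TL4P water (inferred here from the quoted total of 8,470), and the trajectory rate of 20 ps per day is a hypothetical input within the stated range of "several tens of picoseconds", not a measured value. The efficiency helper uses the usual strong-scaling definition.

```python
DFT_ATOMS = 22          # alanine dipeptide, from the text
WATERS = 2112           # PMM water molecules, from the text
SITES_PER_WATER = 4     # assumption: inferred from the quoted total of 8,470
TIMESTEP_FS = 0.5       # integration time step from the Fig. 3 caption


def total_atoms():
    """Total number of atoms/sites in the simulation box."""
    return DFT_ATOMS + WATERS * SITES_PER_WATER


def atoms_per_process(n_processes):
    """Average number of atoms handled by each MPI process."""
    return total_atoms() / n_processes


def seconds_per_step(ps_per_day, timestep_fs=TIMESTEP_FS):
    """Wall-clock time per MD step implied by a given trajectory rate."""
    steps_per_day = ps_per_day * 1000.0 / timestep_fs
    return 86400.0 / steps_per_day


def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Parallel efficiency relative to a reference run (1.0 = ideal)."""
    return (t_ref * n_ref) / (t_n * n)
```

With these inputs, `total_atoms()` reproduces the 8,470 atoms and `atoms_per_process(128)` the roughly 66 atoms per MPI process quoted in the text. A hypothetical rate of 20 ps per day at a 0.5 fs time step corresponds to 40,000 steps per day, so `seconds_per_step(20)` gives the wall-clock budget per DFT/PMM step.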

As a sample application, we computed two 20 ps DFT/PMM trajectories of the above system, from which we calculated the IR spectrum [5] of alanine dipeptide in water. Fig. 4 shows the relative absorption and reveals the characteristic Amide I and Amide II frequency bands of the left and right sides of the molecule. The band positions depend on the conformation of the molecule and can thus serve to identify its current structure. As the two insets show, full insight is gained into the underlying dynamics and structure. Averaging a few such spectra, which are rapidly calculated with our DFT/PMM approach on SuperMUC, yields accurate IR spectra of the molecule in the condensed phase and helps to interpret experimental results.
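One common textbook route from an MD trajectory to an IR line shape is the Fourier transform of the dipole-moment autocorrelation function. The sketch below illustrates that generic route in pure Python; it is not claimed to be the procedure of Ref. [5], and it omits prefactors and quantum correction factors, so only band positions are meaningful. All function names are hypothetical, and the synthetic test signal at 1650 cm^-1 is chosen only because it lies in the typical Amide I region.

```python
import math

C_CM_PER_FS = 2.99792458e-5  # speed of light in cm/fs


def dipole_autocorrelation(dipoles):
    """Autocorrelation of dipole fluctuations, summed over x, y, z.

    dipoles: list of (mx, my, mz) tuples sampled at a fixed time step.
    Returns the correlation function for lags 0 .. len(dipoles)//2 - 1.
    """
    n = len(dipoles)
    mean = [sum(d[k] for d in dipoles) / n for k in range(3)]
    fluct = [[d[k] - mean[k] for k in range(3)] for d in dipoles]
    acf = []
    for lag in range(n // 2):
        total = 0.0
        for t in range(n - lag):
            a, b = fluct[t], fluct[t + lag]
            total += a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
        acf.append(total / (n - lag))
    return acf


def ir_lineshape(acf, dt_fs):
    """Windowed cosine transform of the autocorrelation function.

    Returns (wavenumbers in cm^-1, unnormalized intensities).
    """
    m = len(acf)
    # A Hann half-window damps the noisy long-lag tail.
    windowed = [c * 0.5 * (1.0 + math.cos(math.pi * i / m))
                for i, c in enumerate(acf)]
    wavenumbers, intensities = [], []
    for j in range(m):
        freq_per_fs = j / (2.0 * m * dt_fs)
        value = sum(w * math.cos(math.pi * j * i / m)
                    for i, w in enumerate(windowed))
        wavenumbers.append(freq_per_fs / C_CM_PER_FS)
        intensities.append(value)
    return wavenumbers, intensities


def synthetic_dipoles(wavenumber_cm, dt_fs, n_steps):
    """Test signal: a dipole oscillating along x at one frequency."""
    omega = 2.0 * math.pi * wavenumber_cm * C_CM_PER_FS  # rad/fs
    return [(math.cos(omega * t * dt_fs), 0.0, 0.0) for t in range(n_steps)]
```

Feeding `synthetic_dipoles(1650.0, 1.0, 800)` through `dipole_autocorrelation` and `ir_lineshape` yields a single band close to the input frequency, limited by the frequency resolution of the short trajectory. Averaging the intensities from several independent trajectories reduces the statistical noise of such a spectrum.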

Concerning future development, we are currently testing a DFT/PMM replica exchange scheme, which adds another level of parallelism to the program. The computational power of several tens of thousands of cores can then easily be combined in a single calculation for rapid conformational sampling of the molecule at high accuracy.
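The article does not specify which replica exchange variant is being tested. As a generic illustration, the sketch below shows the standard Metropolis criterion for swapping configurations between two replicas in temperature replica exchange; the function names and the choice of kcal/mol energy units are assumptions made here. Because replicas run independently between swap attempts, each one can occupy its own set of cores, which is the additional level of parallelism.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for exchanging two replicas.

    energy_*: potential energies in kcal/mol (assumed unit).
    temp_*:   replica temperatures in K.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the exchange is accepted for this attempt."""
    p = swap_probability(energy_i, temp_i, energy_j, temp_j)
    return rng.random() < p
```

A swap that hands the lower-energy configuration to the colder replica is always accepted, while the reverse swap is accepted only with a Boltzmann-weighted probability. This keeps each replica sampling the correct ensemble at its own temperature.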

Figure 4: IR spectrum of alanine dipeptide (treated with MT/BLYP, Ecut = 80 Ry) solvated in PMM water. The insets show the molecular movement associated with the absorption bands.

References

[1] Senn, H.M., Thiel, W. Angew. Chem., Int. Ed. 48, 1198, 2009

[2] Schwörer, M., Breitenfeld, B., Tröster, P., Bauer, S., Lorenzen, K., Tavan, P., Mathias, G. J. Chem. Phys. 138, 244103, 2013

[3] Lorenzen, K., Schwörer, M., Tröster, P., Mates, S., Tavan, P. J. Chem. Theory Comput. 8, 3628, 2012

[4] CPMD, http://www.cpmd.org, Copyright IBM Corp 1990-2008, Copyright MPI für Festkörperforschung Stuttgart, 1997-2001

[5] Mathias, G., Baer, M.D. J. Chem. Theory Comput. 7, 2028, 2011

• Helmut Satzger
Leibniz Supercomputing Centre

• Gerald Mathias
• Magnus Schwörer
Lehrstuhl für BioMolekulare Optik, LMU München

