5.2. Electrostatic potential fitting¶
5.2.1. Introduction¶
The general idea behind the electrostatic potential (ESP) fitting tools in HORTON is to allow:
- ESP-fitting: Fitting atomic charges to reproduce the actual molecular ESP, and
- ESP-testing: Assessing how well a given set of charges reproduce the actual molecular ESP
in isolated and 3D-periodic systems with compatible methodologies. These procedures are broken into small steps, such that you can carry out more specialized ESP fittings.
These two computations, fitting charges and testing the quality of given charges to reproduce the ESP, can be carried out once an ESP cost function is constructed. In HORTON, this ESP cost function takes the following form:
where the symbols have the following meaning:
- \(\{q\}\): the set of the atomic charges
- \(V\): the volume where the point-charge ESP (approximated molecular ESP) is compared to the reference ESP (actual molecular ESP). In isolated systems, this is some volume surrounding the molecule. In periodic systems, this is the volume of a single primitive cell.
- \(w(\mathbf{r})\): the weight function with range [0, 1] that selects the relevant parts of the volume for the ESP fit. In HORTON, the weight function of Hu, Lu and Yang is used. [hu2007] This weight function is zero far away from and inside the atoms. It becomes one in the part of the electron density tail where non-bonding contacts are typically found. The weight function smoothly varies from 0 to 1. In the Hu-Lu-Yang paper, a pro-density is recommended for this purpose while we recommend the use of a proper all-electron density or a corrected pseudo-density.
- \(V_\text{ai}(\mathbf{r})\): the actual molecular ESP from ab initio or DFT calculations obtained from programs, like Gaussian, CP2K, VASP, etc.
- \(q_A\): the charge of atom \(A\).
- \(V_\text{point}(\mathbf{r} - \mathbf{R}_A)\): the ESP due to a unit point charge. In an isolated molecule, this is simply \(1/|\mathbf{r} - \mathbf{R}_A|\) (in atomic units). For periodic systems, this part is quite a bit more involved.
- \(\Delta V_\text{ref}\): constant that may account for differences in reference between the ab initio ESP and the point charge ESP. The need for such a term was established in the REPEAT paper for ESP fitting in 3D periodic systems. [campana2009]
This approach differs from traditional ESP fitting methods (RESP, MKS, CHELPG) in the sense that the cost function is defined as an integral rather than a sum over sample points. This idea comes from the Hu-Lu-Yang paper. [hu2007] The main difference with the REPEAT method [campana2009] is that the selected volume for the fit is defined by a smooth weight function. When this weight function is based on an all-electron density, there is no need to define atomic radii as is done usually in ESP fitting.
The cost function is obviously a quadratic function of the unknown parameters, \(X\), the charges (and \(\Delta V_\text{ref}\) in the case of a 3D periodic system), which we can always write as follows:
The script horton-esp-cost.py
constructs the matrix \(A\), the vector
\(B\) and the constant \(C\). These results are stored in an HDF5 file
that is subsequently used by the horton-esp-fit.py
script to obtain
ESP-fitted charges, or by the horton-esp-test.py
script to evaluate the
cost function for a given set of charges.
In the output of these three scripts, the following quantities are reported (if applicable):
- \(\text{RMSD} = \sqrt{\text{COST}/V_w}\)
- \(\text{RRMSD} = \sqrt{\text{COST}/C}\) (expressed as a percentage)
where the cost is computed with a certain set of charges (and the optimal \(\Delta V_\text{ref}\) when applicable). The parameter \(V_w\) is the integral of the weight function. When these charges are the ESP-fitted charges, \(\text{COST}\), \(\text{RMSD}\) and \(\text{RRMSD}\) are minimal. When the charges are all set to zero, you obtain the worst-case value for \(\text{COST}=C\), \(\text{RMSD}=\sqrt{C/V_w}\) and \(\text{RRMSD}=100\%\). (You can still obtain a higher \(\text{COST}\), but only when using completely insensible charges. When that is the case, it is better to set the charges equal to zero to model the ESP.)
5.2.2. horton-esp-cost.py
– Set up an ESP cost function¶
The horton-esp-cost.py
script can be used to construct the cost function for
the ESP fitting or charge testing.
For an all-electron calculation, it is recommended to use:
horton-esp-cost.py esp.cube cost.h5 --wdens=rho.cube --pbc={000|111}
where esp.cube
is a Gaussian cube file containing the actual molecular ESP on a grid. The results will
be written in cost.h5
. The option --wdens=rho.cube
implies that the
weight function is constructed according to the Hu-Lu-Yang method [hu2007] using the given
density cube file. The option --pbc
can be used to construct a cost
function for an isolated system (000
) or a 3D periodic system (111
).
When a pseudo-potential computation is used, the density cube file contains regions of low electron density close to the nucleus. These regions may not be excluded from the fit with the Hu-Lu-Yang weight function. Therefore, HORTON allows you to build the weight function as a product of several factors: \(w(\mathbf{r}) = w_1(\mathbf{r})w_2(\mathbf{r})w_3(\mathbf{r})w_4(\mathbf{r}) \ldots\), where the first one is typically the Hu-Lu-Yang weight function and additional weight functions can be included for every pseudo-core,
For a pseudo-potential calculation, it is recommended to use:
horton-esp-cost.py esp.cube cost.h5 --wdens=rho.cube --pbc={000|111} \
--wnear Z1:r1:gamma1 [Z2:r2:gamma2 ...]
where the new option, --wnear
, takes at least one
argument. This argument must be presented for every element in the system for
which a pseudo potential is used. The parameter Z1
is the atomic number of atom 1. The
parameter r1
is the radius of the core region that you would like to
exclude for atom 1. The parameter gamma1
determines how quickly the weight factor for
elements Z1
switches from 0 (inside a sphere with radius r1
) to 1
(outside a sphere with radius r1
). Both r1
and gamma1
must be given
in angstrom.
The horton-esp-cost.py
script has several more options. As discussed in the
Running HORTON as horton-*.py scripts section, the commands below describe
the arguments that each script take:
horton-esp-cost.py --help
horton-esp-fit.py --help
horton-esp-test.py --help
horton-cubehead.py --help
5.2.3. horton-esp-fit.py
– Fit atomic charges to the ESP¶
Once a cost function is constructed, it can be used to estimate the atomic charges by minimizing the cost function. A bare-bones fit can be carried out as follows:
horton-esp-fit.py cost.h5 charges.h5
where the file cost.h5
is the file generated by the
horton-esp-cost.py
script.
Useful ESP-fitted charges typically involve a more advanced minimizations of
the ESP cost function, for example, by adding constraints, restraints,
transforming to bond-charge increments, fitting to several different molecules
concurrently, etc. Such advanced features are not supported in
horton-esp-fit.py
script, but you can implement these in customized scripts
that use the the ESP cost functions obtained from the horton-esp-cost.py
script.
5.2.4. horton-esp-test.py
– Test the ESP quality of a given set of charges¶
The horton-esp-test.py
script can be used to test the quality of a given set of
charges to reproduce the ESP. These charges are typically obtained
with horton-wpart.py
or horton-cpart.py
. This script can be used as follows:
horton-esp-test.py cost.h5 wpart.h5:hi/charges wpart_espcost.h5:hi
The first file, cost.h5
, contains the cost function and is generated with the
horton-esp-cost.py
script. The second file, wpart.h5
, contains the given atomic chanrges
and, for example, is generated with horton-wpart.py gaussian.fchk wpart.h5:hi hi atoms.h5
. The last file,
wpart_espcost.h5
, will contain the assessment results in the HDF5 group hi
.
5.2.5. Making nice cube files with Gaussian¶
HORTON contains an auxiliary tool, horton-cubehead.py
, to prepare an input
header for a cube file aligned with the molecule of interest. This is more
efficient than the default settings of cubegen, which makes a significant difference in
disk space when working with molecular databases. For occasional use,
horton-cubehead.py
is probably an overkill. The script is used as follows:
horton-cubehead.py structure.xyz cubehead.txt
The cubehead.txt
file will contain something along the following lines:
0 16.5695742234 -2.4411573645 -11.3378429796
-61 -0.0000100512 0.0000288090 0.3779452256
61 -0.2210334948 0.3065726468 -0.0000292468
65 -0.3065726480 -0.2210334949 0.0000086952
This file can be used for the cubegen utility as follows:
cubegen 0 fdensity=scf somefile.fchk rho.cube -1 < cubehead.txt
cubegen 0 potential=scf somefile.fchk esp.cube -1 < cubehead.txt
where scf
must be replaced by the type of wavefunction to be analyzed. Read
the cubegen manual for
more details.
5.2.6. Making cube files with CP2K¶
CP2K can generate cube files for periodic systems. You have to use the input sections E_DENSITY_CUBE and V_HARTREE_CUBE for density and ESP cube files, respectively. (The name V_HARTREE_CUBE
is misleading. CP2K will print out the potential due to the electrons and the nuclei felt by a point charge with charge \(-e\).)
5.2.7. Python interface to the ESP fitting code¶
You can use the ESP cost function constructed by horton-esp-cost.py
script to
implement customized charge fitting protocols, e.g. using bond-charge
increments, constraints or hyperbolic restraints. At the beginning of such
a customized script, the cost function can be loaded as follows:
cost = load_h5("cost.h5")['cost']
The cost
object is an instance of the
ESPCost
class. This
instance can, for example, be used to evaluate the ESP cost or its gradient for a
given array of atomic charges:
print cost.value(charges)
print cost.gradient(charges)
If desired, you can also directly access \(A\), \(B\), \(C\) that define the quadratic cost functions:
print cost._A
print cost._B
print cost._C
For more information, please refer to Introduction.