..
DOCUMENTATION BUILT FROM RELEASE: 2.1.1-1-g3a5adfea (Jan 18, 2019)
..
: HORTON: Helpful Open-source Research TOol for N-fermion systems.
: Copyright (C) 2011-2019 The HORTON Development Team
:
: This file is part of HORTON.
:
: HORTON is free software; you can redistribute it and/or
: modify it under the terms of the GNU General Public License
: as published by the Free Software Foundation; either version 3
: of the License, or (at your option) any later version.
:
: HORTON is distributed in the hope that it will be useful,
: but WITHOUT ANY WARRANTY; without even the implied warranty of
: MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
: GNU General Public License for more details.
:
: You should have received a copy of the GNU General Public License
: along with this program; if not, see
:
: --
Atoms-in-Molecules (AIM) analysis
#################################
Introduction
============
HORTON supports several real-space atoms-in-molecules (AIM) partitioning schemes. For this
purpose, a wavefunction or an electron density on a grid can be loaded from several file
formats. An overview of all supported file formats can be found here:
:ref:`ref_file_formats`. For a given wavefunction or cube file, several AIM partitioning
schemes can be used to derive AIM observables, e.g. atomic charge, atomic multipole
expansion, etc. An overview of features is given in the following table.
================================== ============
Feature References
================================== ============
**Boundary conditions**
0D
1D
2D
3D
**AIM schemes**
Becke [becke1988_multicenter]_
Hirshfeld [hirshfeld1977]_
Iterative Hirshfeld [bultinck2007]_
Iterative Stockholder [lillestolen2008]_
Minimal Basis Iterative Stockolder [verstraelen2016]_
**AIM properties**
Charges
Spin charges
Cartesian multipoles
Pure/harmonic multipoles
Radial moments
Dispersion coefficients [tkatchenko2009]_
Spherical decomposition
ESP due to each AIM [becke1988_poisson]_
ESP due to pro-atom [becke1988_poisson]_
Wiberg bond order
Kohn-Sham response
**Extra options**
Symmetry analysis
================================== ============
For almost all the Hirshfeld variants, you must first set up a pro-atom database
densities, preferably at the same level of theory used for the molecular
computation. The ``horton-atomdb.py`` script facilitates setting up such a
database.
All of the outputs generated by HORTON are written to `HDF5
`_ files (with extension ``.h5``). These are
binary (compact) platform-independent files that can be post-processed easily
with Python scripts. The ``horton-hdf2csv.py`` script can be used to convert
(part of) an HDF5 file into the "comma-separated value" (CSV) format, which is
supported by most spreadsheet software.
The usage of the three scripts (``horton-atomdb.py``, ``horton-wpart.py``, and
``horton-hdf2csv.py``) will be discussed in :ref:`atomdb`, :ref:`wpart`, and
:ref:`hdf2csv`, respectively. All scripts have a ``--help`` flag that prints out their
complete list of command-line arguments. The penultimate section, :ref:`partition`, shows
how to use the partitioning code through the Python interface. The last section,
:ref:`faq`, answers some frequently asked questions regarding AIM partitioning with
HORTON.
.. _atomdb:
``horton-atomdb.py`` -- Build a pro-atom database
==================================================
The ``horton-atomdb.py`` script should be used in three steps to build a pro-atom database:
1) **Generate input files** for isolated atom computations using one of the
following programs: Gaussian03/09, Orca, PSI4 or CP2K. The following example
generates Gaussian09 input files for hydrogen, carbon, nitrogen and oxygen::
horton-atomdb.py input g09 1,6-8 template.com
The ``template.com`` file is used to generate the input files and will
be discussed in detail below. A series of directories is created with input
files for the atomic computations, like ``001__h_001_q+00``, ``001__h_002_q-01``,
``001__h_003_q-02``, etc. Optional arguments can be used to control the range
of cations and anions, the spin multiplicities, etc. Also a
``run_g09.sh`` script is generated to take care of the next step.
Run the command below to obtain a complete list of all arguments:
.. code-block:: bash
horton-atomdb.py input --help
2) **Run the atomic computations** by executing the ``run_PROGRAM.sh`` script.
In this case::
./run_g09.sh
Note that the ``run_PROGRAM.sh`` scripts assume a default installation of
the corresponding software. In the case of non-standard installations or
when special calls or environment variables are needed, you must modify this
script first. (It will not be overwritten when ``horton-atomdb.py input ...``
is executed again.)
3) **Convert the output files** of the external programs into a database of
spherically averaged pro-atom densities (``atoms.h5``). Just run::
./horton-atomdb.py convert
This script also generates figures of the radial densities and Fukui
functions, if ``matplotlib`` is installed. In this step, you may use
:ref:`the -- grid option `, although the default setting
should be fine for nearly all cases.
If you remove some directories for atomic computations before or after
executing the ``run_PROGRAM.sh`` script, the corresponding atoms will not be
included in the database. Similarly, you may rerun ``horton-atomdb.py input ...``
to generate more input files. In that case ``run_PROGRAM.sh`` will only consider
the atomic computations that have not been completed yet. In some cases, the
``run_PROGRAM.sh`` script needs to be customized, e.g. you may want to use
``mpirun`` to run the atomic computations in parallel. When such modifications
are made, subsequent runs of ``horton-atomdb.py input ...`` will not overwrite
the ``run_PROGRAM.sh`` script.
Template files
--------------
A template file is simply an input file for an atomic computation, where the
distinguishing parameters (element, charge, ...) are replaced with keys that are
recognized by the input generator of ``horton-atomdb.py``. These keys are:
* ``${element}``: The element symbol of the atom.
* ``${number}``: The atomic number of the atom.
* ``${charge}``: The charge of the atom (or cation, or anion).
* ``${mult}``: The spin multiplicity of the atom.
For more advanced cases, you may include (parts of) other files with generic
keys, e.g. for basis sets that are different for every element:
* ``${file:filename}``: This is replaced by the contents of
``filename.NNN_PPP_MM``, where ``NNN`` is the atomic number, ``PPP`` is the
atomic population and ``MM`` is the spin multiplicity. These numbers are
left-padded with zeros to fix the the length. If a field in the filename is
zero, it is considered as a wild card. For example, you may use
``${file:basis}`` in the template file and store a basis set specification for
oxygen in the file ``basis.008_000_00``. The basis contained in this file will be
used for all cations, anions, and multiplicities of oxygen, because the
zeros in the file name act as a wildcard.
* ``${line:filename}``: This is comparable to the previous key, except that all
replacements are stored in one file. Each (non-empty) line in this file that
starts with ``NNN_PPP_MM`` has a string that will be filled into the field of
the file that corresponds to the specified ``NNN_PPP_MM``.
None of the keys are mandatory, although ``${element}`` (or ``${number}``),
``${charge}`` and ``${mult}`` must be present to obtain sensible results.
Basic template file for Gaussian 03/09
---------------------------------------
This is a simple template file for atomic computations at the HF/3-21G level::
%chk=atom.chk
#p HF/3-21G scf(xqc)
A random title line
${charge} ${mult}
${element} 0.0 0.0 0.0
Do not forget to include an empty line at the end. Otherwise, Gaussian will
complain. The first line ``%chk=atom.chk`` is required to write the atomic
wavefunction to a file that is needed in the ``convert`` step of
the ``horton-atomdb.py`` script.
Advanced template file for Gaussian 03/09
-----------------------------------------
When custom basis sets are specified with the ``Gen`` keyword in Gaussian, you
have to use keys that include other files. For a database containing H, C and O, you
could use the following template:
* ``template.com``::
%chk=atom.chk
#p PBE1PBE/Gen scf(xqc)
A random title line
${charge} ${mult}
${element} 0.0 0.0 0.0
${file:basis}
* ``basis.001_000_00``::
H 0
6-31G(d,p)
****
* ``basis.006_000_00``::
C 0
6-31G(d,p)
****
* ``basis.008_000_00``::
O 0
S 6 1.00
8588.5000000 0.00189515
1297.2300000 0.0143859
299.2960000 0.0707320
87.3771000 0.2400010
25.6789000 0.5947970
3.7400400 0.2808020
SP 3 1.00
42.1175000 0.1138890 0.0365114
9.6283700 0.9208110 0.2371530
2.8533200 -0.00327447 0.8197020
SP 1 1.00
0.9056610 1.0000000 1.0000000
SP 1 1.00
0.2556110 1.0000000 1.0000000
SP 1 1.00
0.0845000 1.0000000 1.0000000
D 1 1.00
5.1600000 1.0000000
D 1 1.00
1.2920000 1.0000000
D 1 1.00
0.3225000 1.0000000
F 1 1.00
1.4000000 1.0000000
****
Simple template file for ORCA
-----------------------------
The following template file uses the built-in ``cc_pVQZ`` basis set of ORCA::
!HF TightSCF
%basis
Basis cc_pVQZ
end
*xyz ${charge} ${mult}
${element} 0.0 0.0 0.0
*
Template file for CP2K
----------------------
You must use ``CP2K`` version 2.4-r12857 (or newer). The computation of pro-atoms
with ``CP2K`` is more involved because you have to specify the occupation
of each subshell. The ``ATOM`` program of ``CP2K`` does not simply follow the
Aufbau rule to assign orbital occupations. At this moment, only the computation of
atomic densities with contracted basis sets and pseudopotentials are supported.
The following example can be used to generate a pro-atom database with the
elements, O, Na, Al and Si, using the GTH pseudopotential and the MolOpt basis
set.
* ``template.inp``::
&GLOBAL
PROJECT ATOM
PROGRAM_NAME ATOM
&END GLOBAL
&ATOM
ATOMIC_NUMBER ${number}
ELECTRON_CONFIGURATION (${mult}) CORE ${line:valence.inc}
CORE none
MAX_ANGULAR_MOMENTUM 1
&METHOD
METHOD_TYPE UKS
&XC
&XC_FUNCTIONAL PBE
&END XC_FUNCTIONAL
&END XC
&END METHOD
&POTENTIAL
PSEUDO_TYPE GTH
POTENTIAL_FILE_NAME ../../PBE_PSEUDOPOTENTIALS
POTENTIAL_NAME ${line:ppot.inc}
&END POTENTIAL
&PP_BASIS
BASIS_SET_FILE_NAME ../../BASIS_MOLOPT
BASIS_TYPE CONTRACTED_GTO
BASIS_SET DZVP-MOLOPT-SR-GTH
&END PP_BASIS
&PRINT
&BASIS_SET ON
&END
&ORBITALS ON
&END
&POTENTIAL ON
&END
&END
&END ATOM
* ``ppot.inc``::
008_000_00 GTH-PBE-q6
011_000_00 GTH-PBE-q9
013_000_00 GTH-PBE-q3
014_000_00 GTH-PBE-q4
* ``valence.inc``::
008_005_02 1s2 2p1
008_006_03 1s2 2p2
008_007_04 1s2 2p3
008_008_03 1s2 2p4
008_009_02 1s2 2p5
008_010_01 1s2 2p6
011_005_02 1s2 2p1
011_006_03 1s2 2p2
011_007_04 1s2 2p3
011_008_03 1s2 2p4
011_009_02 1s2 2p5
011_010_01 1s2 2p6
011_011_02 1s2 2p6 2s1
011_012_01 1s2 2p6 2s2
011_013_02 1s2 2p6 2s2 3p1
013_011_02 1s1
013_012_01 1s2
013_013_02 1s2 2p1
013_014_03 1s2 2p2
014_011_02 1s1
014_012_01 1s2
014_013_02 1s2 2p1
014_014_03 1s2 2p2
Simple template file for PSI4
-----------------------------
The following template file uses the BLYP functional and the built-in
``cc-pvdz`` basis set of PSI4::
molecule {
${charge} ${mult}
${element} 0.0 0.0 0.0
}
set {
basis cc-pvdz
scf_type df
guess sad
molden_write true
reference uhf
}
energy('blyp')
Note that the flags ``molden_write true`` and ``reference uhf`` are required.
The former instructs the SCF program to write the orbitals, which HORTON picks
up to compute atomic densities. The latter is needed because, usually, most of the
atomic computations are on open-shell systems. Because PSI4 only writes Molden
files for SCF computations, it is not possible to prepare atomic densities with
other levels of theory with PSI4.
.. _wpart:
``horton-wpart.py`` -- AIM analysis based on a wavefunction file
================================================================
The basic usage of ``horton-wpart.py`` is as follows::
horton-wpart.py wfn output.h5[:group] {b,h,hi,is,he} [atoms.h5]
This script has four important arguments:
1. The ``wfn`` argument is the wave-function file, like a Gaussian03 or 09 formatted
checkpoint file, a Molekel file or a Molden input file. Many programs can write out
the wavefunction in the Molden format, albeit with some mistakes. For ORCA, PSI4 and
Turbomole, these mistakes are detected and corrected.
2. The output file (HDF5 format) in which the partitioning results are written.
Usually, the output file has the
``.h5`` extension. The output file is optionally followed by a colon and a group
suffix. This suffix can be used to specify a group in the HDF5 output file
where the output is to be stored.
3. The third argument refers to the partitioning scheme:
* ``b``: Becke partitioning. [becke1988_multicenter]_
* ``h``: Hirshfeld partitioning. [hirshfeld1977]_
* ``hi``: Iterative Hirshfeld partitioning. [bultinck2007]_
* ``is``: Iterative Stockholder partitioning. [lillestolen2008]_
* ``mbis``: Minimal Basis Iterative Stockholder partitioning. [verstraelen2016]_
4. The ``atoms.h5`` argument is only needed for the Hirshfeld variants ``h`` and ``hi``,
not for ``b``, ``is`` or ``mbis``. The file ``atoms.h5`` contains a database of
pro-atoms and can be generated as explained in :ref:`atomdb`.
This script computes atomic weight functions, and derives all AIM observables that are
implemented for that scheme. These results are stored in the specified HDF5 output file.
An elaborate description of all command-line arguments can be obtained from:
.. code-block:: bash
horton-wpart.py --help
The integration grid can be tuned with :ref:`ref_grid_option`.
.. note::
When a post-Hartree-Fock (post-HF) level is used in Gaussian 03/09 (MP2, MP3, CC or
CI), you must add the keyword ``Density=current`` to the route section in the
Gaussian input file. This is needed to have the corresponding density matrix
in the formatted checkpoint file. When such a post-HF density matrix is
present, HORTON will load that density matrix instead of the SCF density
matrix. Also note that for some levels of theory, no 1RDM is constructed,
including MP4, MP5, ZINDO and QCISD(T).
.. _partition:
Python interface to the partitioning code
=========================================
The ``horton-wpart.py`` script has a rather intuitive Python interface that allows you to
run a more customized analysis. The script ``data/examples/wpart/becke.py`` is a simple
example that runs a Becke partitioning to compute charges and that writes the charges into
a simple text file.
.. literalinclude:: ../data/examples/wpart/becke.py
:caption: data/examples/wpart/becke.py
:lines: 3-20
The unit tests in the source code contain many small examples that can be used
as a starting point for similar scripts. These unit tests can be found in
``horton/part/test/test_wpart.py``.
.. _faq:
Frequently asked questions
==========================
**Which atoms-in-molecules (AIM) partitioning scheme should I use?**
There is no single partitioning scheme that is most suitable for any
application. Nevertheless, some people claim that only one AIM scheme should
be used above all others, especially in the QTAIM and electron density
communities. Try to stay away from such fanboyism, as it has little to do
with good science.
In practice, the choice depends on the purposes you have in mind. You should
try to select a scheme that shows some desirable behavior.
Typically, you would like to have a compromise between some of the following
features:
* Numerical stability
* Robustness with respect to conformational changes
* Uniqueness of the result
* Computational efficiency
* Mathematical elegance
* Simplicity
* Linearity in the density (matrix)
* Chemical transferability
* Applicable to a broad range of systems
* Applicable to periodic/isolated systems
* Applicable to a broad range of electronic structure methods/implementations
* Accuracy of electrostatic interactions with a limited multipole expansion
of the atoms
* Good (empirical) correlations between some AIM property with some experimental
observable
* ...
It goes beyond the scope of this FAQ to describe how each partitioning scheme
(implemented in HORTON or not) performs for these features. Some of these
features are very difficult to assess and, therefore, are subject to intense
debate in the literature.
Regarding our own work, the following papers are directly related to this
question:
* [verstraelen2011a]_ "Assessment of Atomic Charge Models for Gas-Phase Computations on Polypeptides"
* [verstraelen2012a]_ "The conformational sensitivity of iterative stockholder partitioning schemes"
* [verstraelen2013]_ "Hirshfeld-E Partitioning: AIM Charges with an Improved Trade-off between Robustness and Accurate Electrostatics"
* [verstraelen2016]_ "Minimal Basis Iterative Stockholder: Atoms in Molecules for Force-Field Development"
The following are also related to this question, but the list is far from
complete:
* [bultinck2007]_ "Critical analysis and extension of the Hirshfeld atoms in molecules"
* [bultinck2007b]_ "Uniqueness and basis set dependence of iterative Hirshfeld charges"
If you have more suggestions, please drop me a note: Toon.Verstraelen@UGent.be.
**What is the recommended level of theory for computing the pro-atom databses?**
This question is relevant to the following methods implemented in HORTON: Hirshfeld
and Hirshfeld-I, which require a pro-atoms database. They are referred to in this
answer as Hirshfeld-like schemes.
In principle, you are free to choose any level of theory. In
practice, several papers tend to be consistent in the level of theory (and
basis set) that is used for the molecular and pro-atomic computations.
See for example: [bultinck2007]_, [verstraelen2009]_, [verstraelen2011a]_,
[verstraelen2011b]_, [vanduyfhuys2012]_, [verstraelen2012a]_,
[verstraelen2012b]_, [verstraelen2013]_.
One motivation for this consistency is that Hirshfeld-like partitioning
schemes yield atoms-in-molecules that are maximally similar to the
pro-atoms. [nalewajski2000]_ (Technically speaking, the sum
of the `Kullback-Leibler divergence
`_
between the atomic and pro-atomic densities for all the atoms is minimized.)
This principle is (or should be) one of the reasons that Hirshfeld-like
charges are somewhat transferable between chemically similar atoms. You
could hope that the Kullback-Leibler divergence is easier to minimize when
the molecular and pro-atomic densities are computed as consistently as
possible. Thus, using the same level of theory and basis set is advised.
Another motivation is that such consistency may (to some degree) result in,
cancellation of errors, when comparing Hirshfeld-like charges computed at different
levels of theory. For example, it is found that Hirshfeld-I charges have
only a small basis set dependency based on [bultinck2007b]_. In this work, the
molecular and pro-atomic densities were computed consistently.
At the end, it can also be argued that without consistency in the level of
theory, there are so many possible combinations that it becomes impossible
to make a well-motivated choice. Then again, this is a problem that
computational chemists usually embrace with open arms hoping that they
will find a unique combination of levels of theory that turns out to be
useful.
**How can one generate Cube files with CP2K?**
Electron density cube files can be generated with CP2K by adding the following section to the CP2K input file: `E_DENSITY_CUBE `_.