.. DOCUMENTATION BUILT FROM RELEASE: 2.1.1-1-g3a5adfea (Jan 18, 2019) .. : HORTON: Helpful Open-source Research TOol for N-fermion systems. : Copyright (C) 2011-2019 The HORTON Development Team : : This file is part of HORTON. : : HORTON is free software; you can redistribute it and/or : modify it under the terms of the GNU General Public License : as published by the Free Software Foundation; either version 3 : of the License, or (at your option) any later version. : : HORTON is distributed in the hope that it will be useful, : but WITHOUT ANY WARRANTY; without even the implied warranty of : MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the : GNU General Public License for more details. : : You should have received a copy of the GNU General Public License : along with this program; if not, see : : -- Atoms-in-Molecules (AIM) analysis ################################# Introduction ============ HORTON supports several real-space atoms-in-molecules (AIM) partitioning schemes. For this purpose, a wavefunction or an electron density on a grid can be loaded from several file formats. An overview of all supported file formats can be found here: :ref:`ref_file_formats`. For a given wavefunction or cube file, several AIM partitioning schemes can be used to derive AIM observables, e.g. atomic charge, atomic multipole expansion, etc. An overview of features is given in the following table. ================================== ============ Feature References ================================== ============ **Boundary conditions** 0D 1D 2D 3D **AIM schemes** Becke [becke1988_multicenter]_ Hirshfeld [hirshfeld1977]_ Iterative Hirshfeld [bultinck2007]_ Iterative Stockholder [lillestolen2008]_ Minimal Basis Iterative Stockolder [verstraelen2016]_ **AIM properties** Charges Spin charges Cartesian multipoles Pure/harmonic multipoles Radial moments Dispersion coefficients [tkatchenko2009]_ Spherical decomposition ESP due to each AIM [becke1988_poisson]_ ESP due to pro-atom [becke1988_poisson]_ Wiberg bond order Kohn-Sham response **Extra options** Symmetry analysis ================================== ============ For almost all the Hirshfeld variants, you must first set up a pro-atom database densities, preferably at the same level of theory used for the molecular computation. The ``horton-atomdb.py`` script facilitates setting up such a database. All of the outputs generated by HORTON are written to `HDF5 `_ files (with extension ``.h5``). These are binary (compact) platform-independent files that can be post-processed easily with Python scripts. The ``horton-hdf2csv.py`` script can be used to convert (part of) an HDF5 file into the "comma-separated value" (CSV) format, which is supported by most spreadsheet software. The usage of the three scripts (``horton-atomdb.py``, ``horton-wpart.py``, and ``horton-hdf2csv.py``) will be discussed in :ref:`atomdb`, :ref:`wpart`, and :ref:`hdf2csv`, respectively. All scripts have a ``--help`` flag that prints out their complete list of command-line arguments. The penultimate section, :ref:`partition`, shows how to use the partitioning code through the Python interface. The last section, :ref:`faq`, answers some frequently asked questions regarding AIM partitioning with HORTON. .. _atomdb: ``horton-atomdb.py`` -- Build a pro-atom database ================================================== The ``horton-atomdb.py`` script should be used in three steps to build a pro-atom database: 1) **Generate input files** for isolated atom computations using one of the following programs: Gaussian03/09, Orca, PSI4 or CP2K. The following example generates Gaussian09 input files for hydrogen, carbon, nitrogen and oxygen:: horton-atomdb.py input g09 1,6-8 template.com The ``template.com`` file is used to generate the input files and will be discussed in detail below. A series of directories is created with input files for the atomic computations, like ``001__h_001_q+00``, ``001__h_002_q-01``, ``001__h_003_q-02``, etc. Optional arguments can be used to control the range of cations and anions, the spin multiplicities, etc. Also a ``run_g09.sh`` script is generated to take care of the next step. Run the command below to obtain a complete list of all arguments: .. code-block:: bash horton-atomdb.py input --help 2) **Run the atomic computations** by executing the ``run_PROGRAM.sh`` script. In this case:: ./run_g09.sh Note that the ``run_PROGRAM.sh`` scripts assume a default installation of the corresponding software. In the case of non-standard installations or when special calls or environment variables are needed, you must modify this script first. (It will not be overwritten when ``horton-atomdb.py input ...`` is executed again.) 3) **Convert the output files** of the external programs into a database of spherically averaged pro-atom densities (``atoms.h5``). Just run:: ./horton-atomdb.py convert This script also generates figures of the radial densities and Fukui functions, if ``matplotlib`` is installed. In this step, you may use :ref:`the -- grid option `, although the default setting should be fine for nearly all cases. If you remove some directories for atomic computations before or after executing the ``run_PROGRAM.sh`` script, the corresponding atoms will not be included in the database. Similarly, you may rerun ``horton-atomdb.py input ...`` to generate more input files. In that case ``run_PROGRAM.sh`` will only consider the atomic computations that have not been completed yet. In some cases, the ``run_PROGRAM.sh`` script needs to be customized, e.g. you may want to use ``mpirun`` to run the atomic computations in parallel. When such modifications are made, subsequent runs of ``horton-atomdb.py input ...`` will not overwrite the ``run_PROGRAM.sh`` script. Template files -------------- A template file is simply an input file for an atomic computation, where the distinguishing parameters (element, charge, ...) are replaced with keys that are recognized by the input generator of ``horton-atomdb.py``. These keys are: * ``${element}``: The element symbol of the atom. * ``${number}``: The atomic number of the atom. * ``${charge}``: The charge of the atom (or cation, or anion). * ``${mult}``: The spin multiplicity of the atom. For more advanced cases, you may include (parts of) other files with generic keys, e.g. for basis sets that are different for every element: * ``${file:filename}``: This is replaced by the contents of ``filename.NNN_PPP_MM``, where ``NNN`` is the atomic number, ``PPP`` is the atomic population and ``MM`` is the spin multiplicity. These numbers are left-padded with zeros to fix the the length. If a field in the filename is zero, it is considered as a wild card. For example, you may use ``${file:basis}`` in the template file and store a basis set specification for oxygen in the file ``basis.008_000_00``. The basis contained in this file will be used for all cations, anions, and multiplicities of oxygen, because the zeros in the file name act as a wildcard. * ``${line:filename}``: This is comparable to the previous key, except that all replacements are stored in one file. Each (non-empty) line in this file that starts with ``NNN_PPP_MM`` has a string that will be filled into the field of the file that corresponds to the specified ``NNN_PPP_MM``. None of the keys are mandatory, although ``${element}`` (or ``${number}``), ``${charge}`` and ``${mult}`` must be present to obtain sensible results. Basic template file for Gaussian 03/09 --------------------------------------- This is a simple template file for atomic computations at the HF/3-21G level:: %chk=atom.chk #p HF/3-21G scf(xqc) A random title line ${charge} ${mult} ${element} 0.0 0.0 0.0 Do not forget to include an empty line at the end. Otherwise, Gaussian will complain. The first line ``%chk=atom.chk`` is required to write the atomic wavefunction to a file that is needed in the ``convert`` step of the ``horton-atomdb.py`` script. Advanced template file for Gaussian 03/09 ----------------------------------------- When custom basis sets are specified with the ``Gen`` keyword in Gaussian, you have to use keys that include other files. For a database containing H, C and O, you could use the following template: * ``template.com``:: %chk=atom.chk #p PBE1PBE/Gen scf(xqc) A random title line ${charge} ${mult} ${element} 0.0 0.0 0.0 ${file:basis} * ``basis.001_000_00``:: H 0 6-31G(d,p) **** * ``basis.006_000_00``:: C 0 6-31G(d,p) **** * ``basis.008_000_00``:: O 0 S 6 1.00 8588.5000000 0.00189515 1297.2300000 0.0143859 299.2960000 0.0707320 87.3771000 0.2400010 25.6789000 0.5947970 3.7400400 0.2808020 SP 3 1.00 42.1175000 0.1138890 0.0365114 9.6283700 0.9208110 0.2371530 2.8533200 -0.00327447 0.8197020 SP 1 1.00 0.9056610 1.0000000 1.0000000 SP 1 1.00 0.2556110 1.0000000 1.0000000 SP 1 1.00 0.0845000 1.0000000 1.0000000 D 1 1.00 5.1600000 1.0000000 D 1 1.00 1.2920000 1.0000000 D 1 1.00 0.3225000 1.0000000 F 1 1.00 1.4000000 1.0000000 **** Simple template file for ORCA ----------------------------- The following template file uses the built-in ``cc_pVQZ`` basis set of ORCA:: !HF TightSCF %basis Basis cc_pVQZ end *xyz ${charge} ${mult} ${element} 0.0 0.0 0.0 * Template file for CP2K ---------------------- You must use ``CP2K`` version 2.4-r12857 (or newer). The computation of pro-atoms with ``CP2K`` is more involved because you have to specify the occupation of each subshell. The ``ATOM`` program of ``CP2K`` does not simply follow the Aufbau rule to assign orbital occupations. At this moment, only the computation of atomic densities with contracted basis sets and pseudopotentials are supported. The following example can be used to generate a pro-atom database with the elements, O, Na, Al and Si, using the GTH pseudopotential and the MolOpt basis set. * ``template.inp``:: &GLOBAL PROJECT ATOM PROGRAM_NAME ATOM &END GLOBAL &ATOM ATOMIC_NUMBER ${number} ELECTRON_CONFIGURATION (${mult}) CORE ${line:valence.inc} CORE none MAX_ANGULAR_MOMENTUM 1 &METHOD METHOD_TYPE UKS &XC &XC_FUNCTIONAL PBE &END XC_FUNCTIONAL &END XC &END METHOD &POTENTIAL PSEUDO_TYPE GTH POTENTIAL_FILE_NAME ../../PBE_PSEUDOPOTENTIALS POTENTIAL_NAME ${line:ppot.inc} &END POTENTIAL &PP_BASIS BASIS_SET_FILE_NAME ../../BASIS_MOLOPT BASIS_TYPE CONTRACTED_GTO BASIS_SET DZVP-MOLOPT-SR-GTH &END PP_BASIS &PRINT &BASIS_SET ON &END &ORBITALS ON &END &POTENTIAL ON &END &END &END ATOM * ``ppot.inc``:: 008_000_00 GTH-PBE-q6 011_000_00 GTH-PBE-q9 013_000_00 GTH-PBE-q3 014_000_00 GTH-PBE-q4 * ``valence.inc``:: 008_005_02 1s2 2p1 008_006_03 1s2 2p2 008_007_04 1s2 2p3 008_008_03 1s2 2p4 008_009_02 1s2 2p5 008_010_01 1s2 2p6 011_005_02 1s2 2p1 011_006_03 1s2 2p2 011_007_04 1s2 2p3 011_008_03 1s2 2p4 011_009_02 1s2 2p5 011_010_01 1s2 2p6 011_011_02 1s2 2p6 2s1 011_012_01 1s2 2p6 2s2 011_013_02 1s2 2p6 2s2 3p1 013_011_02 1s1 013_012_01 1s2 013_013_02 1s2 2p1 013_014_03 1s2 2p2 014_011_02 1s1 014_012_01 1s2 014_013_02 1s2 2p1 014_014_03 1s2 2p2 Simple template file for PSI4 ----------------------------- The following template file uses the BLYP functional and the built-in ``cc-pvdz`` basis set of PSI4:: molecule { ${charge} ${mult} ${element} 0.0 0.0 0.0 } set { basis cc-pvdz scf_type df guess sad molden_write true reference uhf } energy('blyp') Note that the flags ``molden_write true`` and ``reference uhf`` are required. The former instructs the SCF program to write the orbitals, which HORTON picks up to compute atomic densities. The latter is needed because, usually, most of the atomic computations are on open-shell systems. Because PSI4 only writes Molden files for SCF computations, it is not possible to prepare atomic densities with other levels of theory with PSI4. .. _wpart: ``horton-wpart.py`` -- AIM analysis based on a wavefunction file ================================================================ The basic usage of ``horton-wpart.py`` is as follows:: horton-wpart.py wfn output.h5[:group] {b,h,hi,is,he} [atoms.h5] This script has four important arguments: 1. The ``wfn`` argument is the wave-function file, like a Gaussian03 or 09 formatted checkpoint file, a Molekel file or a Molden input file. Many programs can write out the wavefunction in the Molden format, albeit with some mistakes. For ORCA, PSI4 and Turbomole, these mistakes are detected and corrected. 2. The output file (HDF5 format) in which the partitioning results are written. Usually, the output file has the ``.h5`` extension. The output file is optionally followed by a colon and a group suffix. This suffix can be used to specify a group in the HDF5 output file where the output is to be stored. 3. The third argument refers to the partitioning scheme: * ``b``: Becke partitioning. [becke1988_multicenter]_ * ``h``: Hirshfeld partitioning. [hirshfeld1977]_ * ``hi``: Iterative Hirshfeld partitioning. [bultinck2007]_ * ``is``: Iterative Stockholder partitioning. [lillestolen2008]_ * ``mbis``: Minimal Basis Iterative Stockholder partitioning. [verstraelen2016]_ 4. The ``atoms.h5`` argument is only needed for the Hirshfeld variants ``h`` and ``hi``, not for ``b``, ``is`` or ``mbis``. The file ``atoms.h5`` contains a database of pro-atoms and can be generated as explained in :ref:`atomdb`. This script computes atomic weight functions, and derives all AIM observables that are implemented for that scheme. These results are stored in the specified HDF5 output file. An elaborate description of all command-line arguments can be obtained from: .. code-block:: bash horton-wpart.py --help The integration grid can be tuned with :ref:`ref_grid_option`. .. note:: When a post-Hartree-Fock (post-HF) level is used in Gaussian 03/09 (MP2, MP3, CC or CI), you must add the keyword ``Density=current`` to the route section in the Gaussian input file. This is needed to have the corresponding density matrix in the formatted checkpoint file. When such a post-HF density matrix is present, HORTON will load that density matrix instead of the SCF density matrix. Also note that for some levels of theory, no 1RDM is constructed, including MP4, MP5, ZINDO and QCISD(T). .. _partition: Python interface to the partitioning code ========================================= The ``horton-wpart.py`` script has a rather intuitive Python interface that allows you to run a more customized analysis. The script ``data/examples/wpart/becke.py`` is a simple example that runs a Becke partitioning to compute charges and that writes the charges into a simple text file. .. literalinclude:: ../data/examples/wpart/becke.py :caption: data/examples/wpart/becke.py :lines: 3-20 The unit tests in the source code contain many small examples that can be used as a starting point for similar scripts. These unit tests can be found in ``horton/part/test/test_wpart.py``. .. _faq: Frequently asked questions ========================== **Which atoms-in-molecules (AIM) partitioning scheme should I use?** There is no single partitioning scheme that is most suitable for any application. Nevertheless, some people claim that only one AIM scheme should be used above all others, especially in the QTAIM and electron density communities. Try to stay away from such fanboyism, as it has little to do with good science. In practice, the choice depends on the purposes you have in mind. You should try to select a scheme that shows some desirable behavior. Typically, you would like to have a compromise between some of the following features: * Numerical stability * Robustness with respect to conformational changes * Uniqueness of the result * Computational efficiency * Mathematical elegance * Simplicity * Linearity in the density (matrix) * Chemical transferability * Applicable to a broad range of systems * Applicable to periodic/isolated systems * Applicable to a broad range of electronic structure methods/implementations * Accuracy of electrostatic interactions with a limited multipole expansion of the atoms * Good (empirical) correlations between some AIM property with some experimental observable * ... It goes beyond the scope of this FAQ to describe how each partitioning scheme (implemented in HORTON or not) performs for these features. Some of these features are very difficult to assess and, therefore, are subject to intense debate in the literature. Regarding our own work, the following papers are directly related to this question: * [verstraelen2011a]_ "Assessment of Atomic Charge Models for Gas-Phase Computations on Polypeptides" * [verstraelen2012a]_ "The conformational sensitivity of iterative stockholder partitioning schemes" * [verstraelen2013]_ "Hirshfeld-E Partitioning: AIM Charges with an Improved Trade-off between Robustness and Accurate Electrostatics" * [verstraelen2016]_ "Minimal Basis Iterative Stockholder: Atoms in Molecules for Force-Field Development" The following are also related to this question, but the list is far from complete: * [bultinck2007]_ "Critical analysis and extension of the Hirshfeld atoms in molecules" * [bultinck2007b]_ "Uniqueness and basis set dependence of iterative Hirshfeld charges" If you have more suggestions, please drop me a note: Toon.Verstraelen@UGent.be. **What is the recommended level of theory for computing the pro-atom databses?** This question is relevant to the following methods implemented in HORTON: Hirshfeld and Hirshfeld-I, which require a pro-atoms database. They are referred to in this answer as Hirshfeld-like schemes. In principle, you are free to choose any level of theory. In practice, several papers tend to be consistent in the level of theory (and basis set) that is used for the molecular and pro-atomic computations. See for example: [bultinck2007]_, [verstraelen2009]_, [verstraelen2011a]_, [verstraelen2011b]_, [vanduyfhuys2012]_, [verstraelen2012a]_, [verstraelen2012b]_, [verstraelen2013]_. One motivation for this consistency is that Hirshfeld-like partitioning schemes yield atoms-in-molecules that are maximally similar to the pro-atoms. [nalewajski2000]_ (Technically speaking, the sum of the `Kullback-Leibler divergence `_ between the atomic and pro-atomic densities for all the atoms is minimized.) This principle is (or should be) one of the reasons that Hirshfeld-like charges are somewhat transferable between chemically similar atoms. You could hope that the Kullback-Leibler divergence is easier to minimize when the molecular and pro-atomic densities are computed as consistently as possible. Thus, using the same level of theory and basis set is advised. Another motivation is that such consistency may (to some degree) result in, cancellation of errors, when comparing Hirshfeld-like charges computed at different levels of theory. For example, it is found that Hirshfeld-I charges have only a small basis set dependency based on [bultinck2007b]_. In this work, the molecular and pro-atomic densities were computed consistently. At the end, it can also be argued that without consistency in the level of theory, there are so many possible combinations that it becomes impossible to make a well-motivated choice. Then again, this is a problem that computational chemists usually embrace with open arms hoping that they will find a unique combination of levels of theory that turns out to be useful. **How can one generate Cube files with CP2K?** Electron density cube files can be generated with CP2K by adding the following section to the CP2K input file: `E_DENSITY_CUBE `_.