.. DOCUMENTATION BUILT FROM RELEASE: 2.0.2 (Jun 30, 2017)

..
    : HORTON: Helpful Open-source Research TOol for N-fermion systems.
    : Copyright (C) 2011-2016 The HORTON Development Team
    :
    : This file is part of HORTON.
    :
    : HORTON is free software; you can redistribute it and/or
    : modify it under the terms of the GNU General Public License
    : as published by the Free Software Foundation; either version 3
    : of the License, or (at your option) any later version.
    :
    : HORTON is distributed in the hope that it will be useful,
    : but WITHOUT ANY WARRANTY; without even the implied warranty of
    : MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    : GNU General Public License for more details.
    :
    : You should have received a copy of the GNU General Public License
    : along with this program; if not, see
    :
    : --

.. _ref_file_formats:

Data file formats (input and output)
####################################

.. |br| raw:: html

   <br />

This section gives an overview of the file formats supported by HORTON. Some
formats can be used for both input and output, others only for input or only
for output. The formats are always used in the same way:

* To load data from a file, use the
  :py:meth:`~horton.io.iodata.IOData.from_file` method of the ``IOData`` class:

  .. code-block:: python

      mol = IOData.from_file('example.xyz')

  The format is recognized through the file extension (or in some cases by a
  prefix, as indicated in the following sections). The loaded data are
  accessible as attributes of the ``mol`` object, e.g.:

  .. code-block:: python

      print(mol.coordinates)

  Each file format has its corresponding set of attributes that are filled
  with data read from the file. For some formats, the available attributes may
  also depend on the data available in the file.

* To dump data into a file, create an ``IOData`` instance, assign attributes
  to this instance and call the :py:meth:`~horton.io.iodata.IOData.to_file`
  method, e.g.:

  .. code-block:: python

      import numpy as np
      from horton import IOData

      mol = IOData(title='Example')
      mol.numbers = np.array([10])
      mol.coordinates = np.array([[0.0, 0.0, 0.0]])
      mol.to_file('example.xyz')

  As shown in the above example, there are two ways to set the attributes:
  (i) by passing them as arguments to the constructor of the ``IOData`` class
  (``title`` in the example) or (ii) by setting the attributes after creating
  an ``IOData`` instance (``numbers`` and ``coordinates`` in the example).
  Again, the file format is deduced from the file name. If not all required
  attributes for a given format are set, the ``to_file`` method will raise an
  ``AttributeError``.

The complete list of all possible attributes (the superset for all supported
formats) is documented here: :py:class:`horton.io.iodata.IOData`. Note that
HORTON's internal format supports all of these and any other attribute that you
assign to an ``IOData`` instance.

..
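The recognition rules listed in the tables of this section (file extension, file prefix, or a substring of the file name) can be pictured with a small, self-contained sketch. The function ``guess_format`` and the lookup tables below are hypothetical names made up for this illustration; they only mirror the "Recognized by" rows of this document and are not taken from the HORTON source code.

```python
import os

# Hypothetical lookup tables assembled from the "Recognized by" rows in this
# document. They are an illustration only, not HORTON's implementation.
EXTENSIONS = {
    '.xyz': 'xyz', '.cif': 'cif', '.cube': 'cube', '.fchk': 'fchk',
    '.molden': 'molden', '.mkl': 'mkl', '.wfn': 'wfn', '.log': 'gaussian-log',
    '.h5': 'internal',
}
PREFIXES = {'POSCAR': 'poscar', 'CHGCAR': 'chgcar', 'LOCPOT': 'locpot'}


def guess_format(filename):
    """Return a format label for ``filename`` or raise ``ValueError``."""
    basename = os.path.basename(filename)
    # 1) Try the file extension first.
    ext = os.path.splitext(basename)[1]
    if ext in EXTENSIONS:
        return EXTENSIONS[ext]
    # 2) Fall back to a prefix of the file name.
    for prefix, label in PREFIXES.items():
        if basename.startswith(prefix):
            return label
    # 3) FCIDUMP files are recognized when the name contains ``FCIDUMP``.
    if 'FCIDUMP' in basename:
        return 'fcidump'
    raise ValueError('Unknown file format: %s' % filename)
```

With these assumptions, ``guess_format('run1/POSCAR')`` yields ``'poscar'`` and ``guess_format('water.FCIDUMP')`` yields ``'fcidump'``. The order in which the three rules are tried is a design choice of the sketch.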
.. _ref_file_formats_geo:

Molecular geometry file formats
===============================

The ``.xyz`` format
-------------------

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File extension ``.xyz``
Interoperation           Nearly all molecular simulation codes and Open Babel
Always **loading**       ``title`` ``numbers`` ``coordinates``
Derived when **loading** ``natom`` ``pseudo_numbers``
Required for **dumping** ``numbers`` ``coordinates``
Optional for **dumping** ``title``
======================== =======================================================

The ``POSCAR`` format
---------------------

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File prefix ``POSCAR``
Interoperation           VASP 5.X, VESTA
Always **loading**       ``title`` ``numbers`` ``coordinates`` ``cell``
Derived when **loading** ``natom`` ``pseudo_numbers``
Required for **dumping** ``numbers`` ``coordinates`` ``cell``
Optional for **dumping** ``title``
======================== =======================================================

The ``.cif`` (Crystallographic Information File) format
-------------------------------------------------------

======================== =======================================================
Load                     Works only for simple files
Dump                     Yes, except for symmetry information
Recognized by            File extension ``.cif``
Interoperation           CCDC, COD, ...
Always **loading**       ``title`` ``numbers`` ``coordinates`` ``cell`` ``symmetry`` ``links``
Derived when **loading** ``natom`` ``pseudo_numbers``
Required for **dumping** ``numbers`` ``coordinates`` ``cell``
Optional for **dumping** ``title``
======================== =======================================================

..
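To make the ``.xyz`` attribute lists above more concrete, the following sketch shows how ``title``, ``numbers`` and ``coordinates`` map onto the usual XYZ text layout (atom count, title line, then one line per atom). It is plain Python and does not use HORTON. The helper names ``parse_xyz`` and ``format_xyz`` and the tiny element table are assumptions of this example, and coordinates are kept exactly as written in the file, without any unit conversion.

```python
# Small, deliberately incomplete element table (an assumption of this sketch).
SYMBOL_TO_NUMBER = {'H': 1, 'C': 6, 'N': 7, 'O': 8, 'Ne': 10}
NUMBER_TO_SYMBOL = {number: symbol for symbol, number in SYMBOL_TO_NUMBER.items()}


def parse_xyz(text):
    """Parse XYZ-formatted text into a dict with IOData-like keys."""
    lines = text.splitlines()
    natom = int(lines[0])        # first line: number of atoms
    title = lines[1].strip()     # second line: free-form title
    numbers = []
    coordinates = []
    for line in lines[2:2 + natom]:
        words = line.split()
        numbers.append(SYMBOL_TO_NUMBER[words[0]])
        coordinates.append([float(word) for word in words[1:4]])
    return {'title': title, 'numbers': numbers,
            'coordinates': coordinates, 'natom': natom}


def format_xyz(numbers, coordinates, title=''):
    """Return XYZ-formatted text; ``title`` is optional, as in the table."""
    lines = [str(len(numbers)), title]
    for number, (x, y, z) in zip(numbers, coordinates):
        symbol = NUMBER_TO_SYMBOL[number]
        lines.append('%-2s %15.10f %15.10f %15.10f' % (symbol, x, y, z))
    return '\n'.join(lines) + '\n'
```

Calling ``parse_xyz(format_xyz([10], [[0.0, 0.0, 0.0]], title='Example'))`` returns the same title, numbers and coordinates, plus the derived ``natom``.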
.. _ref_file_formats_cube:

Cube file formats
=================

The Gaussian ``.cube`` format
-----------------------------

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File extension ``.cube``
Interoperation           Gaussian, CP2K, GPAW, Q-Chem, ...
Always **loading**       ``title`` ``numbers`` ``pseudo_numbers`` ``coordinates`` ``cell`` ``grid`` ``cube_data``
Derived when **loading** ``natom``
Required for **dumping** ``numbers`` ``coordinates`` ``cell`` ``grid`` ``cube_data``
Optional for **dumping** ``title`` ``pseudo_numbers``
======================== =======================================================

.. note::

    The second column in the geometry specification of the cube file is used
    for the pseudo-numbers.

The VASP ``CHGCAR`` and ``LOCPOT`` formats
------------------------------------------

======================== =======================================================
Load                     Yes
Dump                     No
Recognized by            File prefix ``CHGCAR`` or ``LOCPOT``
Interoperation           VASP 5.X, VESTA
Always **loading**       ``title`` ``coordinates`` ``numbers`` ``cell`` ``grid`` ``cube_data``
Derived when **loading** ``natom`` ``pseudo_numbers``
======================== =======================================================

.. note::

    Even though the ``CHGCAR`` and ``LOCPOT`` files look very similar, they
    require different conversions to atomic units.

.. _ref_file_formats_wfn:

Wavefunction formats (using a Gaussian basis set)
=================================================

All wavefunction formats share the following behavior:

* In case of a restricted wavefunction, only the alpha orbitals are loaded.

* In case of an unrestricted wavefunction, both the alpha and beta orbitals
  are loaded.

* Some formats also `load` a ``permutation`` and/or a ``signs`` attribute.
  These are generated when loading the file, such that appropriate
  permutations and sign changes can be applied to convert to the proper HORTON
  conventions for Gaussian basis functions. These conventions are `fixed` in
  the ``from_file`` method. This also allows you to fix the order of elements
  in arrays loaded from another file. For example, you can load an ``.fchk``
  and a ``.log`` file at the same time:

  .. code-block:: python

      mol = IOData.from_file('foo.fchk', 'foo.log')

  In this case, ``permutation`` is deduced from the file ``foo.fchk`` but is
  also applied to reorder the matrix elements loaded from ``foo.log``, for the
  sake of consistency.

The Gaussian ``.fchk`` format
-----------------------------

======================== =======================================================
Load                     Yes
Dump                     No
Recognized by            File extension ``.fchk``
Interoperation           Gaussian
Always **loading**       ``title`` ``coordinates`` ``numbers`` ``obasis`` ``exp_alpha`` ``permutation`` |br|
                         ``energy`` ``pseudo_numbers`` ``mulliken_charges``
**loading** if present   ``npa_charges`` ``esp_charges`` ``exp_beta`` ``dm_full_mp2`` ``dm_spin_mp2`` |br|
                         ``dm_full_mp3`` ``dm_spin_mp3`` ``dm_full_cc`` ``dm_spin_cc`` ``dm_full_ci`` |br|
                         ``dm_spin_ci`` ``dm_full_scf`` ``dm_spin_scf``
Derived when **loading** ``natom``
======================== =======================================================

The ``.molden`` format
----------------------

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File extension ``.molden``
Interoperation           Molpro, Orca, PSI4, Molden
Always **loading**       ``coordinates`` ``numbers`` ``obasis`` ``exp_alpha`` ``signs``
**loading** if present   ``title`` ``exp_beta``
Derived when **loading** ``natom``
Required for **dumping** ``coordinates`` ``numbers`` ``obasis`` ``exp_alpha``
Optional for **dumping** ``title`` ``exp_beta``
======================== =======================================================

The ``.mkl`` (Molekel) format
-----------------------------

======================== =======================================================
Load                     Yes
Dump                     No
Recognized by            File extension ``.mkl``
Interoperation           Molekel, Orca
Always **loading**       ``coordinates`` ``numbers`` ``obasis`` ``exp_alpha``
**loading** if present   ``exp_beta`` ``signs``
Derived when **loading** ``natom``
======================== =======================================================

The ``.wfn`` format
-------------------

======================== =======================================================
Load                     Yes
Dump                     No
Recognized by            File extension ``.wfn``
Interoperation           GAMESS, Gaussian
Always **loading**       ``title`` ``coordinates`` ``numbers`` ``obasis`` ``exp_alpha``
**loading** if present   ``exp_beta``
Derived when **loading** ``natom``
======================== =======================================================

.. note::

    Only use this format if the program that generated it does not offer any
    alternatives that HORTON can load. The WFN format has the disadvantage
    that it cannot represent contractions and therefore expands all orbitals
    into a decontracted basis. This makes the post-processing less efficient
    compared to formats that do support contractions of Gaussian functions.

..
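The role of the ``permutation`` and ``signs`` attributes described at the start of this section can be illustrated with a short sketch that reorders and re-signs a square matrix of integrals. The function name ``apply_conventions`` and the convention used here (new basis function ``i`` equals ``signs[i]`` times old basis function ``permutation[i]``) are assumptions made for this example. HORTON's own implementation and convention may differ.

```python
def apply_conventions(matrix, permutation, signs):
    """Reorder and re-sign a square matrix given as a list of lists.

    Assumed (hypothetical) convention: new basis function ``i`` is
    ``signs[i]`` times old basis function ``permutation[i]``, so both the
    rows and the columns of the matrix are transformed.
    """
    n = len(matrix)
    return [
        [signs[i] * signs[j] * matrix[permutation[i]][permutation[j]]
         for j in range(n)]
        for i in range(n)
    ]


# Example: swap the last two basis functions and flip the sign of the
# function that ends up in the last position.
overlap = [[1.0, 0.2, 0.3],
           [0.2, 1.0, 0.4],
           [0.3, 0.4, 1.0]]
converted = apply_conventions(overlap, permutation=[0, 2, 1], signs=[1, 1, -1])
```

Because the same ``permutation`` and ``signs`` act on every array that is indexed by basis functions, matrices read from a second file end up consistent with the orbitals read from the first one.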
.. _ref_file_formats_ham:

Hamiltonian file formats
========================

The Molpro 2012 ``FCIDUMP`` format
----------------------------------

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File name contains ``FCIDUMP``
Interoperation           Molpro, PSI4
Always **loading**       ``lf`` ``nelec`` ``ms2`` ``one_mo`` ``two_mo`` ``core_energy``
Required for **dumping** ``one_mo`` ``two_mo``
Optional for **dumping** ``core_energy`` ``nelec`` ``ms2``
======================== =======================================================

The Gaussian ``.log`` file
--------------------------

======================== =======================================================
Load                     Yes
Dump                     No
Recognized by            File extension ``.log``
Interoperation           Gaussian
**loading** if present   ``olp`` ``kin`` ``na`` ``er``
======================== =======================================================

To let Gaussian print all the matrix elements (Gaussian integrals), the
following keywords must be used in the Gaussian input file:

.. code-block:: text

    scf(conventional) iop(3/33=5) extralinks=l316 iop(3/27=999)

Keep in mind that this feature in Gaussian only works for a small number of
basis functions. The ``FCIDUMP`` files generated with Molpro or PSI4 are more
reliable and also have the advantage that all integrals are stored in double
precision.

.. _ref_file_formats_internal:

HORTON's internal file format
=============================

The internal HDF5-based format of HORTON is effectively a superset of all
formats listed above. Moreover, the user is free to store any additional data
not covered by the file formats above. Many (not all) Python data types can be
dumped into the internal format:

* ``int``
* ``float``
* ``str``
* Any NumPy array
* Classes in the HORTON library that have a ``to_hdf5`` and ``from_hdf5``
  method. For example: ``AtomicGridSpec``, ``BeckeMolGrid``, ``Cell``,
  ``CubicSpline``, ``ESPCost``, ``GBasis``, ``GOBasis``, ``Symmetry``,
  ``UniformGrid`` and all classes in the package ``horton.matrix``
* A dictionary with strings as keys and any mixture of the above data types as
  values.

======================== =======================================================
Load                     Yes
Dump                     Yes
Recognized by            File extension ``.h5``
Interoperation           Custom scripts. Archiving of data generated with any other code.
**loading** when present Any attribute
Optional for **dumping** Any attribute with the right type
======================== =======================================================
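The list of data types accepted by the internal format can be read as a simple set of rules. The sketch below encodes those rules in a hypothetical helper, ``is_dumpable``, which is not part of HORTON. It uses no imports, so NumPy arrays are detected by their type name, and it treats dictionary values as non-nested, which is one possible reading of the list above.

```python
def _is_simple(value):
    """Check the non-dictionary categories from the list above."""
    if isinstance(value, (int, float, str)):
        return True
    # Detect NumPy arrays by type name so the sketch needs no imports.
    cls = type(value)
    if cls.__module__ == 'numpy' and cls.__name__ == 'ndarray':
        return True
    # Objects that provide both a ``to_hdf5`` and a ``from_hdf5`` method.
    return hasattr(value, 'to_hdf5') and hasattr(cls, 'from_hdf5')


def is_dumpable(value):
    """Return True if ``value`` fits one of the documented categories."""
    if isinstance(value, dict):
        # Keys must be strings; values must be one of the simple categories.
        return all(isinstance(key, str) and _is_simple(item)
                   for key, item in value.items())
    return _is_simple(value)
```

For instance, ``is_dumpable({'energy': -1.5, 'label': 'test'})`` is true, while a plain Python list or a dictionary with integer keys is rejected by this sketch.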