Monday, June 15, 2015

A brief introduction to basis sets

In order to compute the energy we need to define mathematical functions for the orbitals.  In the case of atoms we can simply use the solutions to the Schrödinger equation for the  $\ce{H}$ atom as a starting point and find the best exponent for each function using the variational principle.

But what functions should we use for molecular orbitals (MOs)?  The wave function of the $\ce{H2+}$ molecule provides a clue (Figure 1)

Figure 1. Schematic representation of the wave function of the $\ce{H2+}$  molecule.

It looks a bit like the sum of two 1$s$ functions centered at each nucleus (A and B),
$${\Psi ^{{\text{H}}_{\text{2}}^ + }} \approx \tfrac{1}{{\sqrt 2 }}\left( {\Psi _{1s}^{{{\text{H}}_{\text{A}}}} + \Psi _{1s}^{{{\text{H}}_{\text{B}}}}} \right)$$
Thus, one way of constructing MOs is as a linear combination of atomic orbitals (the LCAO approximation),
$${\phi _i}(1) = \sum\limits_{\mu  = 1}^K {{C_{\mu i}}{\chi _\mu }(1)} $$
an approximation that becomes better and better as $K$ increases.  Here $\chi _\mu$  is a mathematical function that looks like an AO, and is called a basis function (a collection of basis functions for various atoms is called a basis set), and $C_{\mu i}$ is a number (sometimes called an MO coefficient) that indicates how much basis function $\chi _\mu$ contributes to MO $i$, and is determined for each system via the variational principle.   Note that every MO is expressed in terms of all basis function, and therefore extends over the entire molecule.

If we want to calculate the RHF energy of water, the basis set for the two $\ce{H}$ atoms would simply be the lowest energy solution to the Schrödinger equation for $\ce{H}$ atom
$${\chi _{{{\text{H}}_{\text{A}}}}}(1) = \Psi _{1s}^{\text{H}}({r_{1A}}) = \frac{1}{{\sqrt \pi  }}{e^{ - \left| {{{\bf{r}}_1} - {{\bf{R}}_A}} \right|}}$$
For the O atom, the basis set is the AOs obtained from, say, an ROHF calculation on $\ce{O}$, i.e. $1s$, $2s$, $2p_x$, $2p_y$, and $2p_z$ functions from the solutions to the Schrödinger equation for the $\ce{H}$ atom, where the exponents ($\alpha_i$'s) have been variationally optimized for the $\ce{O}$ atom,
$$\Psi _{1s}^{\text{H}},\;\Psi _{2s}^{\text{H}},\;\Psi _{2p}^{\text{H}} \xrightarrow{\frac{\partial E}{\partial \alpha_i}=0} \phi _{1s}^{\text{O}},\;\phi _{2s}^{\text{O}},\;\phi _{2p}^{\text{O}} \equiv \;\left\{ {{\chi _O}} \right\}$$
Notice that this only has to be done once, i.e. we will use this oxygen basis set for all oxygen-containing molecules.  We then provide a guess at the water structure and the basis functions are placed at the coordinates of the respective atoms.  Then we find the best MO coefficients by variational minimization,
$$\frac{{\partial E}}{{\partial {C_{\mu i}}}} = 0 $$
for all $i$ and $\mu$. Thus, for water we need a total of seven basis functions to describe the five doubly occupied water MOs ($K = 7$ and $i = 1-5$ in Eq 5).  This is an example of a minimal basis set, since it is the minimum number of basis functions per atoms that makes chemical sense.

One problem with the LCAO approximation is the number of 2-electron integrals it leads to, and the associated computational cost.  Let’s look at the part of the energy that comes from the Coulomb integrals
  \sum\limits_{i = 1}^{N/2} {\sum\limits_{j = 1}^{N/2} {2{J_{ij}}} }  &= \sum\limits_{i = 1}^{N/2} {\sum\limits_{j = 1}^{N/2} {2\left\langle {\left. {{\phi _i}(1){\phi _j}(2)} \right|\frac{1}{{{r_{12}}}}\left| {{\phi _i}(1){\phi _j}(2)} \right.} \right\rangle } }  \\
   &= \sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2\left\langle {\left. {{\phi _i}(1){\phi _i}(1)} \right|\frac{1}{{{r_{12}}}}\left| {{\phi _j}(2){\phi _j}(2)} \right.} \right\rangle } }  \\
   &= \sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2\left\langle {{{\phi _i}{\phi _i}}}
 \mathrel{\left | {\vphantom {{{\phi _i}{\phi _i}} {{\phi _j}{\phi _j}}}}
 \right. }
 {{{\phi _j}{\phi _j}}} \right\rangle } }  \\
   &= \sum\limits_\mu ^K {\sum\limits_\nu ^K {\sum\limits_\lambda ^K {\sum\limits_\sigma ^K {\sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2{C_{\mu i}}{C_{\nu i}}{C_{\lambda j}}{C_{\lambda j}}\left\langle {{{\chi _\mu }{\chi _\nu }}}
 \mathrel{\left | {\vphantom {{{\chi _\mu }{\chi _\nu }} {{\chi _\lambda }{\chi _\sigma }}}}
 \right. }
 {{{\chi _\lambda }{\chi _\sigma }}} \right\rangle } } } } } }  \\
   &= \sum\limits_\mu ^K {\sum\limits_\nu ^K {\sum\limits_\lambda ^K {\sum\limits_\sigma ^K {\tfrac{1}{2}{P_{\mu \nu }}{P_{\lambda \sigma }}\left\langle {{{\chi _\mu }{\chi _\nu }}}
 \mathrel{\left | {\vphantom {{{\chi _\mu }{\chi _\nu }} {{\chi _\lambda }{\chi _\sigma }}}}
 \right. }
 {{{\chi _\lambda }{\chi _\sigma }}} \right\rangle } } } }
We have roughly $(N/2)^2$ Coulomb integrals involving molecular orbitals but roughly $1/8K^4$ Coulomb integrals involving basis functions (the factor of 1/8 comes from the fact that some the integrals are identical and need only be computed once).  Using a minimal basis set $K$ = 80 for a small organic molecule like caffeine $\ce{(C8H10N4O2)}$ and results in ca 5,000,000 2-electron integrals involving basis functions! (That being said, you can perform an RHF energy calculation with 80 basis functions on your desktop computer in a few minutes.  The problem with $K^4$-scaling is that a corresponding calculation with 800 basis functions would take a few days on the same machine.  So you can forget about optimizing the structure on that machine.)  This is one of the reasons why modern computational quantum chemistry requires massive computers.  This is also the reason why the basis set size it a key consideration in a quantum chemistry project.

The 2-electron integrals also pose another problem: the basis functions defined so far are exponential functions (also known as Slater type orbitals or STOs).  2-electron integrals involving STOs placed on four different atoms do not have analytic solutions.  As a result, most quantum chemistry programs use Gaussian type orbitals (or simply Gaussians) instead of STOs, because the 2-electron integrals involving Gaussians have analytic solutions.  Obviously,
$${e^{ - \alpha {r_{1A}}}} \approx {e^{ - \beta r_{1A}^2}}$$
is a poor approximation, so a linear combination of Gaussians are used to model each STO basis function (Figure 2)
$${e^{ - \alpha {r_{1A}}}} \approx \sum\limits_i^X {{a_{i\mu }}{e^{ - {\beta _i}r_{1A}^2}}}  = {\chi _\mu }$$
Figure 2. (a) An exponential function is not well represented by one Gaussian, but (b) can be well represented by a linear combination of three Gaussians.

Here the $a_{i\mu}$parameters (or contraction coefficients) as well as the Gaussian exponents are determined just once for a given STO basis function. $\chi_\mu$  is a contracted basis function and the $X$ individual Gaussian functions are called primitives.  Generally, three primitives are sufficient to represent an STO, and this basis set is known at the STO-3G basis set.  $p$- and $d$-type STOs are expanded in terms of $p$- and $d$-type primitive Gaussians [e.g. $({x_1} - {x_A}){e^{ - \beta r_{1A}^2}}$ and $({x_1} - {x_A})({y_1} - {y_A}){e^{ - \beta r_{1A}^2}}$].  An RHF calculation using the STO-3G basis set is denoted RHF/STO-3G.  Unless otherwise noted, this usually also implies that the geometry is computed (i.e. the minimum energy structure is found) at this level of theory.

Minimal basis sets are usually not sufficiently accurate to model reaction energies.  This is due to the fact that the atomic basis functions cannot change size to adjust to their bonding environment. However, this can be made possible by using some the contraction coefficients as variational parameters.  This will increase the basis set size (and hence the computational cost) so this must be done judiciously.  For example, we’ll get most improvement by worrying about the basis functions that describe the valence electrons that participate most in bonding.  Thus, for $\ce{O}$ atom we leave the 1$s$ core basis function alone, but “split” the valence 2$s$ basis function into linear combinations of two and one Gaussians respectively,
   {\chi _{1s}} &= \sum\limits_i^3 {{a_{i1s}}{e^{ - {\beta _i}r_{1A}^2}}}   \\
   {\chi _{2{s_a}}} &= \sum\limits_i^2 {{a_{i2s}}{e^{ - {\beta _i}r_{1A}^2}}}   \\
  {\chi _{2{s_b}}} &= {e^{ - {\beta _{2{s_b}}}r_{1A}^2}} \\
\end{split} $$
and similarly for the 2$p$ basis functions. This is known as the 3-21G basis set (pronounced “three-two-one g” not “three-twenty one g”), which denotes that core basis functions are described by 3 contracted Gaussians, while the valence basis functions are split into two basis functions, described by 2 and 1 Gaussian each.  Thus, using the 3-21G basis set to describe water requires 13 basis functions: two basis functions on each $\ce{H}$ atom (1$s$ is the valence basis function of the H atom) and 9 basis functions on the $\ce{O}$ atom (one 1$s$ function and two each of 2$s$, 2$p_x$, 2$p_y$, and 2$p_z$).

The $\chi _{2{s_a}}$  basis function is smaller (i.e., the Gaussians have a larger exponent) than the   basis function.  Thus, one can make a function of any intermediate size by (variationally) mixing these two functions (Figure 3).  The 3-21G is an example of a split valence or double zeta basis set (zeta, ζ, is often used as the symbol for the exponent, but I find it hard to write and don’t use it in my lectures).   Similarly, one can make other double zeta basis sets such as 6-31G, or triple zeta basis sets such as 6-311G.

Figure 3. Sketch of two different sized $s$-type basis functions that can be used to make a basis function of intermediate size

As the number of basis functions ($K$ in Eq 2) increase the error associated with the LCAO approximation should decrease and the energy should converge to what is called the Hartree-Fock limit ($E_{\text{HF}}$) that is higher than the exact energy ($E_{\text{exact}}$) (Figure 4).  The difference is known as the correlation energy,
and is the error introduced by the orbital approximation
$$\Psi ({{\bf{r}}_1},{{\bf{r}}_2}, \ldots {{\bf{r}}_N}) \approx \left| {{\phi _1}(1){{\bar \phi }_1}(2) \ldots {\phi _{N/2}}(N - 1){{\bar \phi }_{N/2}}(N)} \right\rangle $$

Figure 4. Plot of the energy as a function the number of basis functions.

However, in the case of a one-electron molecule like $\ce{H2+}$ we would expect the energy to converge to $E_{\text{exact}}$ since there is no orbital approximation.  However, if we try this with the basis sets discussed thus far we find that this is not the case (Figure 5)!

Figure 5. Plot of the energy of $\ce{H2+}$ computed using increasingly larger basis sets.

What’s going on? Again we get a clue by comparing the exact wave function to the LCAO-wave function (Figure 6).

Figure 6. Comparison of the exact wave function and one computed using the 6-311G basis set.

We find that compared to the exact result there is not “enough wave function” between the nuclei and too much at either end.  As we increase the basis set we only add $s$-type basis functions (of varying size) to the basis set.  Since they are spherical they cannot be used to shift electron from one side of the $\ce{H}$ atom to the other.  However, $p$-functions are perfect for this (Figure 7).

Figure 7. Sketch of the polarization of an s basis function by a p basis function

So basis set convergence is not a matter of simply increasing the number of basis functions, it is also important to have the right mix of basis function types.  Similarly, $d$-functions can be used to “bend” $p$-functions (Figure 8).

Figure 8. Sketch of the polarization of a p basis function by a d basis function

Such functions are known as polarization functions, and are denoted with the following notation. For example, 6-31G(d) denotes d polarization functions on all non-$\ce{H}$ atoms and can also be written as 6-31G*.  6-31G(d,p) is a 6-31G(d) basis set where p-functions have been added on all $\ce{H}$ atoms, and can also be written 6-31G**.  A RHF/6-31G(d,p) calculation on water involves 24 basis functions: 13 basis functions for the 6-31G part (just like for 3-21G) plus 3 $p$-type polarization functions on each H atom and 5 $d$-type polarization functions (some programs use 6 Cartesian d-functions instead of the usual 5).

Anions tend to have very diffuse electron distributions and very large basis functions (with very small exponents) are often needed for accurate results.  These diffuse functions are denoted with “+” signs: e.g. 6-31+G denotes one s-type and three $p$-type diffuse Gaussians on each non-$\ce{H}$ atom, and 6-31++G denotes the addition of a single diffuse $s$-type Gaussian on each $\ce{H}$-atom. Diffuse functions also tend to improve the accuracy of calculations on van der Waals complexes and other structures where the accurate representation of the outer part of the electron distribution is important.

Of course there are many other basis sets available, but in general they have the same kinds of attributes as described already.  For example, aug-cc-pVTZ is a more modern basis set: “aug” stands for “augmented” meaning “augmented with diffuse functions”, “pVTZ” means “polarized valence triple zeta”, i.e. it is of roughly the same quality as 6-311++G(d,p).  “cc” stands for “correlation consistent” meaning the parameters were optimized for correlated wave functions (like MP2, see below) rather than HF wave function like Pople basis sets [such as 6-31G(d)] described thus far.

Related blog posts
Complete basis set extrapolation

Monday, June 1, 2015

Computational Chemistry Highlights: May issue

The May issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

Table of content for this issue features contributions from CCH editors Steven Bachrach and Jan Jensen:

Exploring the Accuracy Limits of Local Pair Natural Orbital Coupled-Cluster Theory

Uthrene, a radically new molecule?

Interested in more? There are many ways to subscribe to CCH updates.

Also, for your daily computational chemistry fix subscribe to Computational Chemistry Daily

This work is licensed under a Creative Commons Attribution 4.0

Sunday, May 24, 2015

Energy Minimization: Finding and Connecting Stationary Points

Let’s start by considering a simple example with one coordinate (for example a distance), $R$, and where the energy is given by
$$ E= \frac{1}{2}(R-R_e)^2$$
The corresponding potential energy surface, known as a quadratic PES, is shown in Figure 1.

Figure 1. Plot of the (quadratic) potential energy surface given by Equation 1.

Here $R_e$ is the value of $R$ at which the energy is lowest (this is known as the equilibrium geometry) and this is what we’d like to find.  We start by taking a guess at $R$, $R_g$.  We already know how to check whether this is an energy minimum: we need to evaluate the gradient, which is
$$\left(\frac{\partial E}{\partial R}\right)_{R_g}=k(R_g-R_e)$$
It’s clear that the gradient is non-zero for $R_e \ne R_g$.  However, by rearranging the equation the gradient also tells us how to change $R$ to get an energy minimum
$$R_e=R_g-\frac{1}{k}\left(\frac{\partial E}{\partial R}\right)_{R_g}=R_g+\frac{1}{k}F_g$$
where $F$ is the force, which is the negative gradient.  If we know $k$ we can find $R_e$ in one step starting from $R_g$ (Figure 2).

Figure 2. The equilibrium geometry can be found in one step on a quadratic PES

If we don’t know $k$ then it is safest to take many small steps, i.e. scale the gradient by some small constant ($c$) and repeat until the gradient falls below some threshold (Figure 3)
$$R_{n+1}=R_n-c\left(\frac{\partial E}{\partial R}\right)_{R_n}$$

Figure 3. Energy minimization by steepest descent.

This general approach is known as steepest descent, since the gradient tells you in which direction where the energy is decreasing (descending) the most. Notice that this means that minimizing the energy will find the closest minimum to the starting geometry.  When steepest descent is used to find equilibrium geometries it is often combined with a so-called “line search”, which tries to find the lowest energy in the direction of the current gradient, but the general idea is still the same.

Another use of steepest descent is to connect a transitions state with its two closest minima, since this tells you what two minima the transition state connects.  Here the guess structure is a transition state structure displaced along the normal mode corresponding to the imaginary frequency (the transition state structure itself cannot be used because its gradient is zero).  This path is the minimum energy path (MEP) between reactions and products and the resulting collection of structures is known as the intrinsic reaction coordinate (IRC).  An IRC is usually depicted as a plot of the potential energy vs the mass-weighted root-mean-square-displacement of the Cartesian coordinates relative to some reference geometry (usually the transition state), usually denoted $s$ (Figure 4).  When we draw a potential energy surface for a reaction we usually mean an IRC.

Figure 4. Depiction of an IRC or MEP: two steepest descent paths starting from the transition state

Clearly, the more steps we take, the higher the computational cost, and in general we want to take as few steps as possible.  As I mentioned above, if we knew $k$ we could find the energy minimum in one step.  For a quadratic surface $k$ is the second derivative of $E$ with respect to $R$ (or the first derivative of the gradient) so Eq 3 becomes
$$ R_e=R_g-\left(\frac{\partial^2 E}{\partial R^2}\right)^{-1}_{R_g}\left(\frac{\partial E}{\partial R}\right)_{R_g}$$
Such a step is known as a Newton-Raphson step or quadratic step.  This works fine if the surface is quadratic, which is a good approximation near the minimum but not far away.  For really bad guesses, where the PES tends to be flat, the quadratic step can be too large (Figure 5).

Figure 5. Quadratic steps (a) close to and (b) far from the minimum.

Most algorithms will scale back quadratic steps that are considered unreasonably large, and even using quadratic energy minimizers many steps are needed:
$$R_{n+1}=R_n-\left(\frac{\partial^2 E}{\partial R^2}\right)^{-1}_{R_n}\left(\frac{\partial E}{\partial R}\right)_{R_n}$$
$$ \mathbf{q}_{n+1}=\mathbf{q}_{n}-c\mathbf{H}^{-1}_n\mathbf{g}_n$$
Eq. 7 is the same equation as 6 but in many dimensions: qn are the current coordinates (for example Cartesian coordinates) for step number $n$ ($n = 0$ for the guess coordinates), $\mathbf{q}_{n+1}$ are the new coordinates, $\mathbf{g}_n$ is the gradient, and $\mathbf{H}_n$ is the Hessian both evaluated at the current coordinates. Note that the Hessian should be computed at each step.

Friday, May 1, 2015

Computational Chemistry Highlights: April issue

The April issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

Table of content for this issue features contributions from CCH editors Mario Barbatti, Steven Bachrach, and Jan Jensen:

Big Data Meets Quantum Chemistry Approximations: The ∆-Machine Learning Approach

Wednesday, April 1, 2015

Sunday, March 1, 2015

Computational Chemistry Highlights: February issue

The February issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

Table of content for this issue features contributions from CCH editors Dean Tantillo, Marcel Swart, Martin Korth, Johannes Hachmann, Steven Bachrach, and Jan Jensen:

Water–Water and Water–Solute Interactions in Microsolvated Organic Complexes

Sunday, February 1, 2015

Computational Chemistry Highlights: January issue

The January issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

Table of content for this issue features contributions from CCH editors Alán Aspuru-Guzik, Andres Cisneros, Steven Bachrach, David Bowler, and Jan Jensen:

Construction of a Highly Distorted Benzene Ring in a Double Helicene

A computational drug design success story

Orbital-free density functional theory implementation with the projector augmented-wave method

Enhanced Helical Folding of ortho-Phenylenes through the Control of Aromatic Stacking Interactions