Skip to content
Snippets Groups Projects
handson.ipynb 329 KiB
Newer Older
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "![Logo](figures/GOE_Logo_Quer_IPC_Farbe_RGB.png)\n",
    "# High-Dimensional Neural Network Potentials\n",
    "**[Alexander L. M. Knoll](mailto:aknoll@chemie.uni-goettingen.de), and [Moritz R. Schäfer](mailto:moritzrichard.schaefer@uni-goettingen.de)** \n",
    "\n",
    "Behler Group, Theoretische Chemie, Institut für Physikalische Chemie\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "For this tutorial it is intended to use the RuNNer release version 1.2, available in conda-forge. \n",
    "The most recent version of RuNNer is [hosted on Gitlab](https://gitlab.com/TheochemGoettingen/RuNNer). For access please contact Prof. Dr. Jörg Behler (joerg.behler@uni-goettingen.de)."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/kalm/.pyenv/versions/pyiron/lib/python3.10/site-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated\n",
      "  \"class\": algorithms.Blowfish,\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import ipympl\n",
    "import ipywidgets as widgets\n",
    "\n",
    "from pyiron_atomistics import Project\n",
    "from pyiron_contrib.atomistics.runner.job import RunnerFit\n",
    "from pyiron_contrib.atomistics.runner.utils import container_to_ase\n",
    "\n",
    "from ase.geometry import get_distances\n",
    "\n",
    "from runnerase import generate_symmetryfunctions\n",
    "\n",
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Background\n",
    "### Architecture of an HDNNP"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "**RuNNer** is a stand-alone Fortran program for the construction of high-dimensional neural network potentials (HDNNPs), written mainly by Jörg Behler. The central assumption made in constructing a HDNNP is that the total energy of the system $E_{\\mathrm{tot}}$ [can be separated into atomic contributions $E_i$](https://www.doi.org/10.1103/PhysRevLett.98.146401). HDNNP relates the local environment of the atoms to their atomic energies $E_i$, which contribute to the sum of all $N$ atomic energies, resulting in the total energy of the system $E_\\mathrm{tot}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "\\begin{align}\n",
    "E_\\mathrm{tot} = \\sum_{i}^{N}E_i\\notag\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "Every atomic energy is described by an atomic neural network (NN), which is element-specific. The entirety of all atomic NNs composes a HDNNP, whose general architecture is shown below for a binary system."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "<img src=\"figures/2g.png\" class=\"center\" width=\"500\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "As you can see, the atomic contributions in this model are predicted independently from each other. Therefore, the model can easily describe systems with differering numbers of atoms: adding or removing an atom corresponds to adding or removing a row in the figure shown above. This ability is what puts the \"high-dimensional\" into the name \"HDNNP\". \n",
    "\n",
    "Each atomic neural networks receives input information about the local atomic environment up to a certain cutoff radius $R_{\\mathrm{c}}$. This information is encoded based on the Cartesian coordinates in many-body descriptors, so-called [atom-centered symmetry functions (ACSF or just SF)](https://www.doi.org/10.1063/1.3553717). More details about this are shown below. For each atom, the values of multiple SFs compose a SF vector $G$ which is the input layer of the atomic NNs.\n",
    "\n",
    "Atomic NNs look like this:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "<img src=\"figures/ann.png\" width=\"500\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "Every value in the SF vector $G$ serves as one piece of input information to the atomic NN. We refer to the circles in the figure as _nodes_ (from graph theory) or _neurons_ (from neural science). The information from the input nodes flows through the atomic NN from left to right: the input layer is followed by a configurable number of hidden layers which consist, in turn, of an arbitrary number of _hidden nodes_. At the end, all information is collected in the output layer, which in our case is interpreted as the atomic energy contribution of the atom under consideration. The input nodes and the hidden nodes in the first layer are connected by weights. Moreover, the hidden and output nodes carry a bias value.\n",
    "\n",
    "During training, the weights and biases are optimized using backpropagation to represent best the data in the training data set."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "SFs provide the input for the NN and describe the local atomic environment of each atom. In principle, one could also use Cartesian coordinates to capture the atomic positions in a structure. As Cartesian coordinates describe the absolute positions of atoms the numerical input to the atomic NNs would change with translation or rotation of the system. However, these actions do not influence the energy of the system and different numerical inputs belonging to the same NN output lead to large training errors.\n",
    "\n",
    "In contrast, SFs describe the relative positions of the atoms to each other and are hence translationally and rotationally invariant. We differentiate two types of SFs: radial SF depend on the distance between atom pairs and serve as a measure for their bond order. Angular SFs additionally depend on the interatomic angles."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The cutoff function $f_{\\mathrm{c}}$ ensures that only the neighbors within one atomic environment counts towards the symmetry function values. The cutoff radius $R_\\mathrm{c}$ (usually $12\\,\\mathrm{bohr}$) defines how much of the local atomic environment is considered. All SFs and their derivatives will decrease to zero if the pairwise distance is larger than $R_\\mathrm{c}$. There are several cutoff funtions defined in **RuNNer** and we will use here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "    f_{c}(R_{ij}) = \n",
    "    \\begin{cases}\n",
    "    0.5 \\cdot [\\cos(\\pi x) + 1]& ~ \\text{for $R_{ij} \\leq R_\\mathrm{c}$},\\\\\n",
    "    0& ~ \\text{for $R_{ij} > R_\\mathrm{c}$}\n",
    "    \\end{cases}\n",
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "with the atomic distance $R_{ij}$, the cutoff radius $R_\\mathrm{c}$, and $x = \\frac{R_{ij}}{R_{\\mathrm{c}}}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Take a look at the figure below for a graphical representation of the cutoff radius in a periodic system: the red atom is the central atom for which the SF values will be calculated. All yellow atoms lie within in the cutoff radius and will therefore contribute to the SF values."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "<img src=\"figures/Rc.png\" width=\"500\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "To define the parameters for the radial SFs, it is important to know the shortest bond distance for each element combination in your data set. Usually, 5-6 radial SF are used for any element pair, with different $\\eta$ values to increase the resolution for structure description. It is possible to shift the maximum of the radial SF $G^2$ by $R_{s}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "    G_{i}^{2} = \\sum_{j}^{}e^{-\\eta (R_{ij} - R_{s})^2} \\cdot f_{c}(R_{ij}).\n",
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "In most applications, the Gaussian exponents $\\eta$ for the radial SFs are chosen such that the SF turning points are equally distributed between the cutoff radius and specific minimum pairwise distance in the training dataset (small eta $\\eta$ = max. contraction). In RuNNer, you can either define element pair specific SF or define global SF which are used for every element combination. It is also possible to define different cutoff radii for the SF, even though this is rarely helpful and therefore not recommended.\n",
    "\n",
    "Below, you can see a graphical representation of the radial symmetry functions including the cutoff function for a cutoff radius of 12 Bohr."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"figures/radials.png\" width=\"800\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The same rules apply to the angular SFs. Here, however, three atomic positions are included in the calculation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "    G_{i}^{3} = 2^{\\zeta - 1}\\sum_{j}^{} \\sum_{k}^{} \\left[( 1 + \\lambda \\cdot cos \\theta_{ijk})^{\\zeta} \\cdot e^{-\\eta (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)} \\cdot f_{\\mathrm{c}}(R_{ij}) \\cdot f_{\\mathrm{c}}(R_{ik}) \\cdot f_{\\mathrm{c}}(R_{jk}) \\right]\n",
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The angle $\\theta_{ijk} = \\frac{\\mathbf{R}_{ij} \\cdot \\mathbf{R}_{ik}}{R_{ij} \\cdot R_{ik}}$ is centered at atom $i$. For most system, we use permutations of $\\zeta = \\{1, 2, 4, 16\\}$, $\\eta = 0$, and $\\lambda$ = $\\{+1, -1\\}$. If many atoms of each element are present, angular SFs are usually not critical and a default set of SFs can be used."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"figures/angulars.png\" width=\"800\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "Due to time limitations, we will only focus on the model described above which is known as the second generation of HDNNPs (see [here](https://www.doi.org/10.1103/PhysRevLett.98.146401), and [here](https://www.doi.org/10.1002/anie.201703114), and [here](https://www.doi.org/10.1002/qua.24890)). However, in recent years third- and fourth-generation HDNNPs were developed by the Behler group."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"figures/2g.png\" width=\"500\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    },
    "tags": []
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "In second-generation HDNNPs only atomic interactions inside the cutoff sphere are taken into account. The resulting short-ranged potentials are well-suited to describe local bonding even for complex atomic environments. However, it can be expected that for many systems long-range interactions, primarily electrostatics, will be important.\n",
    "\n",
    "To overcome those limitations, third-generation NNs (see [here](https://www.doi.org/10.1103/PhysRevB.83.153101), and [here](https://www.doi.org/10.1063/1.3682557)) define a second set of atomic neural networks to construct environment-dependent atomic charges. They can then be used to calculate the long-range electrostatic energy without truncation. The total energy of the system is then given by the sum of the short-range and the electrostatic energies."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "<img src=\"figures/3g.png\" width=\"800\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "While the use of environment-dependent charges is a clear step forward, they are not sufficient if long-range charge transfer is present.\n",
    "\n",
    "When dealing with systems where long-range charge transer is present, the usage of fourth-generation NNs (see [here](https://www.doi.org/10.1038/s41467-020-20427-2), and [here](https://www.doi.org/10.1021/acs.accounts.0c00689)) is recommended. Here, environment-dependent electronegativities $\\chi$ are computed first. They will then be used in a charge equilibration scheme to determine the atomic charges $Q$. Again, we can compute the electrostatic energy from this. Moreover, the atomic charges serve as an additional input neuron to train the short-range energy and forces. As it was the case for the third-generation NNs the total energy is then given by the sum of the short-range and the electrostatic energies."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "<img src=\"figures/4g.png\" width=\"800\"/>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, training a HDNNP with **RuNNer** can be separated into three different stages - so-called modes - in which different types of calculation are performed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- **Mode 1:** calculation of the SF values and separation of the dataset into a training and testing set.\n",
    "- **Mode 2:** training of the model to construct the HDNNP.\n",
    "- **Mode 3:** prediction of energy and forces (stress and charges can also be predicted)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "All these steps are performed consecutively beginning with mode 1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The creation of a meaningful neural network potential lives and dies with high quality training data. Therefore, we will begin by inspecting the full training dataset. \n",
    "The dataset has been stored prior to the workshop in form of a `TrainingContainer`. \n",
    "In pyiron, `TrainingContainer`s are jobs which take a set of structures and properties like energies, forces, ... and store them in HDF format.\n",
    "\n",
    "Go ahead and open up the project:"
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "outputs": [],
   "source": [
    "pr = Project('../../introduction/training')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The project already contains several jobs:"
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>status</th>\n",
       "      <th>chemicalformula</th>\n",
       "      <th>job</th>\n",
       "      <th>subjob</th>\n",
       "      <th>projectpath</th>\n",
       "      <th>project</th>\n",
       "      <th>timestart</th>\n",
       "      <th>timestop</th>\n",
       "      <th>totalcputime</th>\n",
       "      <th>computer</th>\n",
       "      <th>hamilton</th>\n",
       "      <th>hamversion</th>\n",
       "      <th>parentid</th>\n",
       "      <th>masterid</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>212</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>full</td>\n",
       "      <td>/full</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/</td>\n",
       "      <td>2022-06-03 00:02:34.816533</td>\n",
       "      <td>NaT</td>\n",
       "      <td>NaN</td>\n",
       "      <td>zora@cmti001#1</td>\n",
       "      <td>TrainingContainer</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>213</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>basic</td>\n",
       "      <td>/basic</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/</td>\n",
       "      <td>2022-06-03 00:02:47.281055</td>\n",
       "      <td>NaT</td>\n",
       "      <td>NaN</td>\n",
       "      <td>zora@cmti001#1</td>\n",
       "      <td>TrainingContainer</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>214</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>Al_basic_atomicrex</td>\n",
       "      <td>/Al_basic_atomicrex</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/</td>\n",
       "      <td>2022-06-03 02:00:15.887059</td>\n",
       "      <td>NaT</td>\n",
       "      <td>NaN</td>\n",
       "      <td>zora@cmti001#1</td>\n",
       "      <td>TrainingContainer</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>226</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>data_lithium</td>\n",
       "      <td>/data_lithium</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/</td>\n",
       "      <td>2022-07-21 08:18:19.193090</td>\n",
       "      <td>NaT</td>\n",
       "      <td>NaN</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>TrainingContainer</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>227</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>fit_mode1</td>\n",
       "      <td>/fit_mode1</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/</td>\n",
       "      <td>2022-07-21 08:18:22.251691</td>\n",
       "      <td>2022-07-21 08:18:28.328624</td>\n",
       "      <td>6.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>RunnerFit</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>228</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>fit_mode2</td>\n",
       "      <td>/fit_mode2</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/</td>\n",
       "      <td>2022-07-21 08:18:30.158483</td>\n",
       "      <td>2022-07-21 08:22:02.862973</td>\n",
       "      <td>212.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>RunnerFit</td>\n",
       "      <td>0.4</td>\n",
       "      <td>227.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>229</td>\n",
       "      <td>finished</td>\n",
       "      <td>None</td>\n",
       "      <td>fit_mode3</td>\n",
       "      <td>/fit_mode3</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/</td>\n",
       "      <td>2022-07-21 08:22:04.136475</td>\n",
       "      <td>2022-07-21 08:22:11.189763</td>\n",
       "      <td>7.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>RunnerFit</td>\n",
       "      <td>0.4</td>\n",
       "      <td>228.0</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>230</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_0</td>\n",
       "      <td>/job_a_3_0</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:12.207784</td>\n",
       "      <td>2022-07-21 08:22:15.498472</td>\n",
       "      <td>3.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>231</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_167</td>\n",
       "      <td>/job_a_3_167</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:15.639460</td>\n",
       "      <td>2022-07-21 08:22:15.965042</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>232</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_333</td>\n",
       "      <td>/job_a_3_333</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:16.102696</td>\n",
       "      <td>2022-07-21 08:22:16.395679</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>233</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_5</td>\n",
       "      <td>/job_a_3_5</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:16.572264</td>\n",
       "      <td>2022-07-21 08:22:16.831195</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>234</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_667</td>\n",
       "      <td>/job_a_3_667</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:16.975377</td>\n",
       "      <td>2022-07-21 08:22:17.282200</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>235</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_3_833</td>\n",
       "      <td>/job_a_3_833</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:17.434798</td>\n",
       "      <td>2022-07-21 08:22:17.694902</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>236</td>\n",
       "      <td>finished</td>\n",
       "      <td>Li</td>\n",
       "      <td>job_a_4_0</td>\n",
       "      <td>/job_a_4_0</td>\n",
       "      <td>None</td>\n",
       "      <td>/home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/</td>\n",
       "      <td>2022-07-21 08:22:17.828141</td>\n",
       "      <td>2022-07-21 08:22:18.089258</td>\n",
       "      <td>0.0</td>\n",
       "      <td>pyiron@lap2p#1</td>\n",
       "      <td>Lammps</td>\n",
       "      <td>0.1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     id    status chemicalformula                 job               subjob  \\\n",
       "0   212  finished            None                full                /full   \n",
       "1   213  finished            None               basic               /basic   \n",
       "2   214  finished            None  Al_basic_atomicrex  /Al_basic_atomicrex   \n",
       "3   226  finished            None        data_lithium        /data_lithium   \n",
       "4   227  finished            None           fit_mode1           /fit_mode1   \n",
       "5   228  finished            None           fit_mode2           /fit_mode2   \n",
       "6   229  finished            None           fit_mode3           /fit_mode3   \n",
       "7   230  finished              Li           job_a_3_0           /job_a_3_0   \n",
       "8   231  finished              Li         job_a_3_167         /job_a_3_167   \n",
       "9   232  finished              Li         job_a_3_333         /job_a_3_333   \n",
       "10  233  finished              Li           job_a_3_5           /job_a_3_5   \n",
       "11  234  finished              Li         job_a_3_667         /job_a_3_667   \n",
       "12  235  finished              Li         job_a_3_833         /job_a_3_833   \n",
       "13  236  finished              Li           job_a_4_0           /job_a_4_0   \n",
       "\n",
       "   projectpath  \\\n",
       "0         None   \n",
       "1         None   \n",
       "2         None   \n",
       "3         None   \n",
       "4         None   \n",
       "5         None   \n",
       "6         None   \n",
       "7         None   \n",
       "8         None   \n",
       "9         None   \n",
       "10        None   \n",
       "11        None   \n",
       "12        None   \n",
       "13        None   \n",
       "\n",
       "                                                                                                                          project  \\\n",
       "0               /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/   \n",
       "1               /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/   \n",
       "2               /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/   \n",
       "3               /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/   \n",
       "4   /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/   \n",
       "5   /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/   \n",
       "6   /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/fit_lithium/   \n",
       "7     /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "8     /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "9     /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "10    /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "11    /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "12    /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "13    /home/kalm/dcmnts/uni/promotion/work/workshop_bochum/handson_notebook/workshop_preparation/introduction/training/E_V_curve/   \n",
       "\n",
       "                    timestart                   timestop  totalcputime  \\\n",
       "0  2022-06-03 00:02:34.816533                        NaT           NaN   \n",
       "1  2022-06-03 00:02:47.281055                        NaT           NaN   \n",
       "2  2022-06-03 02:00:15.887059                        NaT           NaN   \n",
       "3  2022-07-21 08:18:19.193090                        NaT           NaN   \n",
       "4  2022-07-21 08:18:22.251691 2022-07-21 08:18:28.328624           6.0   \n",
       "5  2022-07-21 08:18:30.158483 2022-07-21 08:22:02.862973         212.0   \n",
       "6  2022-07-21 08:22:04.136475 2022-07-21 08:22:11.189763           7.0   \n",
       "7  2022-07-21 08:22:12.207784 2022-07-21 08:22:15.498472           3.0   \n",
       "8  2022-07-21 08:22:15.639460 2022-07-21 08:22:15.965042           0.0   \n",
       "9  2022-07-21 08:22:16.102696 2022-07-21 08:22:16.395679           0.0   \n",
       "10 2022-07-21 08:22:16.572264 2022-07-21 08:22:16.831195           0.0   \n",
       "11 2022-07-21 08:22:16.975377 2022-07-21 08:22:17.282200           0.0   \n",
       "12 2022-07-21 08:22:17.434798 2022-07-21 08:22:17.694902           0.0   \n",
       "13 2022-07-21 08:22:17.828141 2022-07-21 08:22:18.089258           0.0   \n",
       "\n",
       "          computer           hamilton hamversion  parentid masterid  \n",
       "0   zora@cmti001#1  TrainingContainer        0.4       NaN     None  \n",
       "1   zora@cmti001#1  TrainingContainer        0.4       NaN     None  \n",
       "2   zora@cmti001#1  TrainingContainer        0.4       NaN     None  \n",
       "3   pyiron@lap2p#1  TrainingContainer        0.4       NaN     None  \n",
       "4   pyiron@lap2p#1          RunnerFit        0.4       NaN     None  \n",
       "5   pyiron@lap2p#1          RunnerFit        0.4     227.0     None  \n",
       "6   pyiron@lap2p#1          RunnerFit        0.4     228.0     None  \n",
       "7   pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "8   pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "9   pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "10  pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "11  pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "12  pyiron@lap2p#1             Lammps        0.1       NaN     None  \n",
       "13  pyiron@lap2p#1             Lammps        0.1       NaN     None  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pr.job_table()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "The training data is stored in the project node `initial`."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "source": [
    "In order to get a feeling for the data, we inspect its energy-volume curve:"
   ]
  },
  {
   "cell_type": "code",
    "slideshow": {
     "slide_type": "fragment"
    }