Transition1X

Bases: BaseDataset

The Transition1x dataset contains structures from 10k organic reaction pathways of various types. It provides energy and force labels for 9.6 million conformers calculated at the wB97x/6-31G(d) level of theory. The geometries and transition states were generated by running Nudged Elastic Band (NEB) calculations with DFT.

Usage:

from openqdc.datasets import Transition1X
dataset = Transition1X()

References:

- https://www.nature.com/articles/s41597-022-01870-w
- https://gitlab.com/matschreiner/Transition1x

Source code in openqdc/datasets/potential/transition1x.py
class Transition1X(BaseDataset):
    """
    The Transition1x dataset contains structures from 10k organic reaction pathways of various types. It provides
    energy and force labels for 9.6 million conformers calculated at the wB97x/6-31G(d) level of theory. The
    geometries and transition states were generated by running Nudged Elastic Band (NEB) calculations with DFT.

    Usage:
    ```python
    from openqdc.datasets import Transition1X
    dataset = Transition1X()
    ```

    References:
    - https://www.nature.com/articles/s41597-022-01870-w\n
    - https://gitlab.com/matschreiner/Transition1x\n
    """

    __name__ = "transition1x"

    __energy_methods__ = [
        PotentialMethod.WB97X_6_31G_D,  # "wb97x/6-31G(d)"
    ]

    energy_target_names = [
        "wB97x_6-31G(d).energy",
    ]

    __force_mask__ = [True]
    force_target_names = [
        "wB97x_6-31G(d).forces",
    ]

    __energy_unit__ = "ev"
    __distance_unit__ = "ang"
    __forces_unit__ = "ev/ang"
    __links__ = {"Transition1x.h5": "https://figshare.com/ndownloader/files/36035789"}

    def read_raw_entries(self):
        raw_path = p_join(self.root, "Transition1x.h5")
        f = load_hdf5_file(raw_path)["data"]

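        # Read groups sequentially (don't use parallelized here) and flatten the
        # per-group lists in one pass instead of repeated list concatenation.
        res = [entry for g in tqdm(f) for entry in read_record(f[g], group=g)]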
        return res
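
`read_raw_entries` collects one list of entries per HDF5 group and concatenates them into a single flat list. The following is a minimal sketch of that flattening step only. It assumes each per-group call returns a plain Python list; `read_record_stub` and the group names are hypothetical stand-ins, not part of openqdc. It compares the `sum(..., [])` idiom with `itertools.chain.from_iterable`, which gives the same result without building an intermediate list at every step.

```python
from itertools import chain


def read_record_stub(group_name, n_entries):
    """Hypothetical stand-in for read_record: returns a list of entry dicts."""
    return [{"group": group_name, "index": i} for i in range(n_entries)]


# Hypothetical group names and entry counts, for illustration only.
groups = {"rxn_a": 2, "rxn_b": 3, "rxn_c": 1}

# One list of entries per group, as in read_raw_entries.
per_group = [read_record_stub(name, n) for name, n in groups.items()]

# Idiom used in the source: repeated list concatenation (quadratic in the worst case).
flat_sum = sum(per_group, [])

# Equivalent single-pass flattening.
flat_chain = list(chain.from_iterable(per_group))

print(len(flat_chain), flat_sum == flat_chain)
```

With thousands of reaction groups, the single-pass form avoids recopying the growing result list on every concatenation, although the HDF5 reads themselves are likely to dominate the runtime.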