Bases: BaseDataset
Revised MD (RevMD17) improves upon the MD17 dataset by removing all the numerical noise present in the original
dataset. The data is generated from an ab-initio molecular dynamics (AIMD) simulation where forces and energies
are computed at the PBE/def2-SVP level of theory using very tight SCF convergence and a very dense DFT integration
grid. The dataset contains the following molecules:
Benzene: 627000 samples
Uracil: 133000 samples
Naphthalene: 326000 samples
Aspirin: 211000 samples
Salicylic Acid: 320000 samples
Malonaldehyde: 993000 samples
Ethanol: 555000 samples
Toluene: 100000 samples
Usage:
from openqdc.datasets import RevMD17
dataset = RevMD17()
References
https://arxiv.org/abs/2007.09593
Source code in openqdc/datasets/potential/revmd17.py
class RevMD17(BaseDataset):
    """
    Revised MD (RevMD17) improves upon the MD17 dataset by removing all the numerical noise present in the original
    dataset. The data is generated from an ab-initio molecular dynamics (AIMD) simulation where forces and energies
    are computed at the PBE/def2-SVP level of theory using very tight SCF convergence and a very dense DFT integration
    grid. The dataset contains the following molecules:
        Benzene: 627000 samples\n
        Uracil: 133000 samples\n
        Naphthalene: 326000 samples\n
        Aspirin: 211000 samples\n
        Salicylic Acid: 320000 samples\n
        Malonaldehyde: 993000 samples\n
        Ethanol: 555000 samples\n
        Toluene: 100000 samples\n

    Usage:
    ```python
    from openqdc.datasets import RevMD17
    dataset = RevMD17()
    ```

    References:
        https://arxiv.org/abs/2007.09593
    """

    __name__ = "revmd17"

    __energy_methods__ = [
        PotentialMethod.PBE_DEF2_TZVP
        # "pbe/def2-tzvp",
    ]
    __force_mask__ = [True]

    energy_target_names = [
        "PBE-TS Energy",
    ]

    __force_methods__ = [
        "pbe/def2-tzvp",
    ]

    force_target_names = [
        "PBE-TS Gradient",
    ]
    __links__ = {"revmd17.zip": "https://figshare.com/ndownloader/articles/12672038/versions/3"}

    __energy_unit__ = "kcal/mol"
    __distance_unit__ = "ang"
    __forces_unit__ = "kcal/mol/ang"

    def read_raw_entries(self):
        entries_list = []
        # Unpack the archive under the dataset root before reading the trajectories.
        decompress_tar_gz(p_join(self.root, "rmd17.tar.bz2"))
        # `trajectories` and `read_npz_entry` are not part of this class; they come from elsewhere in the module.
        for trajectory in trajectories:
            entries_list.append(read_npz_entry(trajectory, self.root))
        return entries_list
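
The `read_npz_entry` helper called in `read_raw_entries` is not shown in this listing. As a rough illustration of what reading one trajectory might involve, here is a minimal sketch using plain NumPy. It is not the library's implementation. The function names `read_npz_sketch` and `make_fake_trajectory` are hypothetical, and the array keys (`nuclear_charges`, `coords`, `energies`, `forces`) are an assumption about the archive layout. The data is synthetic, so the sketch runs without downloading anything. The conversion factor is approximate and only shows how the `kcal/mol` unit declared above could be converted to eV.

```python
import os
import tempfile

import numpy as np

# Approximate conversion factor: 1 kcal/mol is about 0.0433641 eV.
KCAL_PER_MOL_TO_EV = 0.0433641


def read_npz_sketch(path):
    """Load one trajectory archive into a plain dict (hypothetical helper).

    The key names are assumptions about the archive layout, not taken
    from the listing above.
    """
    with np.load(path) as data:
        return {
            "atomic_numbers": data["nuclear_charges"].astype(np.int32),
            "positions": data["coords"].astype(np.float32),
            "energies": data["energies"].astype(np.float64),
            "forces": data["forces"].astype(np.float32),
        }


def make_fake_trajectory(path, n_frames=4, seed=0):
    """Write a small synthetic archive with the assumed keys."""
    rng = np.random.default_rng(seed)
    charges = np.array([8, 1, 1])  # a water-like toy molecule
    n_atoms = len(charges)
    np.savez(
        path,
        nuclear_charges=charges,
        coords=rng.normal(size=(n_frames, n_atoms, 3)),
        energies=rng.normal(size=n_frames),
        forces=rng.normal(size=(n_frames, n_atoms, 3)),
    )


with tempfile.TemporaryDirectory() as tmp:
    fake_path = os.path.join(tmp, "fake_trajectory.npz")
    make_fake_trajectory(fake_path)
    entry = read_npz_sketch(fake_path)

# Convert the per-frame energies from kcal/mol to eV.
energies_ev = entry["energies"] * KCAL_PER_MOL_TO_EV
print(entry["positions"].shape, entry["forces"].shape, energies_ev.shape)
```

Returning a plain dict of arrays per trajectory keeps the sketch close to the loop in `read_raw_entries`, which appends one entry per trajectory to a list.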