Qmugs
QMugs
Bases: BaseDataset
The QMugs dataset contains 2 million conformers for 665k biologically and pharmacologically relevant molecules extracted from the ChEMBL database. Three geometries per molecule are generated and optimized with the GFN2-xTB method. Atomic and molecular properties are then calculated on the optimized geometries with both a semi-empirical method (GFN2-xTB) and a DFT method (ωB97X-D/def2-SVP).
Usage:
from openqdc.datasets import QMugs
dataset = QMugs()
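Before instantiating the dataset, it can help to confirm that the class is importable in the current environment. The following is a minimal sketch, not part of openqdc: the helper name `load_dataset_class` is hypothetical, and the only thing taken from the usage above is the `openqdc.datasets` module path. It deliberately stops short of constructing the dataset, since what construction does (for example reading or fetching data) is not described here.

```python
import importlib


def load_dataset_class(name: str):
    """Hypothetical helper: return the class `name` from openqdc.datasets.

    Returns None if openqdc cannot be imported or exposes no such attribute.
    """
    try:
        module = importlib.import_module("openqdc.datasets")
    except ImportError:
        # openqdc (or one of its dependencies) is not available here.
        return None
    # getattr with a default also covers names the module does not expose.
    return getattr(module, name, None)


qmugs_cls = load_dataset_class("QMugs")
if qmugs_cls is None:
    print("QMugs could not be imported from openqdc.datasets")
else:
    print(f"Found {qmugs_cls.__name__}; instantiate it with qmugs_cls()")
```

The same check applies to any other dataset class by passing its name as the string argument.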
References
https://arxiv.org/abs/2107.00367
https://www.nature.com/articles/s41597-022-01390-7#ethics
https://www.research-collection.ethz.ch/handle/20.500.11850/482129
Source code in openqdc/datasets/potential/qmugs.py
QMugs_V2
Bases: QMugs
QMugs_V2 is an extension of the QMugs dataset containing PM6 labels for each of the 4.2M geometries.
Usage:
from openqdc.datasets import QMugs_V2
dataset = QMugs_V2()
Source code in openqdc/datasets/potential/qmugs.py