Overview¶
OpenQDC is a python library to work with quantum datasets. It's a package aimed at providing a simple and efficient way to download, load and utilize various datasets and provide a way to standardize the data for easy use in machine learning models.
- đ Simple pythonic API
- đšī¸ ML-Ready: all you manipulate are
torch.Tensor
,jax.Array
ornumpy.Array
objects. - âī¸ Quantum Ready: The quantum methods are checked and standardized to provide addictional values.
- â Standardized: The datasets are written in standard and performant formats with annotated metadata like units and labels.
- đ§ Performance matters: read and write multiple formats (memmap, zarr, xyz, etc).
- đ Data: have access to 1.5+ billion datapoints
Visit our website at https://openqdc.io .
Installation¶
Use mamba:
conda install -c conda-forge openqdc
Tips: You can replace conda
by mamba
.
Note: We highly recommend using a Conda Python distribution to install OpenQDC. The package is also pip installable if you need it: pip install openqdc
.
Quick API Tour¶
from openqdc as Spice
# Load the original dataset
dataset = Spice()
# Load the dataset with a different units
dataset = Spice(
energy_unit = "kcal/mol",
distance_unit = "ang",
energy_type = "formation",
array_format = "torch"
)
# Access the data
data = dataset[0]
# Get relevant statistics
dataset.get_statistics()
# Get dataset metadata
dataset.average_n_atoms
dataset.chemical_species
dataset.charges
# Compute physical descriptors
dataset.calculate_descriptors(
descriptor_name = "soap"
)
How to cite¶
Please cite OpenQDC if you use it in your research: .
Compatibilities¶
OpenQDC is compatible with Python >= 3.8 and is tested on Linux, MacOS and Windows.