TrainingΒΆ
Training of custom EMLE
models can be performed with the emle-train
executable. It requires a tarball with the reference QM calculations with the
same naming convention as used one for error analysis,
with the difference that only gas phase calculations are required and dipolar
polarizabilies must be present.
Simple usage:
emle-train --orca-tarball orca.tar model.mat
The resulting model.mat
file can then be used with emle-engine
, either
specifying the model via --emle-model
when launcing a server, using the
EMLE_MODEL
environment variable, or as a direct argument to the EMLECalculator
or Torch module constructors.
A full list of argument and their default values can be
printed with emle-train -h
:
usage: emle-train [-h] --orca-tarball name.tar [--train-mask] [--sigma] [--ivm-thr] [--epochs]
[--lr-qeq] [--lr-thole] [--lr-sqrtk] [--print-every] [--computer-n-species]
[--computer-zid-map] [--plot-data name.mat]
output
EMLE training script
positional arguments:
output Output model file
options:
-h, --help show this help message and exit
--orca-tarball name.tar
ORCA tarball (default: None)
--train-mask Mask for training set (default: None)
--sigma Sigma value for GPR (default: 0.001)
--ivm-thr IVM threshold (default: 0.05)
--epochs Number of training epochs (default: 100)
--lr-qeq Learning rate for QEq params (a_QEq, chi_ref) (default: 0.05)
--lr-thole Learning rate for Thole model params (a_Thole, k_Z) (default: 0.05)
--lr-sqrtk Learning rate for polarizability scaling factors (sqrtk_ref) (default: 0.05)
--print-every How often to print training progress (default: 10)
--computer-n-species Number of species supported by AEV computer (default: None)
--computer-zid-map Map between EMLE and AEV computer zid values (default: None)
--plot-data name.mat Data for plotting (default: None)
Here train-mask
is a boolean file containing zeros and ones that defines the
subset of the full training set (provided as --orca-tarball
) that is used for
training. Note that the values written to --plot-data
are for the full training
set, which allows generation of prediction plots for the train/test sets.
The --computer-n-species
and --computer-zid-map
arguments are only
needed when using a common AEV computer for both gas phase backend and EMLE
model.