critcatworks.workflows package
Submodules
critcatworks.workflows.clusgen module
critcatworks.workflows.clusgen.generate_nanoclusters_workflow(username, password, worker_target_path=None, extdb_connect={}, shape='ico', nanocluster_size=3, compositions='', elements=[], generate_pure_nanoclusters=True, n_configurations=10, n_initial_configurations=100, bondlength_dct={})
Workflow to generate binary nanoclusters of a defined size and shape. For each binary element combination and each composition, n_configurations maximally dissimilar structures are created and uploaded to the simulations collection of the MongoDB database.
The structures are generated with the cluster generator of the Python package cluskit.
- Parameters
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
shape (str) – shape of the nanoclusters. Options are ‘ico’, ‘octa’ and ‘wulff’
nanocluster_size (int) – nanocluster size. Its meaning depends on the shape
compositions (list) – each element determines the number of atoms of type 1.
elements (list) – elements (str) to iterate over
generate_pure_nanoclusters (bool) – if set to True, pure nanoclusters are generated as well
n_configurations (int) – number of configurations per composition (chosen based on maximally different structural features)
n_initial_configurations (int) – number of initial configurations per composition to choose from (a higher number makes the grid finer)
bondlength_dct (dict) – bond lengths to use for specific elements. Some default bond lengths are provided for common elements
- Returns
generate_nanoclusters Fireworks Workflow object
- Return type
fireworks.Workflow
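The loop structure described above (every binary element combination, every composition, optionally the pure clusters) can be sketched in plain Python. This is only an illustration: `enumerate_cluster_jobs` is a hypothetical helper that is not part of critcatworks, and it assumes that element pairs are unordered and that each composition entry counts the atoms of the first element of the pair.

```python
from itertools import combinations


def enumerate_cluster_jobs(elements, compositions, generate_pure_nanoclusters=True):
    """List the (element_1, element_2, n_type_1) combinations to generate.

    Hypothetical helper for illustration only. Pure clusters are
    reported as (element, None, None).
    """
    jobs = []
    if generate_pure_nanoclusters:
        for element in elements:
            jobs.append((element, None, None))
    # Assumption: binary pairs are unordered, so ("Pt", "Ni") appears once.
    for first, second in combinations(elements, 2):
        for n_type_1 in compositions:
            jobs.append((first, second, n_type_1))
    return jobs


jobs = enumerate_cluster_jobs(["Pt", "Ni", "Cu"], [13, 28, 42])
print(len(jobs))  # 3 pure clusters + 3 pairs x 3 compositions
```

For each such entry the workflow would then create n_configurations structures; that step relies on cluskit and is not reproduced here.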
critcatworks.workflows.coverage module
critcatworks.workflows.coverage.get_coverage_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate_name='H', max_iterations=10000, adsite_types=['top', 'bridge', 'hollow'], n_max_restarts=1, skip_dft=False, bond_length=1.4, n_remaining='', extdb_connect={})
Workflow to determine a stable coverage of a nanocluster with single adsorbate atoms. As a first step, adsorbates are put on top, bridge and hollow sites. Once the structure is relaxed by DFT, formed adsorbate molecules (pairs of atoms) are replaced by a single adsorbate. The procedure is repeated until no adsorbate molecules form.
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
structures (list) – list of ase.Atoms objects from where the workflow is started.
extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
source_path (str) – absolute path on the computing resource to the directory where to read the structures from
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
adsorbate_name (str) – element symbol of the adsorbed atom
max_iterations (int) – maximum number of iterations in the workflow
adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
bond_length (float) – distance in angstrom under which two adsorbed atoms are considered bound, hence too close
n_remaining (int) – number of adsorbates which should remain after the first pre-DFT pruning of the adsorbate coverage
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
coverage Fireworks Workflow object
- Return type
fireworks.Workflow
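The pruning step described above (replace adsorbate pairs that are closer than bond_length by a single adsorbate, repeat until no pair remains) can be sketched as follows. The function names are hypothetical, the positions are made-up Cartesian coordinates in angstrom, and the choice to keep the first atom of each pair is an assumption. The DFT relaxation that the workflow runs between iterations is left out.

```python
from itertools import combinations
from math import dist


def find_bound_pairs(positions, bond_length=1.4):
    """Return index pairs of adsorbates closer than bond_length (angstrom)."""
    return [
        (i, j)
        for i, j in combinations(range(len(positions)), 2)
        if dist(positions[i], positions[j]) < bond_length
    ]


def prune_pairs(positions, bond_length=1.4):
    """Drop one atom of each bound pair until no pair remains.

    Assumption: the second atom of the pair is the one removed.
    """
    remaining = list(positions)
    while True:
        pairs = find_bound_pairs(remaining, bond_length)
        if not pairs:
            return remaining
        _, drop = pairs[0]
        del remaining[drop]


adsorbates = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 0.9, 0.0)]
print(prune_pairs(adsorbates))
```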
critcatworks.workflows.coverageladder module
critcatworks.workflows.coverageladder.get_coverage_ladder_workflow(template_path, username, password, worker_target_path=None, start_ids=None, reference_energy=0.0, free_energy_correction=0.0, adsorbate_name='H', max_iterations=100, n_max_restarts=1, skip_dft=False, bond_length=1.5, d=4, l=2, k=7, initial_direction=1, ranking_metric='similarity', extdb_connect={})
Workflow to determine a stable coverage of a nanocluster with single adsorbate atoms. One adsorbate at a time is added or removed until certain break criteria are met. Currently only d and max_iterations act as stopping criteria. d, l, k, initial_direction and ranking_metric are parameters specific to the coverage ladder workflow.
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
start_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
free_energy_correction (float) – free energy correction of the adsorption reaction at hand
adsorbate_name (str) – element symbol of the adsorbed atom
max_iterations (int) – maximum number of iterations in the workflow
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
bond_length (float) – distance in angstrom under which two adsorbed atoms are considered bound, hence too close
d (int) – maximum depth of the coverage ladder (termination criterion)
l (int) – number of low-energy structures to carry over to the next step
k (int) – number of empty candidate sites for adding / adsorbed atoms for removing to consider per step
initial_direction (bool) – True will force the initial step to add an adsorbate, False will force the initial step to remove an adsorbate
ranking_metric (str) – ‘similarity’ or ‘distance’. Metric based on which to choose k candidates (empty sites / adsorbates)
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
coverageladder Fireworks Workflow object
- Return type
fireworks.Workflow
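The role of l (the number of low-energy structures carried over to the next step) can be illustrated with a small selection routine. This is a sketch under assumptions: `carry_over` and the (structure_id, energy) tuples are hypothetical stand-ins, and the way the workflow ranks the k candidates by similarity or distance is not reproduced.

```python
import heapq


def carry_over(candidates, n_keep=2):
    """Keep the n_keep lowest-energy candidates (n_keep plays the role of l).

    candidates: list of (structure_id, energy) tuples; both fields are
    hypothetical stand-ins for whatever the workflow stores.
    """
    return heapq.nsmallest(n_keep, candidates, key=lambda item: item[1])


# Made-up energies for the candidates evaluated in one ladder step.
step_results = [("a", -1.20), ("b", -1.35), ("c", -0.90), ("d", -1.31)]
print(carry_over(step_results, n_keep=2))
```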
critcatworks.workflows.molsinglesites module
critcatworks.workflows.molsinglesites.get_molsinglesites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate={}, chunk_size=100, max_calculations=10000, adsite_types=['top', 'bridge', 'hollow'], threshold=0.1, n_max_restarts=1, skip_dft=False, extdb_connect={})
Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their local structural dissimilarity. The adsorption energy is determined by a simulation code (e.g. CP2K) in chunks in a loop. The adsorption energies of the uncalculated structures are inferred by machine learning. Once the generalization error is low enough, the workflow stops.
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
structures (list) – list of ase.Atoms objects from where the workflow is started.
extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
source_path (str) – absolute path on the computing resource to the directory where to read the structures from
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
adsorbate (dict) – adsorbed molecule as atoms dict. Contains an “X” dummy atom which indicates the anchor point to the nanocluster
chunk_size (int) – number of calculations to be run simultaneously. A value of -1 means all calculations are run at once.
max_calculations (int) – maximum number of iterations in the workflow
adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”
threshold (float) – convergence criterion on the ML accuracy. Once the generalization error falls below this value, the workflow is defused.
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
molsinglesites Fireworks Workflow object
- Return type
fireworks.Workflow
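Farthest point sampling, which the description above uses to rank adsorption sites, can be sketched in a few lines. This is a generic version under assumptions: sites are represented by made-up two-dimensional feature vectors, dissimilarity is the Euclidean distance, and the starting point is index 0. The descriptor and metric that critcatworks actually uses are not reproduced.

```python
from math import dist


def farthest_point_sampling(features, n_select, start=0):
    """Return indices of n_select points, each farthest from those already picked."""
    if not features:
        return []
    n_select = min(n_select, len(features))
    selected = [start]
    # Distance of every point to its nearest selected point so far.
    min_dist = [dist(f, features[start]) for f in features]
    while len(selected) < n_select:
        nxt = max(range(len(features)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, dist(f, features[nxt])) for d, f in zip(min_dist, features)]
    return selected


site_features = [(0.0, 0.0), (0.1, 0.0), (1.5, 0.0), (0.0, 2.0), (1.0, 2.1)]
print(farthest_point_sampling(site_features, 3))
```

The returned order is the ranking: early entries are the most mutually dissimilar sites, so computing them first covers the feature space quickly.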
critcatworks.workflows.nanoclusters module
critcatworks.workflows.nanoclusters.get_nanoclusters_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, atomic_energies={}, n_max_restarts=1, skip_dft=False, extdb_connect={})
Workflow to relax the structure of a set of nanoclusters using a simulation software (e.g. CP2K). The cohesive energies are calculated and summarized.
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
structures (list) – list of ase.Atoms objects from where the workflow is started.
extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
source_path (str) – absolute path on the computing resource to the directory where to read the structures from
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
atomic_energies (dict) – used for computing cohesive energies, not required
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
nanocluster Fireworks Workflow object
- Return type
fireworks.Workflow
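How atomic_energies can enter a cohesive energy is sketched below. The formula is the common textbook definition (total energy minus the sum of isolated-atom energies, divided by the number of atoms). The sign and normalization that critcatworks uses are not stated above, so treat both as assumptions. `cohesive_energy` is a hypothetical helper and the numbers are made up.

```python
from collections import Counter


def cohesive_energy(total_energy, symbols, atomic_energies):
    """Cohesive energy per atom: (E_total - sum of isolated-atom energies) / N."""
    counts = Counter(symbols)
    reference = sum(atomic_energies[s] * n for s, n in counts.items())
    return (total_energy - reference) / len(symbols)


atomic_energies = {"Pt": -1.0, "Ni": -2.0}  # made-up numbers, illustration only
symbols = ["Pt"] * 3 + ["Ni"]
print(cohesive_energy(-9.0, symbols, atomic_energies))
```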
critcatworks.workflows.simpleflow module
class critcatworks.workflows.simpleflow.SimpleTestTask(*args, **kwargs)
Bases: fireworks.core.firework.FiretaskBase
Simple FireTask to see what attributes FireTasks have.
optional_params = []
required_params = []
run_task(fw_spec)
This method gets called when the Firetask is run. It can take in a Firework spec, perform some task using that data, and then return an output in the form of a FWAction.
- Parameters
fw_spec (dict) – A Firework spec. This comes from the master spec. In addition, this spec contains a special “_fw_env” key that contains the env settings of the FWorker calling this method. This provides for abstracting out certain commands or settings. For example, “foo” may be named “foo1” in resource 1 and “foo2” in resource 2. The FWorker env can specify { “foo”: “foo1”}, which maps an abstract variable “foo” to the relevant “foo1” or “foo2”. You can then write a task that uses fw_spec[“_fw_env”][“foo”] that will work across all these multiple resources.
- Returns
(FWAction)
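The “_fw_env” lookup described above can be tried out with plain dictionaries standing in for the Firework spec. This sketch does not import FireWorks. `resolve_from_env` is a hypothetical helper, and falling back to the abstract name when no mapping exists is an assumption.

```python
def resolve_from_env(fw_spec, name):
    """Look up the resource-specific value for an abstract name in fw_spec["_fw_env"]."""
    # Assumption: fall back to the abstract name itself if nothing is mapped.
    return fw_spec.get("_fw_env", {}).get(name, name)


# Plain dicts standing in for the spec seen on two different resources.
spec_resource_1 = {"_fw_env": {"foo": "foo1"}}
spec_resource_2 = {"_fw_env": {"foo": "foo2"}}
print(resolve_from_env(spec_resource_1, "foo"))
print(resolve_from_env(spec_resource_2, "foo"))
```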
critcatworks.workflows.singlesites module
critcatworks.workflows.singlesites.get_singlesites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate_name='H', chunk_size=100, max_calculations=10000, adsite_types=['top', 'bridge', 'hollow'], threshold=0.1, n_max_restarts=1, skip_dft=False, extdb_connect={})
Attention! This workflow is for atomic adsorbates only! Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their local structural dissimilarity. The adsorption energy is determined by a simulation code (e.g. CP2K) in chunks in a loop. The adsorption energies of the uncalculated structures are inferred by machine learning. Once the generalization error is low enough, the workflow stops.
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
structures (list) – list of ase.Atoms objects from where the workflow is started.
extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
source_path (str) – absolute path on the computing resource to the directory where to read the structures from
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
adsorbate_name (str) – element symbol of the adsorbed atom
chunk_size (int) – number of calculations to be run simultaneously. A value of -1 means all calculations are run at once.
max_calculations (int) – maximum number of iterations in the workflow
adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”
threshold (float) – convergence criterion on the ML accuracy. Once the generalization error falls below this value, the workflow is defused.
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
singlesites Fireworks Workflow object
- Return type
fireworks.Workflow
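The interplay of chunk_size, max_calculations and threshold described above can be sketched as a loop that computes sites chunk by chunk and stops once an error estimate drops below the threshold. Everything here is a stand-in: `run_in_chunks` is hypothetical, and `compute` and `estimate_error` replace the simulation and machine-learning steps with toy functions.

```python
def run_in_chunks(ranked_sites, compute, estimate_error,
                  chunk_size=100, max_calculations=10000, threshold=0.1):
    """Compute sites chunk by chunk until the error estimate drops below threshold.

    compute and estimate_error are caller-supplied stand-ins for the
    simulation step and the machine-learning step.
    """
    if chunk_size != -1 and chunk_size < 1:
        raise ValueError("chunk_size must be positive or -1")
    results = {}
    limit = min(len(ranked_sites), max_calculations)
    start = 0
    while start < limit:
        # chunk_size == -1 means everything is run in a single chunk.
        stop = limit if chunk_size == -1 else min(start + chunk_size, limit)
        for site in ranked_sites[start:stop]:
            results[site] = compute(site)
        start = stop
        if estimate_error(results) < threshold:
            break
    return results


# Toy stand-ins: the energy is a simple function, the error shrinks with more data.
done = run_in_chunks(
    list(range(50)),
    compute=lambda site: -0.01 * site,
    estimate_error=lambda results: 1.0 / len(results),
    chunk_size=4,
)
print(len(done))
```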
critcatworks.workflows.uniquemolsites module
critcatworks.workflows.uniquemolsites.get_uniquemolsites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate={}, adsite_types=['top', 'bridge', 'hollow'], threshold=0.001, n_max_restarts=1, skip_dft=False, extdb_connect={})
Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their local structural dissimilarity. Only sites which are more dissimilar than the given threshold are computed. The adsorption energy is determined by a simulation code (e.g. CP2K).
- Parameters
template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.
username (str) – user who executed the workflow
password (str) – password for user to upload to the database
worker_target_path (str) – absolute path on the computing resource; the directory needs to exist
structures (list) – list of ase.Atoms objects from where the workflow is started.
extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow
source_path (str) – absolute path on the computing resource to the directory where to read the structures from
reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point
adsorbate (dict) – adsorbed molecule as atoms dict. Contains an “X” dummy atom which indicates the anchor point to the nanocluster
adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”
threshold (float) – threshold of similarity metric between the local structures of the adsorption sites. Only sites which are more dissimilar than the given threshold are computed
n_max_restarts (int) – number of times the calculation is restarted upon failure
skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.
extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.
- Returns
uniquemolsites Fireworks Workflow object
- Return type
fireworks.Workflow
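The threshold filter described above (only sites more dissimilar than the threshold are computed) can be sketched as a greedy pass over the sites. This is an illustration under assumptions: `unique_sites` is a hypothetical helper, the feature vectors are made up, dissimilarity is the Euclidean distance, and sites are visited in the order given rather than in the farthest-point-sampling order used by the workflow.

```python
from math import dist


def unique_sites(features, threshold=0.001):
    """Return indices of sites farther than threshold from every site kept so far."""
    kept = []
    for index, feature in enumerate(features):
        if all(dist(feature, features[k]) > threshold for k in kept):
            kept.append(index)
    return kept


site_features = [(0.0, 0.0), (0.0005, 0.0), (0.5, 0.0), (0.5, 0.0002), (1.0, 1.0)]
print(unique_sites(site_features))
```

A larger threshold merges more sites into one representative and so reduces the number of simulations, at the cost of treating slightly different sites as equivalent.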