critcatworks.workflows package

Submodules

critcatworks.workflows.clusgen module

critcatworks.workflows.clusgen.generate_nanoclusters_workflow(username, password, worker_target_path=None, extdb_connect={}, shape='ico', nanocluster_size=3, compositions='', elements=[], generate_pure_nanoclusters=True, n_configurations=10, n_initial_configurations=100, bondlength_dct={})[source]

Workflow to generate binary nanoclusters of defined size and shape. For each binary element combination and each composition, n_configurations maximally dissimilar structures are created and uploaded to the simulations collection of the MongoDB database.

The structures are generated with the cluster generator of the Python package cluskit.

Parameters
  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

  • shape (str) – shape of the nanoclusters: ‘ico’, ‘octa’ or ‘wulff’

  • nanocluster_size (int) – determines nanocluster size. Meaning depends on shape

  • compositions (list) – each entry gives the number of atoms of element type 1

  • elements (list) – elements (str) to iterate over

  • generate_pure_nanoclusters (bool) – if set to True, pure nanoclusters are generated as well

  • n_configurations (int) – number of configurations per composition (chosen based on maximally different structural features)

  • n_initial_configurations (int) – number of initial configurations per composition to choose from (higher number will make the grid finer)

  • bondlength_dct (dict) – bond lengths to use for specific elements. Some default bond lengths are provided for common elements

Returns

generate_nanoclusters Fireworks Workflow object

Return type

fireworks.Workflow
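
As a minimal sketch of how the parameters above fit together, the following collects the keyword arguments for generate_nanoclusters_workflow in a plain dictionary. The helper name build_clusgen_kwargs is hypothetical, and every value (host, credentials, path, elements, compositions) is a placeholder. The form of compositions follows the parameter description and has not been checked against the implementation.

```python
def build_clusgen_kwargs(username, password):
    """Collect keyword arguments for generate_nanoclusters_workflow.

    Hypothetical helper; every value below is a placeholder.
    """
    # Keys follow the extdb_connect description above.
    extdb_connect = {
        "host": "localhost",
        "username": username,
        "password": password,
        "authsource": "admin",
        "db_name": "testdb",
    }
    return {
        "username": username,
        "password": password,
        "worker_target_path": "/scratch/nanoclusters",
        "extdb_connect": extdb_connect,
        "shape": "ico",
        "nanocluster_size": 3,
        "compositions": [13, 28, 42],  # atoms of element type 1 (placeholder counts)
        "elements": ["Pt", "Ni"],
        "generate_pure_nanoclusters": True,
        "n_configurations": 10,
        "n_initial_configurations": 100,
        "bondlength_dct": {},
    }


kwargs = build_clusgen_kwargs("myuser", "mypassword")

# With critcatworks installed, the workflow object could then be built as:
#   from critcatworks.workflows.clusgen import generate_nanoclusters_workflow
#   wf = generate_nanoclusters_workflow(**kwargs)
```

Keeping the arguments in one dictionary makes it easy to store them next to the results or to reuse the same extdb_connect settings for the other workflows.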

critcatworks.workflows.coverage module

critcatworks.workflows.coverage.get_coverage_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate_name='H', max_iterations=10000, adsite_types=['top', 'bridge', 'hollow'], n_max_restarts=1, skip_dft=False, bond_length=1.4, n_remaining='', extdb_connect={})[source]

Workflow to determine a stable coverage of a nanocluster with single adsorbate atoms. As a first step, adsorbates are put on top, bridge and hollow sites. Once the structure is relaxed by DFT, formed adsorbate molecules (pairs of atoms) are replaced by a single adsorbate. The procedure is repeated until no adsorbate molecules form.

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • structures (list) – list of ase.Atoms objects from where the workflow is started.

  • extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • source_path (str) – absolute path on the computing resource to the directory where to read the structures from

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • adsorbate_name (str) – element symbol of the adsorbed atom

  • max_iterations (int) – maximum number of iterations in the workflow

  • adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • bond_length (float) – distance in angstrom below which two adsorbed atoms are considered bound, hence too close

  • n_remaining (int) – number of adsorbates which should remain after the first pre-DFT pruning of the adsorbate coverage

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

coverage Fireworks Workflow object

Return type

fireworks.Workflow
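
The bond_length criterion can be illustrated with a short sketch that finds adsorbate pairs closer than the cutoff and replaces each pair by a single adsorbate. The function names are hypothetical, positions are plain coordinate tuples in angstrom, and placing the remaining adsorbate at the pair's midpoint is an assumption made for illustration only; the workflow may position it differently.

```python
import math
from itertools import combinations


def find_bound_pairs(positions, bond_length=1.4):
    """Return index pairs of adsorbate atoms closer than bond_length (angstrom)."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(positions), 2)
        if math.dist(a, b) < bond_length
    ]


def merge_bound_pairs(positions, bond_length=1.4):
    """Replace each bound pair by a single adsorbate at the pair's midpoint.

    The midpoint placement is an assumption of this sketch.
    """
    removed = set()
    merged = []
    for i, j in find_bound_pairs(positions, bond_length):
        if i in removed or j in removed:
            continue  # each atom takes part in at most one merge
        removed.update((i, j))
        merged.append(tuple((a + b) / 2 for a, b in zip(positions[i], positions[j])))
    kept = [p for idx, p in enumerate(positions) if idx not in removed]
    return kept + merged


adsorbates = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0), (5.0, 0.0, 0.0)]
pairs = find_bound_pairs(adsorbates)
pruned = merge_bound_pairs(adsorbates)
```

In the workflow this check is repeated after each relaxation until no pairs remain, which corresponds to calling merge_bound_pairs until find_bound_pairs returns an empty list.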

critcatworks.workflows.coverageladder module

critcatworks.workflows.coverageladder.get_coverage_ladder_workflow(template_path, username, password, worker_target_path=None, start_ids=None, reference_energy=0.0, free_energy_correction=0.0, adsorbate_name='H', max_iterations=100, n_max_restarts=1, skip_dft=False, bond_length=1.5, d=4, l=2, k=7, initial_direction=1, ranking_metric='similarity', extdb_connect={})[source]

Workflow to determine a stable coverage of a nanocluster with single adsorbate atoms. One adsorbate at a time is added or removed until certain break criteria are met. Currently only d and max_iterations are stopping criteria. d, l, k, initial_direction and ranking_metric are parameters specific to the coverage ladder workflow.

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • start_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • free_energy_correction (float) – free energy correction of the adsorption reaction at hand

  • adsorbate_name (str) – element symbol of the adsorbed atom

  • max_iterations (int) – maximum number of iterations in the workflow

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • bond_length (float) – distance in angstrom below which two adsorbed atoms are considered bound, hence too close

  • d (int) – maximum depth of the coverage ladder (termination criterion)

  • l (int) – number of low-energy structures to carry over to the next step

  • k (int) – number of empty candidate sites for adding / adsorbed atoms for removing to consider per step

  • initial_direction (bool) – True will force the initial step to add an adsorbate, False will force the initial step to remove an adsorbate

  • ranking_metric (str) – ‘similarity’ or ‘distance’. Metric based on which to choose k candidates (empty sites / adsorbates)

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

coverageladder Fireworks Workflow object

Return type

fireworks.Workflow
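
Because the ladder-specific parameters interact, it can help to check them before building the workflow. The sketch below is a hypothetical helper, not part of critcatworks: its checks mirror the parameter descriptions above and are not taken from the package source.

```python
def check_ladder_parameters(d=4, l=2, k=7, initial_direction=1,
                            ranking_metric="similarity", max_iterations=100):
    """Validate coverage-ladder settings and return them as keyword arguments.

    Hypothetical helper; the rules below are assumptions based on the
    parameter descriptions.
    """
    for name, value in (("d", d), ("l", l), ("k", k),
                        ("max_iterations", max_iterations)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    if ranking_metric not in ("similarity", "distance"):
        raise ValueError(f"unknown ranking_metric {ranking_metric!r}")
    return {
        "d": d,
        "l": l,
        "k": k,
        # True forces the first step to add an adsorbate, False to remove one.
        "initial_direction": bool(initial_direction),
        "ranking_metric": ranking_metric,
        "max_iterations": max_iterations,
    }


ladder_kwargs = check_ladder_parameters(d=4, l=2, k=7)
```

The returned dictionary can be merged with the common arguments (template_path, username, password and so on) and passed to get_coverage_ladder_workflow.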

critcatworks.workflows.molsinglesites module

critcatworks.workflows.molsinglesites.get_molsinglesites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate={}, chunk_size=100, max_calculations=10000, adsite_types=['top', 'bridge', 'hollow'], threshold=0.1, n_max_restarts=1, skip_dft=False, extdb_connect={})[source]

Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their structural local dissimilarity. The adsorption energy is determined by a simulation code (e.g. CP2K) in chunks in a loop. The adsorption energies of the uncalculated structures are inferred by machine learning. Once the generalization error is low enough, the workflow stops.

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • structures (list) – list of ase.Atoms objects from where the workflow is started.

  • extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • source_path (str) – absolute path on the computing resource to the directory where to read the structures from

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • adsorbate (dict) – adsorbed molecule as atoms dict. Contains an “X” dummy atom which indicates the anchor point to the nanocluster

  • chunk_size (int) – number of calculations to be run simultaneously. A value of -1 means all calculations are run at once.

  • max_calculations (int) – maximum number of iterations in the workflow

  • adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”

  • threshold (float) – convergence criterion for the ML accuracy. Once the generalization error falls below it, the workflow is defused.

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

molsinglesites Fireworks Workflow object

Return type

fireworks.Workflow
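
The ranking by farthest point sampling mentioned in the description can be sketched in pure Python. This is an illustration of the general technique, not the package's implementation: the feature vectors are placeholders for whatever local structural descriptor is used, and Euclidean distance is assumed as the dissimilarity measure.

```python
import math


def farthest_point_sampling(features, n_samples, start=0):
    """Rank points so that each pick is farthest from those already chosen."""
    if not features or n_samples <= 0:
        return []
    n_samples = min(n_samples, len(features))
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(f, features[start]) for f in features]
    min_dist[start] = -1.0  # mark as taken so it is never picked again
    while len(chosen) < n_samples:
        nxt = max(range(len(features)), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [min(d, math.dist(f, features[nxt]))
                    for d, f in zip(min_dist, features)]
        min_dist[nxt] = -1.0
    return chosen


site_features = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
ranking = farthest_point_sampling(site_features, n_samples=3)
```

Computing the sites in this order means the first chunks already cover the most dissimilar local environments, which is what lets a machine-learning model infer the remaining energies.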

critcatworks.workflows.nanoclusters module

critcatworks.workflows.nanoclusters.get_nanoclusters_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, atomic_energies={}, n_max_restarts=1, skip_dft=False, extdb_connect={})[source]

Workflow to relax the structure of a set of nanoclusters using a simulation software (e.g. CP2K). The cohesive energies are calculated and summarized.

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • structures (list) – list of ase.Atoms objects from where the workflow is started.

  • extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • source_path (str) – absolute path on the computing resource to the directory where to read the structures from

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • atomic_energies (dict) – atomic energies used for computing cohesive energies (optional)

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

nanoclusters Fireworks Workflow object

Return type

fireworks.Workflow
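
How atomic_energies enters the cohesive energy can be shown with a small worked example. The sketch assumes the common definition, total energy minus the sum of isolated-atom energies, divided by the number of atoms; the sign convention and normalization used by the workflow itself are not stated here and may differ. The function name and the numbers are illustrative.

```python
from collections import Counter


def cohesive_energy_per_atom(total_energy, symbols, atomic_energies):
    """Cohesive energy per atom: (E_total - sum of isolated-atom energies) / N."""
    if not symbols:
        raise ValueError("symbols must not be empty")
    counts = Counter(symbols)
    missing = sorted(set(counts) - set(atomic_energies))
    if missing:
        raise KeyError(f"no atomic energy given for: {', '.join(missing)}")
    reference = sum(n * atomic_energies[el] for el, n in counts.items())
    return (total_energy - reference) / len(symbols)


# Illustrative numbers only, in arbitrary energy units.
e_coh = cohesive_energy_per_atom(
    total_energy=-7.0,
    symbols=["Pt", "Pt", "Ni"],
    atomic_energies={"Pt": -1.0, "Ni": -2.0},
)
```

With this convention a negative value indicates that the cluster is bound relative to its isolated atoms.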

critcatworks.workflows.simpleflow module

class critcatworks.workflows.simpleflow.SimpleTestTask(*args, **kwargs)[source]

Bases: fireworks.core.firework.FiretaskBase

Simple FireTask to see what attributes FireTasks have.

optional_params = []
required_params = []
run_task(fw_spec)[source]

This method gets called when the Firetask is run. It can take in a Firework spec, perform some task using that data, and then return an output in the form of a FWAction.

Parameters

fw_spec (dict) – A Firework spec. This comes from the master spec. In addition, this spec contains a special “_fw_env” key that contains the env settings of the FWorker calling this method. This provides for abstracting out certain commands or settings. For example, “foo” may be named “foo1” in resource 1 and “foo2” in resource 2. The FWorker env can specify { “foo”: “foo1”}, which maps an abstract variable “foo” to the relevant “foo1” or “foo2”. You can then write a task that uses fw_spec[“_fw_env”][“foo”] that will work across all these multiple resources.

Returns

(FWAction)
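
The “_fw_env” mechanism described above amounts to a dictionary lookup. The sketch below reproduces the “foo” example with plain dictionaries so that it runs without FireWorks installed; resolve_from_env is a hypothetical helper, and inside a real Firetask the same lookup would be done on the fw_spec passed to run_task.

```python
def resolve_from_env(fw_spec, name, default=None):
    """Look up an abstract name in the FWorker environment stored in fw_spec."""
    return fw_spec.get("_fw_env", {}).get(name, default)


# The same task sees a different concrete value on each resource.
spec_resource_1 = {"_fw_env": {"foo": "foo1"}}
spec_resource_2 = {"_fw_env": {"foo": "foo2"}}

command_1 = resolve_from_env(spec_resource_1, "foo")
command_2 = resolve_from_env(spec_resource_2, "foo")
fallback = resolve_from_env({}, "foo", default="foo")
```

Using a default keeps the task usable on workers whose environment does not define the key.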

critcatworks.workflows.simpleflow.dummy_workflow()[source]

dummy fireworks Workflow

critcatworks.workflows.simpleflow.test_foreachtask_workflow()[source]

Workflow to test fireworks ForeachTask with a dummy workflow

critcatworks.workflows.singlesites module

critcatworks.workflows.singlesites.get_singlesites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate_name='H', chunk_size=100, max_calculations=10000, adsite_types=['top', 'bridge', 'hollow'], threshold=0.1, n_max_restarts=1, skip_dft=False, extdb_connect={})[source]

Attention! This workflow is for atomic adsorbates only! Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their structural local dissimilarity. The adsorption energy is determined by a simulation code (e.g. CP2K) in chunks in a loop. The adsorption energies of the uncalculated structures are inferred by machine learning. Once the generalization error is low enough, the workflow stops.

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • structures (list) – list of ase.Atoms objects from where the workflow is started.

  • extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • source_path (str) – absolute path on the computing resource to the directory where to read the structures from

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • adsorbate_name (str) – element symbol of the adsorbed atom

  • chunk_size (int) – number of calculations to be run simultaneously. A value of -1 means all calculations are run at once.

  • max_calculations (int) – maximum number of iterations in the workflow

  • adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”

  • threshold (float) – convergence criterion for the ML accuracy. Once the generalization error falls below it, the workflow is defused.

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

singlesites Fireworks Workflow object

Return type

fireworks.Workflow
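
The interplay of chunk_size, max_calculations and threshold can be sketched as a plain loop. This is a simplified model of the control flow, not the workflow's code: compute_energy and estimate_error are hypothetical callables standing in for the simulation step and the machine-learning step, max_calculations is treated as a cap on the number of computed sites, and the site identifiers are assumed to be unique and already ranked.

```python
def run_in_chunks(ranked_ids, chunk_size, max_calculations, threshold,
                  compute_energy, estimate_error):
    """Compute energies chunk by chunk until the estimated error is low enough."""
    if chunk_size == -1:
        chunk_size = max(len(ranked_ids), 1)  # run everything at once
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive or -1")
    energies = {}
    limit = min(max_calculations, len(ranked_ids))
    while len(energies) < limit:
        start = len(energies)
        for site_id in ranked_ids[start:min(start + chunk_size, limit)]:
            energies[site_id] = compute_energy(site_id)
        if estimate_error(energies) < threshold:
            break  # accurate enough; the remaining sites are left to the model
    return energies


# Toy stand-ins: the "error" shrinks as more sites are computed.
computed = run_in_chunks(
    ranked_ids=list(range(10)),
    chunk_size=3,
    max_calculations=10000,
    threshold=0.2,
    compute_energy=lambda site_id: float(site_id),
    estimate_error=lambda energies: 1.0 / len(energies),
)
```

In this toy run the loop stops after two chunks, so only six of the ten sites are computed explicitly.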

critcatworks.workflows.uniquemolsites module

critcatworks.workflows.uniquemolsites.get_uniquemolsites_workflow(template_path, username, password, worker_target_path=None, structures=None, extdb_ids=None, source_path=None, reference_energy=0.0, adsorbate={}, adsite_types=['top', 'bridge', 'hollow'], threshold=0.001, n_max_restarts=1, skip_dft=False, extdb_connect={})[source]

Workflow to determine the adsorption sites and energies of a set of nanocluster structures. The adsorption sites are determined by the Python package cluskit and then ranked by farthest point sampling based on their structural local dissimilarity. Only sites which are more dissimilar than the given threshold are computed. The adsorption energy is determined by a simulation code (e.g. CP2K).

Parameters
  • template_path (str) – absolute path to input file for calculations. It works as a template which is later modified by the simulation-specific Firework.

  • username (str) – user who executed the workflow

  • password (str) – password for user to upload to the database

  • worker_target_path (str) – absolute path on the computing resource; the directory needs to exist

  • structures (list) – list of ase.Atoms objects from where the workflow is started.

  • extdb_ids (list) – unique identifiers of the simulations collection which are used to start the workflow

  • source_path (str) – absolute path on the computing resource to the directory where to read the structures from

  • reference_energy (float) – reference energy for the adsorbate. Can be the total energy of the isolated adsorbate molecule or a different reference point

  • adsorbate (dict) – adsorbed molecule as atoms dict. Contains an “X” dummy atom which indicates the anchor point to the nanocluster

  • adsite_types (list) – adsorption site types, can contain any combination of “top”, “bridge”, “hollow”

  • threshold (float) – threshold of similarity metric between the local structures of the adsorption sites. Only sites which are more dissimilar than the given threshold are computed

  • n_max_restarts (int) – number of times the calculation is restarted upon failure

  • skip_dft (bool) – If set to true, the simulation step is skipped in all following simulation runs. Instead the structure is returned unchanged.

  • extdb_connect (dict) – dictionary containing the keys host, username, password, authsource and db_name.

Returns

uniquemolsites Fireworks Workflow object

Return type

fireworks.Workflow
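
The effect of threshold can be illustrated with a greedy filter that keeps a site only if it differs from every site kept so far by more than the threshold. This is a simplified stand-in for the selection described above: the feature vectors are placeholders for the local structural descriptor, Euclidean distance is assumed as the dissimilarity measure, and the function name is hypothetical.

```python
import math


def select_unique_sites(features, threshold=0.001):
    """Keep a site only if it is farther than threshold from every kept site."""
    kept = []
    for idx, feat in enumerate(features):
        if all(math.dist(feat, features[j]) > threshold for j in kept):
            kept.append(idx)
    return kept


site_features = [(0.0, 0.0), (0.0005, 0.0), (1.0, 0.0), (1.0, 0.0002)]
unique_sites = select_unique_sites(site_features, threshold=0.001)
```

A smaller threshold keeps more sites and therefore triggers more calculations; a larger one treats more sites as equivalent.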

Module contents