Atom-centered Symmetry Functions

Atom-centered Symmetry Functions (ACSFs) [1] can be used to represent the local environment near an atom by using a fingerprint composed of the output of multiple two- and three-body functions that can be customized to detect specific structural features.

Note that the DScribe output for ACSF does not encode the species of the central atom in any way. If the chemical identity of the central species is important, consider a custom output stratification scheme based on the chemical identity of the central species, or include the species identity as an additional feature. Training a separate model for each central species is also possible.
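One way to include the central species as an additional feature is to append a one-hot encoding to each output row. The following is a minimal sketch, not part of DScribe: the helper name append_central_species and the placeholder feature values are hypothetical, and the feature rows stand in for real ACSF output.

```python
import numpy as np


def append_central_species(features, central_numbers, species_numbers):
    """Append a one-hot encoding of each central atom's species to its row.

    Hypothetical helper: `features` has one row per central atom,
    `central_numbers` gives the atomic number of each central atom, and
    `species_numbers` lists every atomic number that can occur.
    """
    species_numbers = sorted(species_numbers)
    column = {z: i for i, z in enumerate(species_numbers)}
    one_hot = np.zeros((len(central_numbers), len(species_numbers)))
    for row, z in enumerate(central_numbers):
        one_hot[row, column[z]] = 1.0
    return np.hstack([np.asarray(features, dtype=float), one_hot])


# Placeholder rows standing in for ACSF output (2 centers, 3 features each)
features = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
augmented = append_central_species(
    features, central_numbers=[1, 8], species_numbers=[1, 8]
)
print(augmented.shape)
```

The extra columns are ordered by increasing atomic number, so the same species list must be used for every system in a data set.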

The ACSF output is stratified by the species of the neighbouring atoms. In pseudo-code the ordering of the output vector is as follows:

for Z in atomic numbers in increasing order:
    append value for G1 to output
    for G2 in g2_params:
        append value for G2 to output
    for G3 in g3_params:
        append value for G3 to output
for Z in atomic numbers in increasing order:
    for Z' in atomic numbers in increasing order:
        if Z' >= Z:
            for G4 in g4_params:
                append value for G4 to output
            for G5 in g5_params:
                append value for G5 to output
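The ordering above can be mirrored in plain Python to label each column of the output vector. This is a sketch that follows the pseudo-code only; the function acsf_feature_labels and its label tuples are hypothetical and are not provided by DScribe.

```python
from itertools import combinations_with_replacement


def acsf_feature_labels(atomic_numbers, n_g2=0, n_g3=0, n_g4=0, n_g5=0):
    """Return one label per output column, in the order of the pseudo-code."""
    zs = sorted(set(atomic_numbers))
    labels = []
    # Two-body part: one block per neighbour species Z
    for z in zs:
        labels.append(("G1", z))
        labels += [("G2", z, i) for i in range(n_g2)]
        labels += [("G3", z, i) for i in range(n_g3)]
    # Three-body part: one block per species pair (Z, Z') with Z' >= Z
    for z, z2 in combinations_with_replacement(zs, 2):
        labels += [("G4", z, z2, i) for i in range(n_g4)]
        labels += [("G5", z, z2, i) for i in range(n_g5)]
    return labels


# H and O neighbours with three G2 and four G4 parameter sets
labels = acsf_feature_labels([1, 8], n_g2=3, n_g4=4)
print(len(labels))  # 2 * (1 + 3) + 3 * 4 = 20
print(labels[:5])
```

For an actual descriptor object, the authoritative feature count comes from get_number_of_features().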

Setup

Instantiating an ACSF descriptor can be done as follows:

from dscribe.descriptors import ACSF

# Setting up the ACSF descriptor
acsf = ACSF(
    species=["H", "O"],
    rcut=6.0,
    g2_params=[[1, 1], [1, 2], [1, 3]],
    g4_params=[[1, 1, 1], [1, 2, 1], [1, 1, -1], [1, 2, -1]],
)

The arguments have the following effect:

ACSF.__init__(rcut, g2_params=None, g3_params=None, g4_params=None, g5_params=None, species=None, periodic=False, sparse=False)
Parameters
  • rcut (float) – The smooth cutoff value in angstroms. This cutoff value is used throughout the calculations for all symmetry functions.

  • g2_params (n*2 np.ndarray) – A list of pairs of \(\eta\) and \(R_s\) parameters for \(G^2\) functions.

  • g3_params (n*1 np.ndarray) – A list of \(\kappa\) parameters for \(G^3\) functions.

  • g4_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^4\) functions.

  • g5_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^5\) functions.

  • species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Note that this is not the set of species present in an individual system: it should contain all the elements that will ever be encountered when creating descriptors for a set of systems. Keeping the number of chemical species as low as possible is preferable.

  • periodic (bool) – Determines whether the system is considered to be periodic.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.
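To illustrate what the rcut, \(\eta\), \(R_s\) and \(\kappa\) parameters control, the two-body functions can be sketched in plain Python. The sketch assumes the functional forms given in reference [1] (a cosine cutoff, a plain sum for \(G^1\), a Gaussian for \(G^2\) and a cosine for \(G^3\)); DScribe's own implementation may differ in details, and the neighbour distances below are made-up values.

```python
import math


def cutoff(r, rcut):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = rcut."""
    if r > rcut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rcut) + 1.0)


def g1(distances, rcut):
    """Sum of cutoff values over all neighbour distances."""
    return sum(cutoff(r, rcut) for r in distances)


def g2(distances, rcut, eta, r_s):
    """Gaussian of width set by eta, centred at r_s, damped by the cutoff."""
    return sum(
        math.exp(-eta * (r - r_s) ** 2) * cutoff(r, rcut) for r in distances
    )


def g3(distances, rcut, kappa):
    """Cosine with wave number kappa, damped by the cutoff."""
    return sum(math.cos(kappa * r) * cutoff(r, rcut) for r in distances)


# Hypothetical neighbour distances in angstroms (not taken from a real system)
distances = [0.96, 1.52]
print(g1(distances, rcut=6.0))
print(g2(distances, rcut=6.0, eta=1.0, r_s=1.0))
print(g3(distances, rcut=6.0, kappa=1.0))
```

With \(\eta = 0\) or \(\kappa = 0\) these expressions reduce to \(G^1\), which is a quick sanity check when choosing parameter values.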

Creation

After ACSF has been set up, it may be used on atomic structures with the create()-method.

from ase.build import molecule

water = molecule("H2O")

# Create ACSF output for the hydrogen atom at index 1
acsf_water = acsf.create(water, positions=[1])

print(acsf_water)
print(acsf_water.shape)

The call syntax for the create-function is as follows:

ACSF.create(system, positions=None, n_jobs=1, verbose=False)

Return the ACSF output for the given systems and given positions.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • positions (list) – Positions where to calculate ACSF. Can be provided as cartesian positions or atomic indices. If no positions are defined, the ACSF output will be created for all atoms in the system. When calculating ACSF for multiple systems, provide the positions as a list for each system.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallelizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job to the console.

Returns

The ACSF output for the given systems and positions. The return type depends on the 'sparse' attribute. The first dimension is determined by the number of positions and systems, and the second dimension is determined by the get_number_of_features()-function. When multiple systems are provided, the results are ordered by the input order of systems and their positions.

Return type

np.ndarray | scipy.sparse.csr_matrix

The output will in this case be a numpy array with shape [#positions, #features]. The number of features may be requested beforehand with the get_number_of_features()-method.
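When several systems are passed at once, it helps to keep track of which output row belongs to which system and position. The sketch below assumes the rows are concatenated in the input order of systems and positions, as described in the Returns section; the helper row_labels is hypothetical and does not call DScribe.

```python
def row_labels(positions_per_system):
    """Return the (system index, position) pair for each output row."""
    rows = []
    for i_system, positions in enumerate(positions_per_system):
        for position in positions:
            rows.append((i_system, position))
    return rows


# Two systems: atom 1 of the first system, atoms 0 and 2 of the second
rows = row_labels([[1], [0, 2]])
print(rows)
```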

[1] Jörg Behler. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys., 134(7):074106, 2011.