Atom-centered Symmetry Functions¶
Atom-centered Symmetry Functions (ACSFs) [1] can be used to represent the local environment near an atom by using a fingerprint composed of the output of multiple two- and three-body functions that can be customized to detect specific structural features.
Notice that the DScribe output for ACSF does not include the species of the central atom in any way. If the chemical identify of the central species is important, you may want to consider a custom output stratification scheme based on the chemical identity of the central species, or the inclusion of the species identify in an additional feature. Training a separate model for each central species is also possible.
The ACSF output is stratified by the species of the neighbouring atoms. In pseudo-code the ordering of the output vector is as follows:
for Z in atomic numbers in increasing order:
append value for G1 to output
for G2 in g2_params:
append value for G2 to output
for G3 in g3_params:
append value for G3 to output
for Z in atomic numbers in increasing order:
for Z' in atomic numbers in increasing order:
if and Z' >= Z:
for G4 in g4_params:
append value for G4 to output
for G5 in g5_params:
append value for G5 to output
Setup¶
Instantiating an ACSF descriptor can be done as follows:
from dscribe.descriptors import ACSF
# Setting up the ACSF descriptor
acsf = ACSF(
species=["H", "O"],
rcut=6.0,
g2_params=[[1, 1], [1, 2], [1, 3]],
g4_params=[[1, 1, 1], [1, 2, 1], [1, 1, -1], [1, 2, -1]],
)
The arguments have the following effect:
-
ACSF.
__init__
(rcut, g2_params=None, g3_params=None, g4_params=None, g5_params=None, species=None, periodic=False, sparse=False)[source]¶ - Parameters
rcut (float) – The smooth cutoff value in angstroms. This cutoff value is used throughout the calculations for all symmetry functions.
g2_params (n*2 np.ndarray) – A list of pairs of \(\eta\) and \(R_s\) parameters for \(G^2\) functions.
g3_params (n*1 np.ndarray) – A list of \(\kappa\) parameters for \(G^3\) functions.
g4_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^4\) functions.
g5_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^5\) functions.
species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Notice that this is not the atomic numbers that are present for an individual system, but should contain all the elements that are ever going to be encountered when creating the descriptors for a set of systems. Keeping the number of chemical species as low as possible is preferable.
periodic (bool) – Determines whether the system is considered to be periodic.
sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.
Creation¶
After ACSF has been set up, it may be used on atomic structures with the
create()
-method.
from ase.build import molecule
water = molecule("H2O")
# Create MBTR output for the hydrogen atom at index 1
acsf_water = acsf.create(water, positions=[1])
print(acsf_water)
print(acsf_water.shape)
The call syntax for the create-function is as follows:
-
ACSF.
create
(system, positions=None, n_jobs=1, verbose=False)[source]¶ Return the ACSF output for the given systems and given positions.
- Parameters
system (
ase.Atoms
or list ofase.Atoms
) – One or many atomic structures.positions (list) – Positions where to calculate ACSF. Can be provided as cartesian positions or atomic indices. If no positions are defined, the SOAP output will be created for all atoms in the system. When calculating SOAP for multiple systems, provide the positions as a list for each system.
n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.
verbose (bool) – Controls whether to print the progress of each job into to the console.
- Returns
The ACSF output for the given systems and positions. The return type depends on the ‘sparse’-attribute. The first dimension is determined by the amount of positions and systems and the second dimension is determined by the get_number_of_features()-function. When multiple systems are provided the results are ordered by the input order of systems and their positions.
- Return type
np.ndarray | scipy.sparse.csr_matrix
The output will in this case be a numpy array with shape [#positions,
#features]. The number of features may be requested beforehand with the
get_number_of_features()
-method.
- 1
Jörg Behler. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys., 134(7):074106, 2011.