dscribe.descriptors package

Submodules

dscribe.descriptors.acsf module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.acsf.ACSF(rcut, g2_params=None, g3_params=None, g4_params=None, g5_params=None, species=None, periodic=False, sparse=False)[source]

Bases: dscribe.descriptors.descriptor.Descriptor

Implementation of Atom-Centered Symmetry Functions. Currently valid for finite systems only.

Notice that the species of the central atom is not encoded in the output, only the surrounding environment is encoded. In a typical application one can train a different model for each central species.

For reference, see:

“Atom-centered symmetry functions for constructing high-dimensional neural network potentials”, Jörg Behler, The Journal of Chemical Physics, 134, 074106 (2011), https://doi.org/10.1063/1.3553717

Parameters
  • rcut (float) – The smooth cutoff value in angstroms. This cutoff value is used throughout the calculations for all symmetry functions.

  • g2_params (n*2 np.ndarray) – A list of pairs of \(\eta\) and \(R_s\) parameters for \(G^2\) functions.

  • g3_params (n*1 np.ndarray) – A list of \(\kappa\) parameters for \(G^3\) functions.

  • g4_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^4\) functions.

  • g5_params (n*3 np.ndarray) – A list of triplets of \(\eta\), \(\zeta\) and \(\lambda\) parameters for \(G^5\) functions.

  • species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Notice that this is not the atomic numbers that are present for an individual system, but should contain all the elements that are ever going to be encountered when creating the descriptors for a set of systems. Keeping the number of chemical species as low as possible is preferable.

  • periodic (bool) – Determines whether the system is considered to be periodic.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, positions=None, n_jobs=1, verbose=False)[source]

Return the ACSF output for the given systems and given positions.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • positions (list) – Positions where to calculate ACSF. Can be provided as cartesian positions or atomic indices. If no positions are defined, the SOAP output will be created for all atoms in the system. When calculating SOAP for multiple systems, provide the positions as a list for each system.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

The ACSF output for the given systems and positions. The return type depends on the ‘sparse’-attribute. The first dimension is determined by the amount of positions and systems and the second dimension is determined by the get_number_of_features()-function. When multiple systems are provided the results are ordered by the input order of systems and their positions.

Return type

np.ndarray | scipy.sparse.csr_matrix

create_single(system, positions=None)[source]

Creates the descriptor for the given system.

Parameters
  • system (ase.Atoms | System) – Input system.

  • positions (iterable) – Indices of the atoms around which the ACSF will be returned. If no positions defined, ACSF will be created for all atoms in the system.

Returns

The ACSF output for the given system and positions. The return type depends on the ‘sparse’-attribute. The first dimension is given by the number of positions and the second dimension is determined by the get_number_of_features()-function.

Return type

np.ndarray | scipy.sparse.coo_matrix

property g2_params
property g3_params
property g4_params
property g5_params
get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

property rcut
property species

dscribe.descriptors.coulombmatrix module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.coulombmatrix.CoulombMatrix(n_atoms_max, permutation='sorted_l2', sigma=None, seed=None, flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.matrixdescriptor.MatrixDescriptor

Calculates the zero padded Coulomb matrix.

The Coulomb matrix is defined as:

C_ij = 0.5 Zi**exponent | i = j

= (Zi*Zj)/(Ri-Rj) | i != j

The matrix is padded with invisible atoms, which means that the matrix is padded with zeros until the maximum allowed size defined by n_max_atoms is reached.

To reach invariance against permutation of atoms, specify a valid option for the permutation parameter.

For reference, see:

“Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning”, Matthias Rupp, Alexandre Tkatchenko, Klaus-Robert Mueller, and O. Anatole von Lilienfeld, Phys. Rev. Lett, (2012), https://doi.org/10.1103/PhysRevLett.108.058301

and

“Learning Invariant Representations of Molecules for Atomization Energy Prediction”, Gregoire Montavon et. al, Advances in Neural Information Processing Systems 25 (NIPS 2012)

Parameters
  • n_atoms_max (int) – The maximum nuber of atoms that any of the samples can have. This controls how much zeros need to be padded to the final result.

  • permutation (string) –

    Defines the method for handling permutational invariance. Can be one of the following:

    • none: The matrix is returned in the order defined by the Atoms.

    • sorted_l2: The rows and columns are sorted by the L2 norm.

    • eigenspectrum: Only the eigenvalues are returned sorted by their absolute value in descending order.

    • random: The rows and columns are sorted by their L2 norm after applying Gaussian noise to the norms. The standard deviation of the noise is determined by the sigma-parameter.

  • sigma (float) – Provide only when using the random-permutation option. Standard deviation of the gaussian distributed noise determining how much the rows and columns of the randomly sorted matrix are scrambled.

  • seed (int) – Provide only when using the random-permutation option. A seed to use for drawing samples from a normal distribution.

  • flatten (bool) – Whether the output of create() should be flattened to a 1D array.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, n_jobs=1, verbose=False)[source]

Return the Coulomb matrix for the given systems.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

Coulomb matrix for the given systems. The return type depends on the ‘sparse’ and ‘flatten’-attributes. For flattened output a single numpy array or sparse scipy.csr_matrix is returned. The first dimension is determined by the amount of systems.

Return type

np.ndarray | scipy.sparse.csr_matrix

get_matrix(system)[source]

Creates the Coulomb matrix for the given system.

Parameters

system (ase.Atoms | System) – Input system.

Returns

Coulomb matrix as a 2D array.

Return type

np.ndarray

dscribe.descriptors.descriptor module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.descriptor.Descriptor(periodic, flatten, sparse)[source]

Bases: abc.ABC

An abstract base class for all descriptors.

Parameters

flatten (bool) – Whether the output of create() should be flattened to a 1D array.

check_atomic_numbers(atomic_numbers)[source]

Used to check that the given atomic numbers have been defined for this descriptor.

Parameters

species (iterable) – Atomic numbers to check.

Raises
  • ValueError – If the atomic numbers in the given system are not

  • included in the species given to this descriptor.

abstract create(system, *args, **kwargs)[source]

Creates the descriptor for the given systems.

Parameters
  • system (ase.Atoms) – The system for which to create the descriptor.

  • args – Descriptor specific positional arguments.

  • kwargs – Descriptor specific keyword arguments.

Returns

A descriptor for the system.

Return type

np.array | scipy.sparse.coo_matrix

create_parallel(inp, func, n_jobs, output_sizes=None, verbose=False, prefer='processes')[source]

Used to parallelize the descriptor creation across multiple systems.

Parameters
  • inp (list) – Contains a tuple of input arguments for each processed system. These arguments are fed to the function specified by “func”.

  • func (function) – Function that outputs the descriptor when given input arguments from “inp”.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • output_sizes (list of ints) – The size of the output for each job. Makes the creation faster by preallocating the correct amount of memory beforehand. If not specified, a dynamically created list of outputs is used.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

  • backend (str) –

    The parallelization method. Valid options are:

    • ”processes”: Parallelization based on processes. Uses the “loky” backend in joblib to serialize the jobs and run them in separate processes. Using separate processes has a bigger memory and initialization overhead than threads, but may provide better scalability if perfomance is limited by the Global Interpreter Lock (GIL).

    • ”threads”: Parallelization based on threads. Has bery low memory and initialization overhead. Performance is limited by the amount of pure python code that needs to run. Ideal when most of the calculation time is used by C/C++ extensions that release the GIL.

Returns

The descriptor output for each given input. The return type depends on the desciptor setup.

Return type

np.ndarray | scipy.sparse.csr_matrix | list

property flatten
abstract get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

get_system(system)[source]

Used to convert the given atomic system into a custom System-object that is used internally. The System class inherits from ase.Atoms, but includes built-in caching for geometric quantities that may be re-used by the descriptors.

Parameters

system (ase.Atoms | System) – Input system.

Returns

The given system transformed into a corresponding

System-object.

Return type

System

property periodic
property sparse

dscribe.descriptors.elementaldistribution module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.elementaldistribution.ElementalDistribution(properties, flatten=True, sparse=True)[source]

Bases: dscribe.descriptors.descriptor.Descriptor

Represents a generic distribution on any given grid for any given properties. Can create both continuos and discrete distributions.

Continuous distributions require a standard deviation and the number of sampling points. You can also specify the minimum and maximum values for the axis. If these are not specified, a limit is selected based automatically on the values with:

min = values.min() - 3*std max = values.max() + 3*std

Discrete distributions are assumed to be integer values, and you only need to specify the values.

Parameters
  • properties (dict) –

    Contains a description of the elemental property for which a distribution is created. Should contain a dictionary of the following form:

    properties={
    “property_name”: {

    “type”: “continuous” “min”: <Distribution minimum value> “max”: <Distribution maximum value> “std”: <Distribution standard deviation> “n”: <Number of discrete samples from distribution> “values”: {

    ”H”: <Value for hydrogen> …

    }

    ”property_name2”: {

    “type”: “discrete” “values”: {

    ”H”: <Value for hydrogen> …

    }

    }

  • flatten (bool) – Whether to flatten out the result.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system)[source]
Parameters

system (ase.Atoms | System) – Input system.

Returns

The concatenated distributions of the

specified properties in a sparse array.

Return type

scipy.sparse.lil_matrix

gaussian_sum(centers, weights, minimum, maximum, std, n)[source]

Calculates a discrete version of a sum of Gaussian distributions.

The calculation is done through the cumulative distribution function that is better at keeping the integral of the probability function constant with coarser grids.

The values are normalized by dividing with the maximum value of a gaussian with the given standard deviation.

Parameters
  • centers (1D np.ndarray) – The means of the gaussians.

  • weights (1D np.ndarray) – The weights for the gaussians.

  • minimum (float) – The minimum grid value

  • maximum (float) – The maximum grid value

  • std (float) – Standard deviation of the gaussian

  • n (int) – Number of grid points

  • settings (dict) – The grid settings. A dictionary containing the following information:

Returns

Value of the gaussian sums on the given grid.

get_axis(property_name)[source]

Used to return the used x-axis for the given property.

Parameters
  • property_name (str) – The property name that was used in the

  • constructor.

Returns

An array of x-axis values.

Return type

np.ndarray

get_element_occurrence(system)[source]

Calculate the count of each atomic element in the given system.

Parameters

system (ase.Atoms) – The atomic system.

Returns

The counts for each element in a list where the index of atomic number x is self.atomic_number_to_index[x]

Return type

1D ndarray

get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

dscribe.descriptors.ewaldsummatrix module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.ewaldsummatrix.EwaldSumMatrix(n_atoms_max, permutation='sorted_l2', sigma=None, seed=None, flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.matrixdescriptor.MatrixDescriptor

Calculates an Ewald sum matrix for the a given system.

Each entry M_ij of the Ewald sum matrix will contain the Coulomb energy between atoms i and j calculated with the Ewald summation method. In the Ewald method a constant neutralizing background charge has been added to counteract the positive net charge.

The total electrostatic interaction energy in the system can calculated by summing the upper diagonal part of the matrix, including the diagonal itself.

A screening parameter a controls the width of the Gaussian charge distributions in the Ewald summation, but the final matrix elements will be independent of the value of the screening parameter a that is used, as long as sufficient cutoff values are used.

This implementation provides default values for

For reference, see:

“Crystal Structure Representations for Machine Learning Models of Formation Energies”, Felix Faber, Alexander Lindmaa, Anatole von Lilienfeld, and Rickard Armiento, International Journal of Quantum Chemistry, (2015), https://doi.org/10.1002/qua.24917

and

“Ewald summation techniques in perspective: a survey”, Abdulnour Y. Toukmaji, John A. Board Jr., Computer Physics Communications, (1996) https://doi.org/10.1016/0010-4655(96)00016-1

and

“R.A. Jackson and C.R.A. Catlow. Computer simulation studies of zeolite structure. Mol. Simul., 1:207-224, 1988, https://doi.org/10.1080/08927022.2013.840898

Parameters
  • n_atoms_max (int) – The maximum nuber of atoms that any of the samples can have. This controls how much zeros need to be padded to the final result.

  • permutation (string) –

    Defines the method for handling permutational invariance. Can be one of the following:

    • none: The matrix is returned in the order defined by the Atoms.

    • sorted_l2: The rows and columns are sorted by the L2 norm.

    • eigenspectrum: Only the eigenvalues are returned sorted by their absolute value in descending order.

    • random: The rows and columns are sorted by their L2 norm after applying Gaussian noise to the norms. The standard deviation of the noise is determined by the sigma-parameter.

  • sigma (float) – Provide only when using the random-permutation option. Standard deviation of the gaussian distributed noise determining how much the rows and columns of the randomly sorted matrix are scrambled.

  • seed (int) – Provide only when using the random-permutation option. A seed to use for drawing samples from a normal distribution.

  • flatten (bool) – Whether the output of create() should be flattened to a 1D array.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, accuracy=1e-05, w=1, rcut=None, gcut=None, a=None, n_jobs=1, verbose=False)[source]

Return the Coulomb matrix for the given systems.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • accuracy (float) – The accuracy to which the sum is converged to. Corresponds to the variable \(A\) in https://doi.org/10.1080/08927022.2013.840898. Used only if gcut, rcut and a have not been specified. Provide either one value or a list of values for each system.

  • w (float) – Weight parameter that represents the relative computational expense of calculating a term in real and reciprocal space. This has little effect on the total energy, but may influence speed of computation in large systems. Note that this parameter is used only when the cutoffs and a are set to None. Provide either one value or a list of values for each system.

  • rcut (float) – Real space cutoff radius dictating how many terms are used in the real space sum. Provide either one value or a list of values for each system.

  • gcut (float) – Reciprocal space cutoff radius. Provide either one value or a list of values for each system.

  • a (float) – The screening parameter that controls the width of the Gaussians. If not provided, a default value of \(\alpha = \sqrt{\pi}\left(\frac{N}{V^2}\right)^{1/6}\) is used. Corresponds to the standard deviation of the Gaussians. Provide either one value or a list of values for each system.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

Ewald sum matrix for the given systems. The return type depends on the ‘sparse’ and ‘flatten’-attributes. For flattened output a single numpy array or sparse scipy.csr_matrix is returned. The first dimension is determined by the amount of systems.

Return type

np.ndarray | scipy.sparse.csr_matrix

create_single(system, accuracy=1e-05, w=1, rcut=None, gcut=None, a=None)[source]
Parameters
  • system (ase.Atoms | System) – Input system.

  • accuracy (float) – The accuracy to which the sum is converged to. Corresponds to the variable \(A\) in https://doi.org/10.1080/08927022.2013.840898. Used only if gcut, rcut and a have not been specified.

  • w (float) – Weight parameter that represents the relative computational expense of calculating a term in real and reciprocal space. This has little effect on the total energy, but may influence speed of computation in large systems. Note that this parameter is used only when the cutoffs and a are set to None.

  • rcut (float) – Real space cutoff radius dictating how many terms are used in the real space sum.

  • gcut (float) – Reciprocal space cutoff radius.

  • a (float) – The screening parameter that controls the width of the Gaussians. If not provided, a default value of \(\alpha = \sqrt{\pi}\left(\frac{N}{V^2}\right)^{1/6}\) is used. Corresponds to the standard deviation of the Gaussians.

get_matrix(system)[source]

The total energy matrix. Each matrix element (i, j) corresponds to the total interaction energy in a system with atoms i and j.

Parameters

system (ase.Atoms | System) – Input system.

Returns

Ewald matrix.

Return type

np.ndarray

dscribe.descriptors.lmbtr module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.lmbtr.LMBTR(species, periodic, k2=None, k3=None, normalize_gaussians=True, normalization='none', flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.mbtr.MBTR

Implementation of local – per chosen atom – kind of the Many-body tensor representation up to k=3.

Notice that the species of the central atom is not encoded in the output, but is instead represented by a chemical species X with atomic number 0. This allows LMBTR to be also used on general positions not corresponding to real atoms. The surrounding environment is encoded by the two- and three-body interactions with neighouring atoms. If there is a need to distinguish the central species, one can for example train a different model for each central species.

You can choose which terms to include by providing a dictionary in the k2 or k3 arguments. The k1 term is not used in the local version. This dictionary should contain information under three keys: “geometry”, “grid” and “weighting”. See the examples below for how to format these dictionaries.

You can use this descriptor for finite and periodic systems. When dealing with periodic systems or when using machine learning models that use the Euclidean norm to measure distance between vectors, it is advisable to use some form of normalization.

For the geometry functions the following choices are available:

  • \(k=2\):

    • “distance”: Pairwise distance in angstroms.

    • “inverse_distance”: Pairwise inverse distance in 1/angstrom.

  • \(k=3\):

    • “angle”: Angle in degrees.

    • “cosine”: Cosine of the angle.

For the weighting the following functions are available:

  • \(k=2\):

    • “unity”: No weighting.

    • “exp” or “exponential”: Weighting of the form \(e^{-sx}\)

  • \(k=3\):

    • “unity”: No weighting.

    • “exp” or “exponential”: Weighting of the form \(e^{-sx}\)

The exponential weighting is motivated by the exponential decay of screened Coulombic interactions in solids. In the exponential weighting the parameters cutoff determines the value of the weighting function after which the rest of the terms will be ignored and the parameter scale corresponds to \(s\). The meaning of \(x\) changes for different terms as follows:

  • \(k=2\): \(x\) = Distance between A->B

  • \(k=3\): \(x\) = Distance from A->B->C->A.

In the grid setup min is the minimum value of the axis, max is the maximum value of the axis, sigma is the standard deviation of the gaussian broadening and n is the number of points sampled on the grid.

If flatten=False, a list of dense np.ndarrays for each k in ascending order is returned. These arrays are of dimension (n_elements x n_elements x n_grid_points), where the elements are sorted in ascending order by their atomic number.

If flatten=True, a scipy.sparse.coo_matrix is returned. This sparse matrix is of size (1, n_features), where n_features is given by get_number_of_features(). This vector is ordered so that the different k-terms are ordered in ascending order, and within each k-term the distributions at each entry (i, j, h) of the tensor are ordered in an ascending order by (i * n_elements) + (j * n_elements) + (h * n_elements).

This implementation does not support the use of a non-identity correlation matrix.

Parameters
  • species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Notice that this is not the atomic numbers that are present for an individual system, but should contain all the elements that are ever going to be encountered when creating the descriptors for a set of systems. Keeping the number of chemical speices as low as possible is preferable.

  • periodic (bool) – Determines whether the system is considered to be periodic.

  • k2 (dict) –

    Dictionary containing the setup for the k=2 term. Contains setup for the used geometry function, discretization and weighting function. For example:

    k2 = {
        "geometry": {"function": "inverse_distance"},
        "grid": {"min": 0.1, "max": 2, "sigma": 0.1, "n": 50},
        "weighting": {"function": "exp", "scale": 0.75, "cutoff": 1e-2}
    }
    

  • k3 (dict) –

    Dictionary containing the setup for the k=3 term. Contains setup for the used geometry function, discretization and weighting function. For example:

    k3 = {
        "geometry": {"function": "angle"},
        "grid": {"min": 0, "max": 180, "sigma": 5, "n": 50},
        "weighting" = {"function": "exp", "scale": 0.5, "cutoff": 1e-3}
    }
    

  • normalize_gaussians (bool) – Determines whether the gaussians are normalized to an area of 1. Defaults to True. If False, the normalization factor is dropped and the gaussians have the form. \(e^{-(x-\mu)^2/2\sigma^2}\)

  • normalization (str) –

    Determines the method for normalizing the output. The available options are:

    • ”none”: No normalization.

    • ”l2_each”: Normalize the Euclidean length of each k-term individually to unity.

  • flatten (bool) – Whether the output should be flattened to a 1D array. If False, a dictionary of the different tensors is provided, containing the values under keys: “k1”, “k2”, and “k3”:

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, positions=None, n_jobs=1, verbose=False)[source]

Return the LMBTR output for the given systems and given positions.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • positions (list) – Positions where to calculate LMBTR. Can be provided as cartesian positions or atomic indices. If no positions are defined, the LMBTR output will be created for all atoms in the system. When calculating LMBTR for multiple systems, provide the positions as a list for each system.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

The LMBTR output for the given systems and positions. The return type depends on the ‘sparse’-attribute. The first dimension is determined by the amount of positions and systems and the second dimension is determined by the get_number_of_features()-function.

Return type

np.ndarray | scipy.sparse.csr_matrix

create_single(system, positions)[source]

Return the local many-body tensor representation for the given system and positions.

Parameters
  • system (ase.Atoms | System) – Input system.

  • positions (iterable) – Positions or atom index of points, from which local_mbtr is created. Can be a list of integer numbers or a list of xyz-coordinates. If integers provided, the atoms at that index are used as centers. If positions provided, new atoms are added at that position.

Returns

The local many-body tensor representations of given positions, for k terms, as an array. These are ordered as given in positions.

Return type

1D ndarray

get_location(species)[source]

Can be used to query the location of a species combination in the the flattened output.

Parameters
  • species (tuple) – A tuple containing a species combination as

  • symbols or atomic numbers. The central atom is marked as (chemical) –

  • "X". The tuple can be for example (species) –

  • "H")

Returns

slice containing the location of the specified species combination. The location is given as a python slice-object, that can be directly used to target ranges in the output.

Return type

slice

Raises
  • ValueError – If the requested species combination is not in the

  • output or if invalid species defined.

get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

The number of features for the LMBTR is calculated as follows:

For the pair term (k=2), only pairs where at least one of the atom is the central atom (in periodic systems the central atom may connect to itself) are considered. This means that there are only as many combinations as there are different elements to pair the central atom with (n_elem). This nmber of combinations is the multiplied by the discretization of the k=2 grid.

For the three-body term (k=3), only triplets where at least one of the atoms is the central atom (in periodic systems the central atom may connect to itself) and the k >= i (symmetry) are considered. This means that as k runs from 0 to n-1, where n is the number of elements, there are n + k combinations that fill this rule. This sum becomes: \(\sum_{k=0}^{n-1} n + k = n^2+(n-1)*n/2\). This number of combinations is the multiplied by the discretization of the k=3 grid.

Returns

Number of features for this descriptor.

Return type

int

property normalization
property species

dscribe.descriptors.matrixdescriptor module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.matrixdescriptor.MatrixDescriptor(n_atoms_max, permutation='sorted_l2', sigma=None, seed=None, flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.descriptor.Descriptor

A common base class for two-body matrix-like descriptors.

Parameters
  • n_atoms_max (int) – The maximum nuber of atoms that any of the samples can have. This controls how much zeros need to be padded to the final result.

  • permutation (string) –

    Defines the method for handling permutational invariance. Can be one of the following:

    • none: The matrix is returned in the order defined by the Atoms.

    • sorted_l2: The rows and columns are sorted by the L2 norm.

    • eigenspectrum: Only the eigenvalues are returned sorted by their absolute value in descending order.

    • random: The rows and columns are sorted by their L2 norm after applying Gaussian noise to the norms. The standard deviation of the noise is determined by the sigma-parameter.

  • sigma (float) – Provide only when using the random-permutation option. Standard deviation of the gaussian distributed noise determining how much the rows and columns of the randomly sorted matrix are scrambled.

  • seed (int) – Provide only when using the random-permutation option. A seed to use for drawing samples from a normal distribution.

  • flatten (bool) – Whether the output of create() should be flattened to a 1D array.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create_single(system)[source]
Parameters

system (ase.Atoms | System) – Input system.

Returns

The zero padded matrix either as a 2D array or as

a 1D array depending on the setting self._flatten.

Return type

ndarray

get_eigenspectrum(matrix)[source]

Calculates the eigenvalues of the matrix and returns a list of them sorted by their descending absolute value.

Parameters

matrix (np.ndarray) – The matrix to sort.

Returns

A list of eigenvalues sorted by absolute value.

Return type

np.ndarray

abstract get_matrix(system)[source]

Used to get the final matrix for this descriptor.

Parameters

system (ase.Atoms | System) – Input system.

Returns

The final two-dimensional matrix for this descriptor.

Return type

np.ndarray

get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

sort(matrix)[source]

Sorts the given matrix by using the L2 norm.

Parameters

matrix (np.ndarray) – The matrix to sort.

Returns

The sorted matrix.

Return type

np.ndarray

sort_randomly(matrix, sigma)[source]

Given a coulomb matrix, it adds random noise to the sorting defined by sigma. For sorting, L2-norm is used.

Parameters

matrix (np.ndarray) – The matrix to randomly sort.

sigma:
float: Width of gaussian distributed noise determining how much the

rows and columns of the randomly sorted coulomb matrix are scrambled.

Returns

The randomly sorted matrix.

Return type

np.ndarray

zero_pad(array)[source]

Zero-pads the given matrix.

Parameters

array (np.ndarray) – The array to pad

Returns

The zero-padded array.

Return type

np.ndarray

dscribe.descriptors.mbtr module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.mbtr.MBTR(species, periodic, k1=None, k2=None, k3=None, normalize_gaussians=True, normalization='none', flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.descriptor.Descriptor

Implementation of the Many-body tensor representation up to \(k=3\).

You can choose which terms to include by providing a dictionary in the k1, k2 or k3 arguments. This dictionary should contain information under three keys: “geometry”, “grid” and “weighting”. See the examples below for how to format these dictionaries.

You can use this descriptor for finite and periodic systems. When dealing with periodic systems or when using machine learning models that use the Euclidean norm to measure distance between vectors, it is advisable to use some form of normalization.

For the geometry functions the following choices are available:

  • \(k=1\):

    • “atomic_number”: The atomic numbers.

  • \(k=2\):

    • “distance”: Pairwise distance in angstroms.

    • “inverse_distance”: Pairwise inverse distance in 1/angstrom.

  • \(k=3\):

    • “angle”: Angle in degrees.

    • “cosine”: Cosine of the angle.

For the weighting the following functions are available:

  • \(k=1\):

    • “unity”: No weighting.

  • \(k=2\):

    • “unity”: No weighting.

    • “exp” or “exponential”: Weighting of the form \(e^{-sx}\)

  • \(k=3\):

    • “unity”: No weighting.

    • “exp” or “exponential”: Weighting of the form \(e^{-sx}\)

The exponential weighting is motivated by the exponential decay of screened Coulombic interactions in solids. In the exponential weighting the parameters cutoff determines the value of the weighting function after which the rest of the terms will be ignored and the parameter scale corresponds to \(s\). The meaning of \(x\) changes for different terms as follows:

  • \(k=2\): \(x\) = Distance between A->B

  • \(k=3\): \(x\) = Distance from A->B->C->A.

In the grid setup min is the minimum value of the axis, max is the maximum value of the axis, sigma is the standard deviation of the gaussian broadening and n is the number of points sampled on the grid.

If flatten=False, a list of dense np.ndarrays for each k in ascending order is returned. These arrays are of dimension (n_elements x n_elements x n_grid_points), where the elements are sorted in ascending order by their atomic number.

If flatten=True, a scipy.sparse.coo_matrix is returned. This sparse matrix is of size (1, n_features), where n_features is given by get_number_of_features(). This vector is ordered so that the different k-terms are ordered in ascending order, and within each k-term the distributions at each entry (i, j, h) of the tensor are ordered in an ascending order by (i * n_elements) + (j * n_elements) + (h * n_elements).

This implementation does not support the use of a non-identity correlation matrix.

Parameters
  • species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Notice that this is not the atomic numbers that are present for an individual system, but should contain all the elements that are ever going to be encountered when creating the descriptors for a set of systems. Keeping the number of chemical speices as low as possible is preferable.

  • periodic (bool) – Determines whether the system is considered to be periodic.

  • k1 (dict) –

    Setup for the k=1 term. For example:

    k1 = {
        "geometry": {"function": "atomic_number"},
        "grid": {"min": 1, "max": 10, "sigma": 0.1, "n": 50}
    }
    

  • k2 (dict) –

    Dictionary containing the setup for the k=2 term. Contains setup for the used geometry function, discretization and weighting function. For example:

    k2 = {
        "geometry": {"function": "inverse_distance"},
        "grid": {"min": 0.1, "max": 2, "sigma": 0.1, "n": 50},
        "weighting": {"function": "exp", "scale": 0.75, "cutoff": 1e-2}
    }
    

  • k3 (dict) –

    Dictionary containing the setup for the k=3 term. Contains setup for the used geometry function, discretization and weighting function. For example:

    k3 = {
        "geometry": {"function": "angle"},
        "grid": {"min": 0, "max": 180, "sigma": 5, "n": 50},
        "weighting" : {"function": "exp", "scale": 0.5, "cutoff": 1e-3}
    }
    

  • normalize_gaussians (bool) – Determines whether the gaussians are normalized to an area of 1. Defaults to True. If False, the normalization factor is dropped and the gaussians have the form. \(e^{-(x-\mu)^2/2\sigma^2}\)

  • normalization (str) –

    Determines the method for normalizing the output. The available options are:

    • ”none”: No normalization.

    • ”l2_each”: Normalize the Euclidean length of each k-term individually to unity.

    • ”n_atoms”: Normalize the output by dividing it with the number of atoms in the system. If the system is periodic, the number of atoms is determined from the given unit cell.

  • flatten (bool) – Whether the output should be flattened to a 1D array. If False, a dictionary of the different tensors is provided, containing the values under keys: “k1”, “k2”, and “k3”:

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

check_grid(grid)[source]

Used to ensure that the given grid settings are valid.

Parameters

grid (dict) – Dictionary containing the grid setup.

create(system, n_jobs=1, verbose=False)[source]

Return MBTR output for the given systems.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

MBTR for the given systems. The return type depends on the ‘sparse’ and ‘flatten’-attributes. For flattened output a single numpy array or sparse scipy.csr_matrix is returned. The first dimension is determined by the amount of systems. If the output is not flattened, dictionaries containing the MBTR tensors for each k-term are returned.

Return type

np.ndarray | scipy.sparse.csr_matrix | list

create_single(system)[source]

Return the many-body tensor representation for the given system.

Parameters

system (ase.Atoms | System) – Input system.

Returns

The return type is specified by the ‘flatten’ and ‘sparse’-parameters. If the output is not flattened, a dictionary containing of MBTR outputs as numpy arrays is created. Each output is under a “kX” key. If the output is flattened, a single concatenated output vector is returned, either as a sparse or a dense vector.

Return type

dict | np.ndarray | scipy.sparse.coo_matrix

get_k1_axis()[source]

Used to get the discretized axis for geometry function of the k=1 term.

Returns

The discretized axis for the k=1 term.

Return type

np.ndarray

get_k2_axis()[source]

Used to get the discretized axis for geometry function of the k=2 term.

Returns

The discretized axis for the k=2 term.

Return type

np.ndarray

get_k3_axis()[source]

Used to get the discretized axis for geometry function of the k=3 term.

Returns

The discretized axis for the k=3 term.

Return type

np.ndarray

get_location(species)[source]

Can be used to query the location of a species combination in the the flattened output.

Parameters
  • species (tuple) – A tuple containing a species combination as

  • symbols or atomic numbers. The tuple can be for example (chemical) –

  • ("H"), ("H", "O") or ("H", "O", "H") –

Returns

slice containing the location of the specified species combination. The location is given as a python slice-object, that can be directly used to target ranges in the output.

Return type

slice

Raises

ValueError – If the requested species combination is not in the output or if invalid species defined.

get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

property k1
property k2
property k3
property normalization
property species

dscribe.descriptors.sinematrix module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.sinematrix.SineMatrix(n_atoms_max, permutation='sorted_l2', sigma=None, seed=None, flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.matrixdescriptor.MatrixDescriptor

Calculates the zero padded Sine matrix for different systems.

The Sine matrix is defined as:

Cij = 0.5 Zi**exponent | i = j

= (Zi*Zj)/phi(Ri, Rj) | i != j

where phi(r1, r2) = | B * sum(k = x,y,z)[ek * sin^2(pi * ek * B^-1 (r2-r1))] | (B is the matrix of basis cell vectors, ek are the unit vectors)

The matrix is padded with invisible atoms, which means that the matrix is padded with zeros until the maximum allowed size defined by n_max_atoms is reached.

For reference, see:

“Crystal Structure Representations for Machine Learning Models of Formation Energies”, Felix Faber, Alexander Lindmaa, Anatole von Lilienfeld, and Rickard Armiento, International Journal of Quantum Chemistry, (2015), https://doi.org/10.1002/qua.24917

Parameters
  • n_atoms_max (int) – The maximum nuber of atoms that any of the samples can have. This controls how much zeros need to be padded to the final result.

  • permutation (string) –

    Defines the method for handling permutational invariance. Can be one of the following:

    • none: The matrix is returned in the order defined by the Atoms.

    • sorted_l2: The rows and columns are sorted by the L2 norm.

    • eigenspectrum: Only the eigenvalues are returned sorted by their absolute value in descending order.

    • random: The rows and columns are sorted by their L2 norm after applying Gaussian noise to the norms. The standard deviation of the noise is determined by the sigma-parameter.

  • sigma (float) – Provide only when using the random-permutation option. Standard deviation of the gaussian distributed noise determining how much the rows and columns of the randomly sorted matrix are scrambled.

  • seed (int) – Provide only when using the random-permutation option. A seed to use for drawing samples from a normal distribution.

  • flatten (bool) – Whether the output of create() should be flattened to a 1D array.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, n_jobs=1, verbose=False)[source]

Return the Sine matrix for the given systems.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

Sine matrix for the given systems. The return type depends on the ‘sparse’ and ‘flatten’-attributes. For flattened output a single numpy array or sparse scipy.csr_matrix is returned. The first dimension is determined by the amount of systems.

Return type

np.ndarray | scipy.sparse.csr_matrix

get_matrix(system)[source]

Creates the Sine matrix for the given system.

Parameters

system (ase.Atoms | System) – Input system.

Returns

Sine matrix as a 2D array.

Return type

np.ndarray

dscribe.descriptors.soap module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.soap.SOAP(rcut, nmax, lmax, sigma=1.0, rbf='gto', species=None, periodic=False, crossover=True, average='off', sparse=False)[source]

Bases: dscribe.descriptors.descriptor.Descriptor

Class for generating a partial power spectrum from Smooth Overlap of Atomic Orbitals (SOAP). This implementation uses real (tesseral) spherical harmonics as the angular basis set and provides two orthonormalized alternatives for the radial basis functions: spherical primitive gaussian type orbitals (“gto”) or the polynomial basis set (“polynomial”).

For reference, see:

“On representing chemical environments, Albert P. Bartók, Risi Kondor, and Gábor Csányi, Phys. Rev. B 87, 184115, (2013), https://doi.org/10.1103/PhysRevB.87.184115

“Comparing molecules and solids across structural and alchemical space”, Sandip De, Albert P. Bartók, Gábor Csányi and Michele Ceriotti, Phys. Chem. Chem. Phys. 18, 13754 (2016), https://doi.org/10.1039/c6cp00415f

“Machine learning hydrogen adsorption on nanoclusters through structural descriptors”, Marc O. J. Jäger, Eiaki V. Morooka, Filippo Federici Canova, Lauri Himanen & Adam S. Foster, npj Comput. Mater., 4, 37 (2018), https://doi.org/10.1038/s41524-018-0096-5

Parameters
  • rcut (float) – A cutoff for local region in angstroms. Should be bigger than 1 angstrom.

  • nmax (int) – The number of radial basis functions.

  • lmax (int) – The maximum degree of spherical harmonics.

  • species (iterable) – The chemical species as a list of atomic numbers or as a list of chemical symbols. Notice that this is not the atomic numbers that are present for an individual system, but should contain all the elements that are ever going to be encountered when creating the descriptors for a set of systems. Keeping the number of chemical species as low as possible is preferable.

  • sigma (float) – The standard deviation of the gaussians used to expand the atomic density.

  • rbf (str) –

    The radial basis functions to use. The available options are:

    • ”gto”: Spherical gaussian type orbitals defined as \(g_{nl}(r) = \sum_{n'=1}^{n_\mathrm{max}}\,\beta_{nn'l} r^l e^{-\alpha_{n'l}r^2}\)

    • ”polynomial”: Polynomial basis defined as \(g_{n}(r) = \sum_{n'=1}^{n_\mathrm{max}}\,\beta_{nn'} (r-r_\mathrm{cut})^{n'+2}\)

  • periodic (bool) – Determines whether the system is considered to be periodic.

  • crossover (bool) – Determines if crossover of atomic types should be included in the power spectrum. If enabled, the power spectrum is calculated over all unique species combinations Z and Z’. If disabled, the power spectrum does not contain cross-species information and is only run over each unique species Z. Turned on by default to correspond to the original definition

  • average (str) –

    The averaging mode over the centers of interest. Valid options are:

    • ”off”: No averaging.

    • ”inner”: Averaging over sites before summing up the magnetic quantum numbers: \(p_{nn'l}^{Z_1,Z_2} \sim \sum_m (\frac{1}{n} \sum_i c_{nlm}^{i, Z_1})^{*} (\frac{1}{n} \sum_i c_{n'lm}^{i, Z_2})\)

    • ”outer”: Averaging over the power spectrum of different sites: \(p_{nn'l}^{Z_1,Z_2} \sim \frac{1}{n} \sum_i \sum_m (c_{nlm}^{i, Z_1})^{*} (c_{n'lm}^{i, Z_2})\)

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, positions=None, n_jobs=1, verbose=False)[source]

Return the SOAP output for the given systems and given positions.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • positions (list) – Positions where to calculate SOAP. Can be provided as cartesian positions or atomic indices. If no positions are defined, the SOAP output will be created for all atoms in the system. When calculating SOAP for multiple systems, provide the positions as a list for each system.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

The SOAP output for the given systems and positions. The return type depends on the ‘sparse’-attribute. The first dimension is determined by the amount of positions and systems and the second dimension is determined by the get_number_of_features()-function. When multiple systems are provided the results are ordered by the input order of systems and their positions.

Return type

np.ndarray | scipy.sparse.csr_matrix

create_single(system, positions=None)[source]

Return the SOAP output for the given system and given positions.

Parameters
  • system (ase.Atoms | System) – Input system.

  • positions (list) – Cartesian positions or atomic indices. If specified, the SOAP spectrum will be created for these points. If no positions are defined, the SOAP output will be created for all atoms in the system.

Returns

The SOAP output for the given system and positions. The return type depends on the ‘sparse’-attribute. The first dimension is given by the number of positions and the second dimension is determined by the get_number_of_features()-function.

Return type

np.ndarray | scipy.sparse.coo_matrix

flatten_positions(system, atomic_numbers=None)[source]

Takes an ase Atoms object and returns flattened numpy arrays for the C-extension to use.

Parameters
  • system (ase.atoms) – The system to convert.

  • atomic_numbers() – The atomic numbers to consider. Atoms that do not have these atomic numbers are ignored.

Returns

Returns the positions flattened and sorted by atomic number, atomic numbers flattened and sorted by atomic number, number of different species and the sorted set of atomic numbers.

Return type

(np.ndarray, list, int, np.ndarray)

get_basis_gto(rcut, nmax)[source]

Used to calculate the alpha and beta prefactors for the gto-radial basis.

Parameters
  • rcut (float) – Radial cutoff.

  • nmax (int) – Number of gto radial bases.

Returns

The alpha and beta prefactors for all bases up to a fixed size of l=10.

Return type

(np.ndarray, np.ndarray)

get_basis_poly(rcut, nmax)[source]

Used to calculate discrete vectors for the polynomial basis functions.

Parameters
  • rcut (float) – Radial cutoff.

  • nmax (int) – Number of polynomial radial bases.

Returns

Tuple containing the evaluation points in radial direction as the first item, and the corresponding orthonormalized polynomial radial basis set as the second item.

Return type

(np.ndarray, np.ndarray)

get_location(species)[source]

Can be used to query the location of a species combination in the the flattened output.

Parameters
  • species (tuple) – A tuple containing a pair of species as chemical

  • or atomic numbers. The tuple can be for example (symbols) –

Returns

slice containing the location of the specified species combination. The location is given as a python slice-object, that can be directly used to target ranges in the output.

Return type

slice

Raises

ValueError – If the requested species combination is not in the output or if invalid species defined.

get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

get_soap_locals_gto(system, centers, alphas, betas, rcut, cutoff_padding, nmax, lmax, eta, crossover, average, atomic_numbers=None)[source]

Get the SOAP output for the given positions using the gto radial basis.

Parameters
  • system (ase.Atoms) – Atomic structure for which the SOAP output is calculated.

  • centers (np.ndarray) – Positions at which to calculate SOAP.

  • alphas (np.ndarray) – The alpha coeffients for the gto-basis.

  • betas (np.ndarray) – The beta coeffients for the gto-basis.

  • rcut (float) – Radial cutoff.

  • cutoff_padding (float) – The padding that is added for including atoms beyond the cutoff.

  • nmax (int) – Maximum number of radial basis functions.

  • lmax (int) – Maximum spherical harmonics degree.

  • eta (float) – The gaussian smearing width.

  • crossover (bool) – Whether to include species crossover in output.

  • atomic_numbers (np.ndarray) – Can be used to specify the species for which to calculate the output. If None, all species are included. If given the output is calculated only for the given species and is ordered by atomic number.

Returns

SOAP output with the gto radial basis for the given positions.

Return type

np.ndarray

get_soap_locals_poly(system, centers, rcut, cutoff_padding, nmax, lmax, eta, crossover, average, atomic_numbers=None)[source]

Get the SOAP output using polynomial radial basis for the given positions. :param system: Atomic structure for which the SOAP output is

calculated.

Parameters
  • centers (np.ndarray) – Positions at which to calculate SOAP.

  • alphas (np.ndarray) – The alpha coeffients for the gto-basis.

  • betas (np.ndarray) – The beta coeffients for the gto-basis.

  • rCut (float) – Radial cutoff.

  • cutoff_padding (float) – The padding that is added for including atoms beyond the cutoff.

  • nmax (int) – Maximum number of radial basis functions.

  • lmax (int) – Maximum spherical harmonics degree.

  • eta (float) – The gaussian smearing width.

  • crossover (bool) – Whether to include species crossover in output.

  • atomic_numbers (np.ndarray) – Can be used to specify the species for which to calculate the output. If None, all species are included. If given the output is calculated only for the given species and is ordered by atomic number.

Returns

SOAP output with the polynomial radial basis for the given positions.

Return type

np.ndarray

property species

Module contents

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.