dscribe.descriptors.sinematrix module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.sinematrix.SineMatrix(n_atoms_max, permutation='sorted_l2', sigma=None, seed=None, flatten=True, sparse=False)[source]

Bases: dscribe.descriptors.matrixdescriptor.MatrixDescriptor

Calculates the zero padded Sine matrix for different systems.

The Sine matrix is defined as:

Cij = 0.5 Zi**exponent | i = j

= (Zi*Zj)/phi(Ri, Rj) | i != j

where phi(r1, r2) = | B * sum(k = x,y,z)[ek * sin^2(pi * ek * B^-1 (r2-r1))] | (B is the matrix of basis cell vectors, ek are the unit vectors)

The matrix is padded with invisible atoms, which means that the matrix is padded with zeros until the maximum allowed size defined by n_max_atoms is reached.

For reference, see:

“Crystal Structure Representations for Machine Learning Models of Formation Energies”, Felix Faber, Alexander Lindmaa, Anatole von Lilienfeld, and Rickard Armiento, International Journal of Quantum Chemistry, (2015), https://doi.org/10.1002/qua.24917

Parameters
  • n_atoms_max (int) – The maximum nuber of atoms that any of the samples can have. This controls how much zeros need to be padded to the final result.

  • permutation (string) –

    Defines the method for handling permutational invariance. Can be one of the following:

    • none: The matrix is returned in the order defined by the Atoms.

    • sorted_l2: The rows and columns are sorted by the L2 norm.

    • eigenspectrum: Only the eigenvalues are returned sorted by their absolute value in descending order.

    • random: The rows and columns are sorted by their L2 norm after applying Gaussian noise to the norms. The standard deviation of the noise is determined by the sigma-parameter.

  • sigma (float) – Provide only when using the random-permutation option. Standard deviation of the gaussian distributed noise determining how much the rows and columns of the randomly sorted matrix are scrambled.

  • seed (int) – Provide only when using the random-permutation option. A seed to use for drawing samples from a normal distribution.

  • flatten (bool) – Whether the output of create() should be flattened to a 1D array.

  • sparse (bool) – Whether the output should be a sparse matrix or a dense numpy array.

create(system, n_jobs=1, only_physical_cores=False, verbose=False)[source]

Return the Sine matrix for the given systems.

Parameters
  • system (ase.Atoms or list of ase.Atoms) – One or many atomic structures.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallellizes the calculation across samples. Defaults to serial calculation with n_jobs=1. If a negative number is given, the used cpus will be calculated with, n_cpus + n_jobs, where n_cpus is the amount of CPUs as reported by the OS. With only_physical_cores you can control which types of CPUs are counted in n_cpus.

  • only_physical_cores (bool) – If a negative n_jobs is given, determines which types of CPUs are used in calculating the number of jobs. If set to False (default), also virtual CPUs are counted. If set to True, only physical CPUs are counted.

  • verbose (bool) – Controls whether to print the progress of each job into to the console.

Returns

Sine matrix for the given systems. The return type depends on the ‘sparse’ and ‘flatten’-attributes. For flattened output a single numpy array or sparse.COO is returned. The first dimension is determined by the amount of systems.

Return type

np.ndarray | sparse.COO

get_matrix(system)[source]

Creates the Sine matrix for the given system.

Parameters

system (ase.Atoms | System) – Input system.

Returns

Sine matrix as a 2D array.

Return type

np.ndarray