dscribe.descriptors package

Submodules

dscribe.descriptors.acsf module

dscribe.descriptors.coulombmatrix module

dscribe.descriptors.descriptor module

Copyright 2019 DScribe developers

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class dscribe.descriptors.descriptor.Descriptor(periodic, flatten, sparse, dtype='float64')[source]

Bases: ABC

An abstract base class for all descriptors.

Parameters

flatten (bool) – Whether the output of create() should be flattened to a 1D array.

check_atomic_numbers(atomic_numbers)[source]

Used to check that the given atomic numbers have been defined for this descriptor.

Parameters

atomic_numbers (iterable) – Atomic numbers to check.

Raises
  • ValueError – If the atomic numbers in the given system are not included in the species given to this descriptor.
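The check described above amounts to a set comparison. The following is a minimal sketch, assuming the descriptor keeps its allowed atomic numbers in a collection; the function name `check` and the argument `allowed` are hypothetical and are not part of DScribe.

```python
def check(atomic_numbers, allowed):
    """Raise ValueError if any atomic number is missing from `allowed`.

    Hypothetical stand-in for the behaviour documented above, not the
    DScribe implementation.
    """
    unknown = set(atomic_numbers) - set(allowed)
    if unknown:
        raise ValueError(
            "Atomic numbers not included in the species of this descriptor: "
            f"{sorted(unknown)}"
        )


# Hydrogen and oxygen are both allowed, so this call passes silently.
check([1, 8, 1], allowed={1, 8})
```

Calling the same function with an atomic number outside `allowed`, for example carbon (6), would raise the ValueError.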

abstract create(system, *args, **kwargs)[source]

Creates the descriptor for the given system.

Parameters
  • system (ase.Atoms) – The system for which to create the descriptor.

  • args – Descriptor specific positional arguments.

  • kwargs – Descriptor specific keyword arguments.

Returns

A descriptor for the system.

Return type

np.array | scipy.sparse.coo_matrix
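Because create() and get_number_of_features() are abstract, a concrete descriptor has to implement both before it can be instantiated. The sketch below illustrates that pattern with the standard library only. `SketchDescriptor` and `AtomCount` are hypothetical names, and the "system" is a plain list of atomic numbers rather than an ase.Atoms object, so this is an illustration of the abstract-base-class mechanism and not of the real DScribe classes.

```python
from abc import ABC, abstractmethod


class SketchDescriptor(ABC):
    """Hypothetical stand-in for an abstract descriptor base class."""

    @abstractmethod
    def create(self, system, *args, **kwargs):
        """Return the descriptor for one system."""

    @abstractmethod
    def get_number_of_features(self):
        """Return the number of features in the output."""


class AtomCount(SketchDescriptor):
    """Toy descriptor: counts atoms for each allowed atomic number."""

    def __init__(self, species):
        self.species = sorted(species)

    def create(self, system, *args, **kwargs):
        # `system` is a plain list of atomic numbers in this sketch.
        return [system.count(z) for z in self.species]

    def get_number_of_features(self):
        return len(self.species)


desc = AtomCount([1, 8])
features = desc.create([1, 1, 8])  # two hydrogens, one oxygen
```

Instantiating `SketchDescriptor` directly raises a TypeError, which is the mechanism that forces subclasses to provide both methods.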

create_parallel(inp, func, n_jobs, static_size=None, only_physical_cores=False, verbose=False, prefer='processes')[source]

Used to parallelize the descriptor creation across multiple systems.

Parameters
  • inp (list) – Contains a tuple of input arguments for each processed system. These arguments are fed to the function specified by “func”.

  • func (function) – Function that outputs the descriptor when given input arguments from “inp”.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallelizes the calculation across samples. Defaults to serial calculation with n_jobs=1. If a negative number is given, the number of jobs is calculated as n_cpus + n_jobs, where n_cpus is the number of CPUs reported by the OS. With only_physical_cores you can control which types of CPUs are counted in n_cpus.

  • output_sizes (list of ints) – The size of the output for each job. Makes the creation faster by preallocating the correct amount of memory beforehand. If not specified, a dynamically created list of outputs is used.

  • only_physical_cores (bool) – If a negative n_jobs is given, determines which types of CPUs are used in calculating the number of jobs. If set to False (default), virtual CPUs are also counted. If set to True, only physical CPUs are counted.

  • verbose (bool) – Controls whether to print the progress of each job to the console.

  • prefer (str) –

    The parallelization method. Valid options are:

    • “processes”: Parallelization based on processes. Uses the “loky” backend in joblib to serialize the jobs and run them in separate processes. Using separate processes has a bigger memory and initialization overhead than threads, but may provide better scalability if performance is limited by the Global Interpreter Lock (GIL).

    • “threads”: Parallelization based on threads. Has very low memory and initialization overhead. Performance is limited by the amount of pure Python code that needs to run. Ideal when most of the calculation time is used by C/C++ extensions that release the GIL.

Returns

The descriptor output for each given input. The return type depends on the descriptor setup.

Return type

np.ndarray | sparse.COO | list
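The relationship between “inp” and “func” can be sketched as follows: each entry of inp is a tuple that is unpacked into one call of func, and the results come back in input order. This sketch uses `concurrent.futures` from the standard library as a stand-in for joblib, so it only mirrors the thread-based option. `run_parallel` and `toy_descriptor` are hypothetical names and do not exist in DScribe.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(inp, func, n_jobs=1):
    """Call func(*args) for every tuple in inp, keeping the input order."""
    if n_jobs == 1:
        # Serial fallback, matching the documented default of n_jobs=1.
        return [func(*args) for args in inp]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(func, *args) for args in inp]
        # Collect results in submission order, not completion order.
        return [future.result() for future in futures]


def toy_descriptor(atomic_numbers, scale):
    """Toy 'descriptor': scales each atomic number."""
    return [z * scale for z in atomic_numbers]


# One tuple of arguments per processed system.
inp = [([1, 8], 2), ([6], 3)]
out = run_parallel(inp, toy_descriptor, n_jobs=2)
```

Here the outputs have different lengths, so they are returned as a list; this corresponds to the dynamically created list of outputs mentioned in the parameter description.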

derivatives_parallel(inp, func, n_jobs, derivatives_shape, descriptor_shape, return_descriptor, only_physical_cores=False, verbose=False, prefer='processes')[source]

Used to parallelize the descriptor derivative calculation across multiple systems.

Parameters
  • inp (list) – Contains a tuple of input arguments for each processed system. These arguments are fed to the function specified by “func”.

  • func (function) – Function that outputs the descriptor when given input arguments from “inp”.

  • n_jobs (int) – Number of parallel jobs to instantiate. Parallelizes the calculation across samples. Defaults to serial calculation with n_jobs=1. If a negative number is given, the number of jobs is calculated as n_cpus + n_jobs, where n_cpus is the number of CPUs reported by the OS. With only_physical_cores you can control which types of CPUs are counted in n_cpus.

  • derivatives_shape (list or None) – If a fixed-size output is produced from each job, this contains its shape. For variable-size output this parameter is set to None.

  • descriptor_shape (list or None) – If a fixed-size output is produced from each job, this contains its shape. For variable-size output this parameter is set to None.

  • only_physical_cores (bool) – If a negative n_jobs is given, determines which types of CPUs are used in calculating the number of jobs. If set to False (default), virtual CPUs are also counted. If set to True, only physical CPUs are counted.

  • verbose (bool) – Controls whether to print the progress of each job to the console.

  • prefer (str) –

    The parallelization method. Valid options are:

    • “processes”: Parallelization based on processes. Uses the “loky” backend in joblib to serialize the jobs and run them in separate processes. Using separate processes has a bigger memory and initialization overhead than threads, but may provide better scalability if performance is limited by the Global Interpreter Lock (GIL).

    • “threads”: Parallelization based on threads. Has very low memory and initialization overhead. Performance is limited by the amount of pure Python code that needs to run. Ideal when most of the calculation time is used by C/C++ extensions that release the GIL.

Returns

The descriptor output for each given input. The return type depends on the descriptor setup.

Return type

np.ndarray | sparse.COO | list
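The rule for negative n_jobs values described in the parameter list can be written out as a short calculation. The sketch below follows the formula stated on this page (n_cpus + n_jobs). The helper name `resolve_n_jobs` is hypothetical, `os.cpu_count()` is used as a stand-in that counts logical CPUs only, and clamping the result to at least one job is an assumption that the page does not state.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Turn a possibly negative n_jobs into a positive job count."""
    if n_cpus is None:
        # os.cpu_count() reports logical CPUs and may return None.
        n_cpus = os.cpu_count() or 1
    if n_jobs < 0:
        # Documented rule: number of jobs = n_cpus + n_jobs.
        n_jobs = n_cpus + n_jobs
    # Assumption: never go below one job.
    return max(1, n_jobs)


jobs = resolve_n_jobs(-1, n_cpus=8)  # 8 + (-1) = 7
```

Passing `n_cpus` explicitly makes the arithmetic reproducible across machines; leaving it out uses whatever the current machine reports.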

property flatten
abstract get_number_of_features()[source]

Used to inquire the final number of features that this descriptor will have.

Returns

Number of features for this descriptor.

Return type

int

property periodic
property sparse

dscribe.descriptors.ewaldsummatrix module

dscribe.descriptors.lmbtr module

dscribe.descriptors.matrixdescriptor module

dscribe.descriptors.mbtr module

dscribe.descriptors.sinematrix module

dscribe.descriptors.soap module

dscribe.descriptors.valleoganov module

Module contents