dscribe.descriptors.descriptor module
Copyright 2019 DScribe developers
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
- class dscribe.descriptors.descriptor.Descriptor(periodic, flatten, sparse, dtype='float64')
Bases: abc.ABC
An abstract base class for all descriptors.
- Parameters
flatten (bool) – Whether the output of create() should be flattened to a 1D array.
- check_atomic_numbers(atomic_numbers)
Used to check that the given atomic numbers have been defined for this descriptor.
- Parameters
atomic_numbers (iterable) – Atomic numbers to check.
- Raises
ValueError – If the atomic numbers in the given system are not included in the species given to this descriptor.
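The check described above can be sketched as a set comparison. This is a minimal, stand-alone illustration and not the DScribe implementation: the function below takes the allowed species as a hypothetical second argument, `allowed_species`, whereas the real method reads them from the descriptor instance.

```python
def check_atomic_numbers(atomic_numbers, allowed_species):
    """Hypothetical stand-alone version of the check described above.

    Raises ValueError if any atomic number is missing from allowed_species.
    """
    unknown = set(atomic_numbers) - set(allowed_species)
    if unknown:
        raise ValueError(
            f"Atomic numbers {sorted(unknown)} are not included in the species "
            f"{sorted(allowed_species)} given to this descriptor."
        )


# Water (H, O) against a descriptor set up for H, C and O: passes silently.
check_atomic_numbers([1, 1, 8], allowed_species={1, 6, 8})
```

Passing an atomic number outside the allowed set, for example nitrogen (7) in the setup above, would raise the `ValueError`.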
- abstract create(system, *args, **kwargs)
Creates the descriptor for the given system.
- Parameters
system (ase.Atoms) – The system for which to create the descriptor.
args – Descriptor specific positional arguments.
kwargs – Descriptor specific keyword arguments.
- Returns
A descriptor for the system.
- Return type
np.array | scipy.sparse.coo_matrix
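To illustrate how an abstract `create()` is meant to be filled in by a concrete descriptor, here is a minimal sketch built only on the standard library. `SketchDescriptor` and `AtomCount` are hypothetical names, the "system" is assumed to be a plain list of atomic numbers rather than an `ase.Atoms` object, and none of this is the DScribe code itself.

```python
from abc import ABC, abstractmethod


class SketchDescriptor(ABC):
    """Hypothetical stand-in for the abstract base class; not the DScribe code."""

    def __init__(self, flatten):
        self.flatten = flatten

    @abstractmethod
    def create(self, system, *args, **kwargs):
        """Return the descriptor for one system."""

    @abstractmethod
    def get_number_of_features(self):
        """Return the number of features in the output."""


class AtomCount(SketchDescriptor):
    """Toy descriptor: counts how many atoms of each listed species occur."""

    def __init__(self, species, flatten=True):
        super().__init__(flatten)
        self.species = sorted(species)

    def create(self, system, *args, **kwargs):
        # Here a "system" is simply a list of atomic numbers (an assumption).
        counts = [system.count(z) for z in self.species]
        return counts if self.flatten else [counts]

    def get_number_of_features(self):
        return len(self.species)


desc = AtomCount(species=[1, 8])
features = desc.create([1, 1, 8])  # water: two H, one O
```

Because both methods are abstract, instantiating the base class directly raises a `TypeError`; only subclasses that implement them can be created.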
- create_parallel(inp, func, n_jobs, static_size=None, only_physical_cores=False, verbose=False, prefer='processes')
Used to parallelize the descriptor creation across multiple systems.
- Parameters
inp (list) – Contains a tuple of input arguments for each processed system. These arguments are fed to the function specified by “func”.
func (function) – Function that outputs the descriptor when given input arguments from “inp”.
n_jobs (int) – Number of parallel jobs to instantiate. Parallelizes the calculation across samples. Defaults to serial calculation with n_jobs=1. If a negative number is given, the number of jobs is calculated as n_cpus + n_jobs, where n_cpus is the number of CPUs reported by the OS. Use only_physical_cores to control which types of CPUs are counted in n_cpus.
output_sizes (list of ints) – The size of the output for each job. Makes the creation faster by preallocating the correct amount of memory beforehand. If not specified, a dynamically created list of outputs is used.
only_physical_cores (bool) – If a negative n_jobs is given, determines which types of CPUs are used in calculating the number of jobs. If set to False (default), virtual CPUs are also counted. If set to True, only physical CPUs are counted.
verbose (bool) – Controls whether to print the progress of each job to the console.
prefer (str) –
The parallelization method. Valid options are:
“processes”: Parallelization based on processes. Uses the “loky” backend in joblib to serialize the jobs and run them in separate processes. Using separate processes has a bigger memory and initialization overhead than threads, but may provide better scalability if performance is limited by the Global Interpreter Lock (GIL).
“threads”: Parallelization based on threads. Has very low memory and initialization overhead. Performance is limited by the amount of pure Python code that needs to run. Ideal when most of the calculation time is used by C/C++ extensions that release the GIL.
- Returns
The descriptor output for each given input. The return type depends on the descriptor setup.
- Return type
np.ndarray | sparse.COO | list
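The relationship between `inp` and `func` described above (one tuple of arguments per system, unpacked into `func`) can be sketched as follows. Note the substitution: this sketch uses the standard library's `ThreadPoolExecutor` instead of the joblib backends the method is documented to use, and `run_jobs` and `toy_descriptor` are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobs(inp, func, n_jobs):
    """Call func(*args) for every argument tuple in inp, using n_jobs threads.

    Results are returned in the same order as inp.
    """
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(lambda args: func(*args), inp))


def toy_descriptor(atomic_numbers, scale):
    # Hypothetical per-system function: a single scaled sum as the "descriptor".
    return [scale * sum(atomic_numbers)]


# One tuple of positional arguments per system.
inp = [([1, 1, 8], 1.0), ([6, 8, 8], 0.5)]
results = run_jobs(inp, toy_descriptor, n_jobs=2)
```

A thread pool corresponds loosely to the “threads” option; as the parameter description notes, it only pays off when the per-system function spends most of its time outside pure Python code.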
- derivatives_parallel(inp, func, n_jobs, derivatives_shape, descriptor_shape, return_descriptor, only_physical_cores=False, verbose=False, prefer='processes')
Used to parallelize the descriptor creation across multiple systems.
- Parameters
inp (list) – Contains a tuple of input arguments for each processed system. These arguments are fed to the function specified by “func”.
func (function) – Function that outputs the descriptor when given input arguments from “inp”.
n_jobs (int) – Number of parallel jobs to instantiate. Parallelizes the calculation across samples. Defaults to serial calculation with n_jobs=1. If a negative number is given, the number of jobs is calculated as n_cpus + n_jobs, where n_cpus is the number of CPUs reported by the OS. Use only_physical_cores to control which types of CPUs are counted in n_cpus.
derivatives_shape (list or None) – If a fixed-size output is produced from each job, this contains its shape. For variable-size output, this parameter is set to None.
descriptor_shape (list or None) – If a fixed-size output is produced from each job, this contains its shape. For variable-size output, this parameter is set to None.
only_physical_cores (bool) – If a negative n_jobs is given, determines which types of CPUs are used in calculating the number of jobs. If set to False (default), virtual CPUs are also counted. If set to True, only physical CPUs are counted.
verbose (bool) – Controls whether to print the progress of each job to the console.
prefer (str) –
The parallelization method. Valid options are:
“processes”: Parallelization based on processes. Uses the “loky” backend in joblib to serialize the jobs and run them in separate processes. Using separate processes has a bigger memory and initialization overhead than threads, but may provide better scalability if performance is limited by the Global Interpreter Lock (GIL).
“threads”: Parallelization based on threads. Has very low memory and initialization overhead. Performance is limited by the amount of pure Python code that needs to run. Ideal when most of the calculation time is used by C/C++ extensions that release the GIL.
- Returns
The descriptor output for each given input. The return type depends on the descriptor setup.
- Return type
np.ndarray | sparse.COO | list
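The rule for negative `n_jobs` values stated in the parameter description (n_cpus + n_jobs) can be written out as a small helper. This is a sketch under stated assumptions, not the library's code: `resolve_n_jobs` is a hypothetical name, `os.cpu_count()` stands in for whatever CPU count the library queries (it counts logical CPUs, so it corresponds to only_physical_cores=False), and rejecting a non-positive result is an assumption of this sketch.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Turn a possibly negative n_jobs into a positive job count.

    Follows the rule stated above: negative values give n_cpus + n_jobs.
    """
    if n_cpus is None:
        # os.cpu_count() reports logical CPUs; it may return None.
        n_cpus = os.cpu_count() or 1
    if n_jobs < 0:
        n_jobs = n_cpus + n_jobs
    if n_jobs <= 0:
        # Rejecting non-positive results is an assumption of this sketch.
        raise ValueError("Invalid number of jobs specified.")
    return n_jobs
```

Under this rule, `n_jobs=-1` on a machine with 8 CPUs resolves to 7 jobs, leaving one CPU free.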
- property flatten
- abstract get_number_of_features()
Used to inquire the final number of features that this descriptor will have.
- Returns
Number of features for this descriptor.
- Return type
int
- get_system(system)
Used to convert the given atomic system into a custom System-object that is used internally. The System class inherits from ase.Atoms, but includes built-in caching for geometric quantities that may be re-used by the descriptors.
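The idea of caching geometric quantities so that several descriptors can reuse them can be sketched as below. `CachedSystem` and `get_distance_matrix` are hypothetical names chosen for this illustration; the actual System class inherits from ase.Atoms and its caching interface is not shown in this section.

```python
import math


class CachedSystem:
    """Hypothetical stand-in: stores positions and caches derived geometry."""

    def __init__(self, positions):
        self.positions = [tuple(p) for p in positions]
        self._distance_matrix = None  # filled on first request

    def get_distance_matrix(self):
        # Compute pairwise distances once, then reuse the stored result.
        if self._distance_matrix is None:
            self._distance_matrix = [
                [math.dist(a, b) for b in self.positions] for a in self.positions
            ]
        return self._distance_matrix


system = CachedSystem([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)])
first = system.get_distance_matrix()
second = system.get_distance_matrix()  # served from the cache
```

The second call returns the same stored object instead of recomputing it, which is the saving such a cache is meant to provide.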
- property periodic
- property sparse