Collections of kernel matrices
When dealing with multiple kernel matrices, e.g. as part of multiple kernel learning (MKL), a number of validation and sanity checks need to be performed. These checks include ensuring that the kernel matrices (KMs) are of compatible size and that they were generated from the same sample. We refer to such a collection of KMs as a KernelSet. Moreover, accessing a subset of these KMs, e.g. filtered by some metric, is often necessary while optimizing algorithms like MKL. To serve as candidates for optimization, it is often necessary to sample and generate a large number of KMs, here referred to as a bucket. The KernelSet and KernelBucket classes make these tasks easy and extensible while keeping their rich annotations and structure (meta-data etc). Their API, along with a custom exception, is shown below.
Kernel Set
-
class
kernelmethods.
KernelSet
(km_list=None, name='KernelSet', num_samples=None)[source]¶ Bases:
object
Container class to manage a set of compatible KernelMatrix instances.
Compatibility is checked based on the size (number of samples they operate on). Provides methods to iterate over the KMs, access a subset and query the underlying kernel funcs.
-
append
(KM)[source]¶ Method to add a new kernel to the set.
Checks to ensure the new KM is compatible in size to the existing set.
- Parameters
KM (KernelMatrix or ndarray or compatible) – kernel matrix to be appended to the KernelSet
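The size check described for append can be illustrated with a minimal, self-contained sketch. MiniKernelSet below is a hypothetical stand-in written for illustration only, not the library's implementation; it stores plain nested lists and raises ValueError, whereas the library may use its own exception type.

```python
class MiniKernelSet:
    """Hypothetical stand-in for a size-checked container of kernel matrices."""

    def __init__(self, name="MiniKernelSet"):
        self.name = name
        self._kms = []            # stored square matrices (lists of lists)
        self.num_samples = None   # fixed by the first matrix appended

    @property
    def size(self):
        """Number of matrices currently held."""
        return len(self._kms)

    def append(self, km):
        """Add a matrix after verifying it is square and matches the set."""
        n = len(km)
        if any(len(row) != n for row in km):
            raise ValueError("kernel matrix must be square")
        if self.num_samples is None:
            self.num_samples = n              # first matrix defines the size
        elif n != self.num_samples:
            raise ValueError(
                f"expected {self.num_samples} samples, got {n}")
        self._kms.append(km)


ks = MiniKernelSet()
ks.append([[1.0, 0.5], [0.5, 1.0]])           # sets num_samples to 2
ks.append([[2.0, 0.0], [0.0, 2.0]])           # compatible: also 2 x 2
try:
    ks.append([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])              # 3 x 3: rejected
    rejected = False
except ValueError:
    rejected = True
```

Rejecting a mismatched matrix at insertion time keeps every later operation on the set free of per-call shape checks.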
-
attach_to
(sample, name='sample', attr_name=None, attr_value=None)[source]¶ Attach all the kernel matrices in this set to a given sample.
Any previous evaluations to other samples and their results will be reset.
- Parameters
sample (ndarray) – Input sample to operate on. Must be 2D of shape (num_samples, num_features).
name (str) – Identifier for the sample (esp. when multiple are in the same set)
-
get_attr
(name, value_if_not_found=None)[source]¶ Returns the value of a user-defined attribute.
If not set previously, or no match found, returns value_if_not_found.
- Parameters
name (str or hashable) – Name of the attribute to look up.
value_if_not_found (object) – If attribute was not set previously, returns this value
- Returns
attr_values – Values of the attribute from each KM in the set. Or value_if_not_found if attribute is not found.
- Return type
object
-
get_kernel_funcs
(indices)[source]¶ Returns kernel functions underlying the specified kernel matrices in this kernel set.
This is helpful to apply a given set of kernel functions on new sets of data (e.g. test set)
- Parameters
indices (Iterable) – List of indices identifying the kernel matrices to return
- Returns
kf_tuple – Tuple of kernel functions from the selected KMs
- Return type
tuple
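To show why retrieving the underlying kernel functions is useful, here is a small sketch that selects functions by index and evaluates them between a test set and a training set. All names (linear, make_gaussian, pick_funcs, cross_kernel) are hypothetical helpers in pure Python, and the Gaussian form used is the textbook one, which may differ from the library's parametrization.

```python
import math


def linear(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def make_gaussian(sigma):
    """Return a Gaussian kernel function with a fixed sigma."""
    def gaussian(x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-sq_dist / (2 * sigma ** 2))
    return gaussian


# Kernel functions "underlying" three kernel matrices of a set
kernel_funcs = (linear, make_gaussian(0.5), make_gaussian(2.0))


def pick_funcs(funcs, indices):
    """Return a tuple of the functions at the requested positions."""
    return tuple(funcs[i] for i in indices)


def cross_kernel(func, rows, cols):
    """Evaluate func between every row sample and every column sample."""
    return [[func(r, c) for c in cols] for r in rows]


train = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
test = [[0.5, 0.5], [2.0, 0.0]]

selected = pick_funcs(kernel_funcs, [0, 2])
# One (num_test x num_train) matrix per selected kernel function
test_kms = [cross_kernel(f, test, train) for f in selected]
```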
-
property
num_samples
¶ Number of samples in each individual kernel matrix
-
set_attr
(name, values)[source]¶ Sets user-defined attributes for the kernel matrices in this set.
If len(values)==1, the same value is set for all. Otherwise, values must be of the same size as the KernelSet, providing a separate value for each element.
Useful for identifying the kernel matrices in various respects. You could think of them as tags or identifiers etc. As they are user-defined, they are ideal to represent user needs and applications.
- Parameters
name (str or hashable) – Name of the attribute.
values (object) – Value of the attribute
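The rule for values (a single value is shared by all, otherwise one value per kernel matrix) can be sketched as follows. TaggedSet and expand_attr_values are hypothetical names, and the sketch assumes values is passed as a list or tuple; returning a per-element list from get_attr with the fallback filled in is a simplification, not necessarily the library's behaviour.

```python
def expand_attr_values(values, set_size):
    """Broadcast a single value to the whole set, or require one per element."""
    values = list(values)
    if len(values) == 1:
        return values * set_size
    if len(values) != set_size:
        raise ValueError(
            f"expected 1 or {set_size} values, got {len(values)}")
    return values


class TaggedSet:
    """Hypothetical container keeping one attribute dict per element."""

    def __init__(self, size):
        self._attrs = [dict() for _ in range(size)]

    def set_attr(self, name, values):
        expanded = expand_attr_values(values, len(self._attrs))
        for attrs, value in zip(self._attrs, expanded):
            attrs[name] = value

    def get_attr(self, name, value_if_not_found=None):
        # Fall back to value_if_not_found for elements lacking the attribute
        return [attrs.get(name, value_if_not_found) for attrs in self._attrs]


ts = TaggedSet(3)
ts.set_attr("modality", ["mri"])      # one value, shared by all three
ts.set_attr("rank", [2, 0, 1])        # one value per element
```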
-
property
size
¶ Number of kernel matrices in this set
-
take
(indices, name='SelectedKMs')[source]¶ Returns a new KernelSet with requested kernel matrices, identified by their indices.
- Parameters
indices (Iterable) – List of indices identifying the kernel matrices to return
name (str) – Name for the new kernel set.
- Returns
ks – New kernel set with the selected KMs
- Return type
KernelSet
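A typical use of index-based selection is to score every kernel matrix with some metric and keep the best ones. The sketch below uses plain lists, a hypothetical take helper, and the matrix trace as a placeholder metric chosen only for illustration.

```python
def take(items, indices):
    """Return the items at the requested positions, in the given order."""
    return [items[i] for i in indices]


def trace(km):
    """Sum of the diagonal of a square matrix (placeholder metric)."""
    return sum(km[i][i] for i in range(len(km)))


kms = [
    [[1.0, 0.2], [0.2, 1.0]],   # trace 2.0
    [[3.0, 0.1], [0.1, 2.0]],   # trace 5.0
    [[0.5, 0.0], [0.0, 0.5]],   # trace 1.0
]

scores = [trace(km) for km in kms]
# Indices of the two highest-scoring matrices, best first
top2 = sorted(range(len(kms)), key=lambda i: scores[i], reverse=True)[:2]
selected = take(kms, top2)
```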
Kernel Bucket
-
class
kernelmethods.
KernelBucket
(poly_degree_values=(2, 3, 4), rbf_sigma_values=(0.03125, 0.125, 0.5, 2, 8, 32), laplace_gamma_values=(2, 8, 32), name='KernelBucket', normalize_kernels=True, skip_input_checks=False)[source]¶ Bases:
kernelmethods.base.KernelSet
Class to generate and/or maintain a “bucket” of candidate kernels.
Applications:
to rank/filter/select kernels based on a given sample via many metrics
to be defined.
Note:
1. Linear kernel is always added during init without your choosing.
2. This is in contrast to the Chi^2 kernel, which is not added to the bucket by default, as it requires positive feature values and may break default use for common applications. You can easily add Chi^2 or any other kernels via the add_parametrized_kernels method.
- Parameters
poly_degree_values (Iterable) – List of values for the degree parameter of the PolyKernel. One KernelMatrix will be added to the bucket for each value.
rbf_sigma_values (Iterable) – List of values for the sigma parameter of the GaussianKernel. One KernelMatrix will be added to the bucket for each value.
laplace_gamma_values (Iterable) – List of values for the gamma parameter of the LaplacianKernel. One KernelMatrix will be added to the bucket for each value.
name (str) – String to identify the purpose or type of the bucket of kernels. Also helps distinguish it from other buckets.
normalize_kernels (bool) – Flag to indicate whether the kernel matrices need to be normalized
skip_input_checks (bool) – Flag to indicate whether checks on input data (type, format etc) can be skipped. This helps save a tiny bit of runtime for expert uses when data types and formats are managed thoroughly in numpy. Default: False. Disable this only when you know exactly what you’re doing!
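How the default parameter grids translate into a bucket can be sketched in pure Python: one linear kernel plus one kernel per listed value, giving 1 + 3 + 6 + 3 = 13 candidates. The function names and the textbook kernel formulas below are assumptions made for illustration; the library's exact parametrization and its normalization step are not reproduced here.

```python
import math
from functools import partial


def linear(x, y):
    return sum(a * b for a, b in zip(x, y))


def poly(x, y, degree):
    # Assumed form: (1 + <x, y>) ** degree
    return (1.0 + linear(x, y)) ** degree


def gaussian(x, y, sigma):
    # Assumed form: exp(-||x - y||^2 / (2 * sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def laplacian(x, y, gamma):
    # Assumed form: exp(-gamma * ||x - y||_1)
    l1_dist = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-gamma * l1_dist)


def make_bucket(poly_degree_values=(2, 3, 4),
                rbf_sigma_values=(0.03125, 0.125, 0.5, 2, 8, 32),
                laplace_gamma_values=(2, 8, 32)):
    """Return (label, kernel function) pairs, one per parameter value."""
    bucket = [("linear", linear)]   # always included
    bucket += [(f"poly(degree={d})", partial(poly, degree=d))
               for d in poly_degree_values]
    bucket += [(f"gaussian(sigma={s})", partial(gaussian, sigma=s))
               for s in rbf_sigma_values]
    bucket += [(f"laplacian(gamma={g})", partial(laplacian, gamma=g))
               for g in laplace_gamma_values]
    return bucket


def gram(func, sample):
    """Square kernel matrix of func evaluated on all pairs of samples."""
    return [[func(a, b) for b in sample] for a in sample]


sample = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
bucket = make_bucket()
kms = {label: gram(func, sample) for label, func in bucket}
```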
-
add_parametrized_kernels
(kernel_func, param, values)[source]¶ Adds a list of kernels parametrized by various values for a given param
- Parameters
kernel_func (BaseKernelFunction) – Kernel function to be added (not an instance, but callable class)
param (str) – Name of the parameter to the above kernel function
values (Iterable) – List of parameter values. One kernel will be added for each value
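The idea of passing a kernel class (rather than an instance) together with a parameter name and a list of values can be sketched as below. Sigmoid and parametrized_kernels are hypothetical names used only for illustration; the sketch shows the general pattern of instantiating the class once per value, and is not the library's code.

```python
import math


class Sigmoid:
    """Hypothetical callable kernel class with two parameters."""

    def __init__(self, gamma=1.0, offset=0.0):
        self.gamma = gamma
        self.offset = offset

    def __call__(self, x, y):
        dot = sum(a * b for a, b in zip(x, y))
        return math.tanh(self.gamma * dot + self.offset)


def parametrized_kernels(kernel_class, param, values):
    """Instantiate kernel_class once per value of the named parameter."""
    # Other parameters keep the defaults defined by the class
    return [kernel_class(**{param: value}) for value in values]


kernels = parametrized_kernels(Sigmoid, "gamma", (0.1, 1.0, 10.0))
```

Passing the class instead of an instance lets the caller vary one parameter by name while the remaining parameters stay at their defaults.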
-
append
(KM)¶ Method to add a new kernel to the set.
Checks to ensure the new KM is compatible in size to the existing set.
- Parameters
KM (KernelMatrix or ndarray or compatible) – kernel matrix to be appended to the KernelSet
-
attach_to
(sample, name='sample', attr_name=None, attr_value=None)¶ Attach all the kernel matrices in this set to a given sample.
Any previous evaluations to other samples and their results will be reset.
- Parameters
sample (ndarray) – Input sample to operate on. Must be 2D of shape (num_samples, num_features).
name (str) – Identifier for the sample (esp. when multiple are in the same set)
-
extend
(another_km_set)¶ Extends the current set by adding in all elements from another set.
-
get_attr
(name, value_if_not_found=None)¶ Returns the value of a user-defined attribute.
If not set previously, or no match found, returns value_if_not_found.
- Parameters
name (str or hashable) – Name of the attribute to look up.
value_if_not_found (object) – If attribute was not set previously, returns this value
- Returns
attr_values – Values of the attribute from each KM in the set. Or value_if_not_found if attribute is not found.
- Return type
object
-
get_kernel_funcs
(indices)¶ Returns kernel functions underlying the specified kernel matrices in this kernel set.
This is helpful to apply a given set of kernel functions on new sets of data (e.g. test set)
- Parameters
indices (Iterable) – List of indices identifying the kernel matrices to return
- Returns
kf_tuple – Tuple of kernel functions from the selected KMs
- Return type
tuple
-
property
num_samples
¶ Number of samples in each individual kernel matrix
-
set_attr
(name, values)¶ Sets user-defined attributes for the kernel matrices in this set.
If len(values)==1, the same value is set for all. Otherwise, values must be of the same size as the KernelSet, providing a separate value for each element.
Useful for identifying the kernel matrices in various respects. You could think of them as tags or identifiers etc. As they are user-defined, they are ideal to represent user needs and applications.
- Parameters
name (str or hashable) – Name of the attribute.
values (object) – Value of the attribute
-
property
size
¶ Number of kernel matrices in this set
-
take
(indices, name='SelectedKMs')¶ Returns a new KernelSet with requested kernel matrices, identified by their indices.
- Parameters
indices (Iterable) – List of indices identifying the kernel matrices to return
name (str) – Name for the new kernel set.
- Returns
ks – New kernel set with the selected KMs
- Return type
KernelSet