Collections of kernel matrices

When dealing with multiple kernel matrices, e.g. as part of multiple kernel learning (MKL), a number of validation and sanity checks need to be performed. These include ensuring that the kernel matrices (KMs) are of compatible size and that they were generated from the same sample. We refer to such a collection of KMs as a KernelSet. Moreover, accessing a subset of these KMs, e.g. filtered by some metric, is often necessary when optimizing algorithms like MKL. To serve as candidates for optimization, it is often necessary to sample and generate a large number of KMs, here referred to as a bucket. The KernelSet and KernelBucket classes make these tasks easy and extensible while keeping their rich annotations and structure (metadata etc.). Their API, along with a custom exception, is shown below.

Kernel Set

class kernelmethods.KernelSet(km_list=None, name='KernelSet', num_samples=None)[source]

Bases: object

Container class to manage a set of compatible KernelMatrix instances.

Compatibility is checked based on the size (number of samples they operate on). Provides methods to iterate over the KMs, access a subset and query the underlying kernel funcs.

append(KM)[source]

Method to add a new kernel to the set.

Checks to ensure the new KM is compatible in size to the existing set.

Parameters

KM (KernelMatrix or ndarray or compatible) – kernel matrix to be appended to the KernelSet
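The size check described for append can be illustrated with a toy container. This is a stand-in sketch, not the library's implementation: ToyKernelSet and SizeMismatchError are hypothetical names, and kernel matrices are represented as plain nested lists rather than KernelMatrix or ndarray objects. The library documents its own KMSetAdditionError for invalid additions.

```python
class SizeMismatchError(ValueError):
    """Hypothetical stand-in for an error raised on incompatible additions."""


class ToyKernelSet:
    """Toy container that only accepts square matrices of one common size."""

    def __init__(self, num_samples=None):
        self.num_samples = num_samples  # fixed once the first matrix is added
        self._kms = []

    def append(self, km):
        n_rows = len(km)
        # A kernel matrix over n samples must be n x n.
        if any(len(row) != n_rows for row in km):
            raise SizeMismatchError("kernel matrix must be square")
        if self.num_samples is None:
            self.num_samples = n_rows
        elif n_rows != self.num_samples:
            raise SizeMismatchError(
                f"expected {self.num_samples} samples, got {n_rows}")
        self._kms.append(km)

    @property
    def size(self):
        return len(self._kms)


ks = ToyKernelSet()
ks.append([[1.0, 0.2], [0.2, 1.0]])   # first matrix sets num_samples to 2
ks.append([[1.0, 0.5], [0.5, 1.0]])   # same size, accepted
try:
    ks.append([[1.0]])                # 1 x 1, incompatible with the set
    rejected = False
except SizeMismatchError:
    rejected = True
```

Fixing the size at the first addition means every later matrix is compared against a single reference value, so the check stays cheap however large the set grows.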

attach_to(sample, name='sample', attr_name=None, attr_value=None)[source]

Attach all the kernel matrices in this set to a given sample.

Any previous evaluations to other samples and their results will be reset.

Parameters
  • sample (ndarray) – Input sample to operate on. Must be 2D of shape (num_samples, num_features)

  • name (str) – Identifier for the sample (esp. when multiple are in the same set)

extend(another_km_set)[source]

Extends the current set by adding in all elements from another set.

get_attr(name, value_if_not_found=None)[source]

Returns the value of a user-defined attribute.

If not set previously, or no match found, returns value_if_not_found.

Parameters
  • name (str or hashable) – Name of the attribute to look up

  • value_if_not_found (object) – If attribute was not set previously, returns this value

Returns

attr_values – Values of the attribute from each KM in the set. Or value_if_not_found if attribute is not found.

Return type

object

get_kernel_funcs(indices)[source]

Returns kernel functions underlying the specified kernel matrices in this kernel set.

This is helpful to apply a given set of kernel functions on new sets of data (e.g. test set)

Parameters

indices (Iterable) – List of indices identifying the kernel matrices to return

Returns

kf_tuple – Tuple of kernel functions from the selected KMs

Return type

tuple
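To illustrate why a tuple of kernel functions is useful for new data, the sketch below applies two toy kernel functions to a held-out point against the training points. The functions linear and poly2 and the data are hypothetical plain-Python stand-ins, not the library's kernel classes; in practice the tuple would come from get_kernel_funcs.

```python
def linear(x, y):
    """Toy linear kernel: dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def poly2(x, y):
    """Toy polynomial kernel of degree 2 with an offset of 1."""
    return (1 + linear(x, y)) ** 2


# Stand-in for the tuple returned by get_kernel_funcs
kf_tuple = (linear, poly2)

train = [(1.0, 0.0), (0.0, 1.0)]
test = [(1.0, 1.0)]

# One (num_test x num_train) matrix per kernel function
test_kms = [[[kf(t, s) for s in train] for t in test] for kf in kf_tuple]
```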

property num_samples

Number of samples in each individual kernel matrix

set_attr(name, values)[source]

Sets user-defined attributes for the kernel matrices in this set.

If len(values)==1, the same value is set for all. Otherwise, values must be of the same size as the KernelSet, providing a separate value for each element.

Useful to identify the kernel matrices in various respects. You can think of the attributes as tags or identifiers. As they are user-defined, they can represent whatever your application needs.

Parameters
  • name (str or hashable) – Name of the attribute.

  • values (object) – Value of the attribute
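The one-or-all rule for values can be sketched as a small helper. This is an illustration only: expand_attr_values is a hypothetical function, and raising ValueError on a length mismatch is an assumption, since the behavior in that case is not stated here.

```python
def expand_attr_values(values, set_size):
    """Return one attribute value per kernel matrix (toy version of the rule)."""
    values = list(values)
    if len(values) == 1:
        return values * set_size      # same value for every element
    if len(values) != set_size:
        # Assumed behavior: reject anything that is neither 1 nor set_size long
        raise ValueError(f"need 1 or {set_size} values, got {len(values)}")
    return values


shared = expand_attr_values(["candidate"], 3)
per_km = expand_attr_values([0.91, 0.85, 0.78], 3)
```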

property size

Number of kernel matrices in this set

take(indices, name='SelectedKMs')[source]

Returns a new KernelSet with requested kernel matrices, identified by their indices.

Parameters
  • indices (Iterable) – List of indices identifying the kernel matrices to return

  • name (str) – Name for the new kernel set.

Returns

ks – New kernel set with the selected KMs

Return type

KernelSet
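Filtering by a metric usually reduces to computing a list of indices that can then be passed to take. A minimal sketch of that index selection follows, assuming one score per kernel matrix is already available; top_k_indices and the scores are hypothetical, and how the scores are computed is outside the scope of this sketch.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical quality scores, one per kernel matrix in a set of four
scores = [0.42, 0.91, 0.13, 0.77]
selected = top_k_indices(scores, k=2)
```

The resulting list is an iterable of indices, which is the form the indices parameter expects.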

Kernel Bucket

class kernelmethods.KernelBucket(poly_degree_values=(2, 3, 4), rbf_sigma_values=(0.03125, 0.125, 0.5, 2, 8, 32), laplace_gamma_values=(2, 8, 32), name='KernelBucket', normalize_kernels=True, skip_input_checks=False)[source]

Bases: kernelmethods.base.KernelSet

Class to generate and/or maintain a “bucket” of candidate kernels.

Applications:

  1. to rank/filter/select kernels based on a given sample via many metrics

  2. to be defined.

Note:

  1. The linear kernel is always added during init, whether or not you request it.

  2. In contrast, the Chi^2 kernel is not added to the bucket by default, as it requires positive feature values and may break default use for common applications. You can easily add Chi^2 or any other kernel via the add_parametrized_kernels method.

Parameters
  • poly_degree_values (Iterable) – List of values for the degree parameter of the PolyKernel. One KernelMatrix will be added to the bucket for each value.

  • rbf_sigma_values (Iterable) – List of values for the sigma parameter of the GaussianKernel. One KernelMatrix will be added to the bucket for each value.

  • laplace_gamma_values (Iterable) – List of values for the gamma parameter of the LaplacianKernel. One KernelMatrix will be added to the bucket for each value.

  • name (str) – String to identify the purpose or type of the bucket of kernels. Also helps to easily distinguish it from other buckets.

  • normalize_kernels (bool) – Flag to indicate whether the kernel matrices need to be normalized

  • skip_input_checks (bool) – Flag to indicate whether checks on input data (type, format etc) can be skipped. This helps save a tiny bit of runtime for expert uses when data types and formats are managed thoroughly in numpy. Default: False. Disable this only when you know exactly what you’re doing!
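The default arguments imply a fixed grid of candidate kernels. The sketch below enumerates that grid in plain Python without calling the library, assuming the defaults in the signature above and that nothing beyond the linear kernel is added implicitly. The tuple labels are illustrative, not library identifiers.

```python
# Default values copied from the KernelBucket signature
poly_degree_values = (2, 3, 4)
rbf_sigma_values = (0.03125, 0.125, 0.5, 2, 8, 32)
laplace_gamma_values = (2, 8, 32)

# The linear kernel has no parameter and is always present
candidates = [("linear", None, None)]
candidates += [("poly", "degree", d) for d in poly_degree_values]
candidates += [("gaussian", "sigma", s) for s in rbf_sigma_values]
candidates += [("laplacian", "gamma", g) for g in laplace_gamma_values]

for kind, param, value in candidates:
    print(kind if param is None else f"{kind}({param}={value})")

# The default sigma grid is the odd powers of two from 2**-5 to 2**5
sigma_grid = tuple(2.0 ** e for e in range(-5, 6, 2))
```

Under these assumptions a default bucket holds 13 candidates: 1 linear, 3 polynomial, 6 Gaussian and 3 Laplacian.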

add_parametrized_kernels(kernel_func, param, values)[source]

Adds a list of kernels parametrized by various values for a given param

Parameters
  • kernel_func (BaseKernelFunction) – Kernel function to be added (not an instance, but callable class)

  • param (str) – Name of the parameter to the above kernel function

  • values (Iterable) – List of parameter values. One kernel will be added for each value
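The pattern of passing a callable class plus a parameter name and a list of values can be sketched as follows. ToyGaussianKernel and make_parametrized_kernels are hypothetical stand-ins rather than library code, and the Gaussian formula is one common parametrization that may differ from the library's GaussianKernel.

```python
import math


class ToyGaussianKernel:
    """Hypothetical kernel function class; not the library's GaussianKernel."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def __call__(self, x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-sq_dist / (2 * self.sigma ** 2))


def make_parametrized_kernels(kernel_func, param, values):
    """Instantiate the kernel class once per value of the named parameter."""
    return [kernel_func(**{param: value}) for value in values]


kernels = make_parametrized_kernels(ToyGaussianKernel, "sigma", (0.5, 2, 8))
sigmas = [k.sigma for k in kernels]
# Any Gaussian kernel evaluates to 1.0 for identical points
same_point = [k((1.0, 2.0), (1.0, 2.0)) for k in kernels]
```

Passing the class rather than an instance lets the caller defer construction, so one call can produce a whole family of kernels that differ only in the chosen parameter.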

append(KM)

Method to add a new kernel to the set.

Checks to ensure the new KM is compatible in size to the existing set.

Parameters

KM (KernelMatrix or ndarray or compatible) – kernel matrix to be appended to the KernelSet

attach_to(sample, name='sample', attr_name=None, attr_value=None)

Attach all the kernel matrices in this set to a given sample.

Any previous evaluations to other samples and their results will be reset.

Parameters
  • sample (ndarray) – Input sample to operate on. Must be 2D of shape (num_samples, num_features)

  • name (str) – Identifier for the sample (esp. when multiple are in the same set)

extend(another_km_set)

Extends the current set by adding in all elements from another set.

get_attr(name, value_if_not_found=None)

Returns the value of a user-defined attribute.

If not set previously, or no match found, returns value_if_not_found.

Parameters
  • name (str or hashable) – Name of the attribute to look up

  • value_if_not_found (object) – If attribute was not set previously, returns this value

Returns

attr_values – Values of the attribute from each KM in the set. Or value_if_not_found if attribute is not found.

Return type

object

get_kernel_funcs(indices)

Returns kernel functions underlying the specified kernel matrices in this kernel set.

This is helpful to apply a given set of kernel functions on new sets of data (e.g. test set)

Parameters

indices (Iterable) – List of indices identifying the kernel matrices to return

Returns

kf_tuple – Tuple of kernel functions from the selected KMs

Return type

tuple

property num_samples

Number of samples in each individual kernel matrix

set_attr(name, values)

Sets user-defined attributes for the kernel matrices in this set.

If len(values)==1, the same value is set for all. Otherwise, values must be of the same size as the KernelSet, providing a separate value for each element.

Useful to identify the kernel matrices in various respects. You can think of the attributes as tags or identifiers. As they are user-defined, they can represent whatever your application needs.

Parameters
  • name (str or hashable) – Name of the attribute.

  • values (object) – Value of the attribute

property size

Number of kernel matrices in this set

take(indices, name='SelectedKMs')

Returns a new KernelSet with requested kernel matrices, identified by their indices.

Parameters
  • indices (Iterable) – List of indices identifying the kernel matrices to return

  • name (str) – Name for the new kernel set.

Returns

ks – New kernel set with the selected KMs

Return type

KernelSet

Exceptions

class kernelmethods.KMSetAdditionError[source]

Bases: kernelmethods.config.KernelMethodsException

Exception to indicate invalid addition of kernel matrix to a KernelSet

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.