KernelMatrix class

KernelMatrix is a self-contained class for the Gram matrix induced by a kernel function on a given sample. This class defines the central data structure for all kernel methods, as it acts as a key bridge between the input data space and the learning algorithms.

The class is designed in such a way that

  • computation is on-demand, meaning it only computes elements of the kernel matrix (KM) as needed, and nothing more. This can save a lot of computation and storage for large samples.

  • it supports both callable and attribute-style access. This allows easy access to partial or random portions of the KM. Indexing the KM aims to be as compliant with that of numpy arrays as possible.

  • it allows parallel computation of different parts of the KM to speed up computation when the number of samples N is large

  • allows setting of user-defined attributes to allow easy identification and differentiation among a collection of KMs when working in applications such as Multiple Kernel Learning (MKL)

  • implements basic operations such as centering and normalization (whose implementation differs from that of manipulating regular matrices)

  • it exposes several convenience attributes (such as full, centered, normed_km and frob_norm) that simplify advanced development
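The on-demand behaviour in the first bullet can be illustrated with a minimal sketch. This is not the library's implementation: `LazyGram` and `linear` are hypothetical names, the cache is a plain dict, and the kernel is assumed to be symmetric.

```python
import numpy as np


def linear(x, y):
    """Plain dot-product kernel, used only for illustration."""
    return float(np.dot(x, y))


class LazyGram:
    """Hypothetical lazily evaluated Gram matrix (not the library's class)."""

    def __init__(self, kernel, sample):
        self.kernel = kernel
        self.sample = np.asarray(sample, dtype=float)
        self._cache = {}      # (i, j) with i <= j  ->  kernel value
        self.num_evals = 0    # counts actual kernel evaluations

    def __getitem__(self, index):
        i, j = index
        # A symmetric kernel only needs one triangle of the matrix.
        key = (min(i, j), max(i, j))
        if key not in self._cache:
            self._cache[key] = self.kernel(self.sample[key[0]], self.sample[key[1]])
            self.num_evals += 1
        return self._cache[key]


X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
km = LazyGram(linear, X)
first = km[0, 2]    # evaluated on first access
again = km[2, 0]    # served from the cache by symmetry
```

Only the elements that are actually requested are evaluated and stored, which is where the savings for large samples come from.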


This library also provides the following convenience wrappers (see below for the API):

  • KernelMatrixPrecomputed turns a precomputed kernel matrix into a KernelMatrix class with all its attractive properties

  • ConstantKernelMatrix defines a KernelMatrix with a constant value everywhere

  • KMAccessError is a custom exception that identifies invalid access to the KM

API for KernelMatrix

class kernelmethods.KernelMatrix(kernel, normalized=True, name='KernelMatrix')

KernelMatrix is a self-contained class for the Gram matrix induced by a kernel function on a sample.

KernelMatrix behaves just like numpy arrays in terms of accessing its elements:

KM[i,j] –> kernel function between samples i and j

KM[set_i,set_j], where len(set_i)=m and len(set_j)=n, returns a matrix KM of size m x n, where KM_ij = kernel between samples set_i(i) and set_j(j)
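To make the m x n rule concrete, here is a numpy-only sketch. It assumes a linear kernel and a small dense Gram matrix `K` (both illustrative choices, not part of the library), and selects the block that the indexing rule above describes.

```python
import numpy as np

X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [3.0, 1.0]])
K = X @ X.T                      # dense linear-kernel Gram matrix, shape (4, 4)

set_i = [0, 2]                   # m = 2 row samples
set_j = [1, 2, 3]                # n = 3 column samples

# Entry (a, b) of the block is the kernel between samples set_i[a] and set_j[b].
block = K[np.ix_(set_i, set_j)]
```

Note that plain numpy fancy indexing `K[set_i, set_j]` pairs the indices element-wise, so `np.ix_` is used in this sketch to obtain the outer selection of rows and columns.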

Parameters
  • kernel (BaseKernelFunction) – kernel function that populates the kernel matrix

  • normalized (bool) – Flag to indicate whether to normalize the kernel matrix. Normalization is recommended, unless you have clear reasons not to.

  • name (str) – short name to describe the nature of the kernel function

attach_to(sample_one, name_one='sample', sample_two=None, name_two=None)

Attach this kernel to a given sample.

Any computations from previous samples and their results will be reset, along with all the previously set attributes.

Parameters
  • sample_one (ndarray) – Input sample to operate on. Must be a 2D dataset of shape (num_samples, num_features), e.g. MLDataset or ndarray. When sample_two=None (e.g. during training), sample_two refers to sample_one.

  • name_one (str) – Name for the first sample.

  • sample_two (ndarray) – Second sample for the kernel matrix, i.e. Y in K(X,Y). Must be a 2D dataset of shape (num_samples, num_features), e.g. MLDataset or ndarray. The dimensionality of this sample (number of columns, sample_two.shape[1]) must match that of sample_one.

  • name_two (str) – Name for the second sample.
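The shape requirements above can be sketched with plain numpy. `linear_gram` is a hypothetical helper written for this illustration (it is not part of the library) and assumes a linear kernel; it shows how a missing second sample falls back to the first, and how a feature-count mismatch would be rejected.

```python
import numpy as np


def linear_gram(sample_one, sample_two=None):
    """Hypothetical helper: K(X, Y) for a linear kernel; Y defaults to X."""
    X = np.asarray(sample_one, dtype=float)
    Y = X if sample_two is None else np.asarray(sample_two, dtype=float)
    if X.ndim != 2 or Y.ndim != 2:
        raise ValueError("samples must be 2D: (num_samples, num_features)")
    if X.shape[1] != Y.shape[1]:
        raise ValueError("both samples must have the same number of features")
    return X @ Y.T


train = np.ones((5, 3))
test = np.ones((2, 3))
K_train = linear_gram(train)         # single sample: shape (5, 5)
K_test = linear_gram(test, train)    # two samples:   shape (2, 5)
```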

attributes()

Returns all the attributes currently set.

Returns

attributes – Dict of all the attributes currently set.

Return type

dict

center()

Method to center the kernel matrix

Returns

Return type

None

Raises

NotImplementedError – If the KM is attached to two separate samples. Centering a KM is possible only when it is attached to a single sample.
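Centering a kernel matrix means centering the data in feature space, which is not the same as subtracting the mean of the matrix entries. The textbook formula is K_c = H K H with H = I - (1/n) 1 1^T. The sketch below applies it with numpy; `center_gram` is a hypothetical helper, and whether the library uses exactly this form internally is an assumption. For a linear kernel the result can be checked against explicitly mean-centered features.

```python
import numpy as np


def center_gram(K):
    """Center a square Gram matrix in feature space: H K H, H = I - 11^T / n."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K = X @ X.T                      # linear kernel on a single sample

Xc = X - X.mean(axis=0)          # explicit centering of the features
K_centered = center_gram(K)      # agrees with Xc @ Xc.T up to rounding
```

The formula needs a square matrix over one sample, which is consistent with centering being unavailable for a KM attached to two samples.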

property centered

Exposes the centered version of the kernel matrix

diagonal()

Returns the diagonal of the kernel matrix, when attached to a single sample.

Raises

ValueError – When this instance is attached to more than one sample

property frob_norm

Returns the Frobenius norm of the current kernel matrix

property full

Fully populated kernel matrix in dense ndarray format.

property full_sparse

Kernel matrix populated in the upper triangle, in sparse array format.

get_attr(attr_name, value_if_not_found=None)

Returns the value of the user-defined attribute.

Parameters
  • attr_name (str or hashable) – Name of the attribute.

  • value_if_not_found (object) – If attribute was not set previously, returns this value

Returns

attr_value – Value of the attribute if found. Or value_if_not_found if attribute is not found.

Return type

object

normalize(method='cosine')

Normalize the kernel matrix to have unit diagonal.

Cosine normalization follows the definition in Section 5.1 of Shawe-Taylor and Cristianini, “Kernel Methods for Pattern Analysis”, 2004.

Parameters

method (str) – Identifier of the method.

Returns

Return type

None
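Cosine normalization divides each element by the geometric mean of the two corresponding diagonal entries, K_ij / sqrt(K_ii K_jj), which gives the unit diagonal mentioned above. A minimal numpy sketch follows; `cosine_normalize` is a hypothetical helper rather than the library's method, and it assumes a strictly positive diagonal.

```python
import numpy as np


def cosine_normalize(K):
    """Return K_ij / sqrt(K_ii * K_jj) for a square Gram matrix."""
    K = np.asarray(K, dtype=float)
    d = np.sqrt(np.diag(K))          # assumes every diagonal entry is > 0
    return K / np.outer(d, d)


X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
K = X @ X.T                          # linear kernel, diagonal is 1, 4, 25
K_normed = cosine_normalize(K)       # diagonal becomes all ones
```

With a linear kernel, the normalized entries are the cosines of the angles between the samples, hence the name.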

property normed_km

Access to the normalized kernel matrix.

property num_samples

Returns the number of samples in the sample this kernel is attached to.

This is a scalar when the current instance is attached to a single sample. When it is attached to two samples, i.e. K(X,Y) instead of K(X,X), it is an array of 2 scalars holding the num_samples of each of the two samples.

set_attr(name, value)

Sets user-defined attributes for the kernel matrix.

Useful to identify this kernel matrix in various aspects! You could think of them as tags or identifiers etc. As they are user-defined, they are ideal to represent user needs and applications.

Parameters
  • name (str or hashable) – Name of the attribute.

  • value (object) – Value of the attribute
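The tagging idea can be sketched with a small stand-in class. `Tagged` is hypothetical and only mirrors the set_attr, get_attr and attributes signatures documented on this page; the dict-based storage is an assumption. The usage part shows how tags help pick one matrix out of a collection, as one might in MKL.

```python
class Tagged:
    """Hypothetical stand-in that mirrors the documented attribute interface."""

    def __init__(self):
        self._attr = {}

    def set_attr(self, name, value):
        self._attr[name] = value

    def get_attr(self, attr_name, value_if_not_found=None):
        return self._attr.get(attr_name, value_if_not_found)

    def attributes(self):
        return dict(self._attr)      # copy, so callers cannot mutate the store


kms = []
for degree in (1, 2, 3):
    km = Tagged()
    km.set_attr("kernel", "poly")
    km.set_attr("degree", degree)
    kms.append(km)

# Select by tag, and fall back to a default for a tag that was never set.
quadratic = [km for km in kms if km.get_attr("degree") == 2]
missing = kms[0].get_attr("gamma", value_if_not_found="n/a")
```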

property size

Returns the size of the KernelMatrix (total number of elements), i.e. the num_samples from which the kernel matrix is computed. In the single-sample case, it is the num_samples in the dataset. In the two-sample case, it is the product of the num_samples of the two datasets.

Defining this to correspond to .size attr of numpy arrays

Special Kernel Matrices

class kernelmethods.base.ConstantKernelMatrix(num_samples, value=0.0, name='Constant', dtype='float')

Bases: object

Custom KernelMatrix (KM) to efficiently represent a constant.

Parameters
  • num_samples (int) – Number of samples (size) for this KM

  • value (float) – Constant value for all elements in this KM

  • name (str) – Identifier and name for this KM

  • dtype (dtype) – Data type for the constant value

property diag

Returns the diagonal of the kernel matrix

property full

Returns the full kernel matrix (in dense format)

property shape

Shape of the kernel matrix

property size

Size of kernel matrix
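The efficiency of a constant KM comes from storing a single scalar instead of num_samples x num_samples values. The sketch below illustrates that idea; `ConstantGram` is a hypothetical class written for this page, not the library's ConstantKernelMatrix, and its bounds check is an assumption.

```python
import numpy as np


class ConstantGram:
    """Hypothetical constant Gram matrix that stores one scalar, not N x N values."""

    def __init__(self, num_samples, value=0.0, dtype="float"):
        self.num_samples = num_samples
        self.value = np.dtype(dtype).type(value)

    @property
    def shape(self):
        return (self.num_samples, self.num_samples)

    @property
    def diag(self):
        return np.full(self.num_samples, self.value)

    @property
    def full(self):
        # The dense array is only materialised when explicitly requested.
        return np.full(self.shape, self.value)

    def __getitem__(self, index):
        i, j = index
        n = self.num_samples
        if not (0 <= i < n and 0 <= j < n):
            raise IndexError("index out of range")
        return self.value


ckm = ConstantGram(num_samples=4, value=2.5)
element = ckm[1, 3]      # answered from the stored scalar
dense = ckm.full         # a (4, 4) array, built only on request
```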

class kernelmethods.base.KernelMatrixPrecomputed(matrix, name=None)

Bases: object

Convenience wrapper for precomputed kernel matrices in ndarray or simple matrix format.

property diag

Returns the diagonal of the kernel matrix

property full

Returns the full kernel matrix (in dense format, as it is already precomputed)

property size

Size of kernel matrix

Exceptions

class kernelmethods.KMAccessError

Bases: kernelmethods.config.KernelMethodsException

Exception to indicate invalid access to the kernel matrix elements!

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
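A dedicated exception type lets callers distinguish invalid KM access from other failures. The sketch below redefines both exception classes locally so that it is self-contained; they are stand-ins, not imports from the library. `checked_index` is a hypothetical validator, and treating an out-of-range index as the invalid access is an assumption made for illustration.

```python
class KernelMethodsException(Exception):
    """Local stand-in for the library's base exception."""


class KMAccessError(KernelMethodsException):
    """Local stand-in; signals invalid access to kernel matrix elements."""


def checked_index(i, j, num_samples):
    """Hypothetical validator for a num_samples x num_samples kernel matrix."""
    for idx in (i, j):
        if not 0 <= idx < num_samples:
            raise KMAccessError(f"index {idx} is outside [0, {num_samples})")
    return i, j


try:
    checked_index(0, 7, num_samples=5)
except KernelMethodsException as err:    # catching the base class also works
    message = str(err)
```

Because the access error derives from a common base, code can catch either the specific error or every exception from the package in one clause.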