KernelMatrix class
KernelMatrix is a self-contained class for the Gram matrix induced by a kernel function on a given sample. This class defines the central data structure for all kernel methods, as it acts as a key bridge between the input data space and the learning algorithms.
The class is designed such that it:
computes on demand, meaning it only computes elements of the kernel matrix (KM) as needed, and nothing more. This can save a lot of computation and storage for large samples.
supports both callable and attribute access. This allows easy access to partial or random portions of the KM. Indexing the KM aims to be as compliant with that of numpy arrays as possible.
allows parallel computation of different parts of the KM to speed up computation when N is large
allows setting of user-defined attributes for easy identification and differentiation among a collection of KMs in applications such as Multiple Kernel Learning (MKL)
implements basic operations such as centering and normalization (whose implementation differs from that of manipulating regular matrices)
exposes several convenience attributes to make advanced development a breeze
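The on-demand design listed above can be illustrated with a minimal sketch. This is not the library's implementation: `LazyGram`, `linear_kernel`, and the `num_evals` counter are hypothetical names introduced only for illustration, and the sketch assumes a symmetric kernel so that K[i, j] and K[j, i] can share one cached value.

```python
class LazyGram:
    """Hypothetical stand-in for an on-demand Gram matrix (not the library's code)."""

    def __init__(self, kernel, sample):
        self.kernel = kernel
        self.sample = sample
        self._cache = {}     # maps (i, j) with i <= j to a kernel value
        self.num_evals = 0   # counts actual kernel evaluations

    def __getitem__(self, index):
        i, j = index
        # Assumes a symmetric kernel: K[i, j] == K[j, i]
        key = (i, j) if i <= j else (j, i)
        if key not in self._cache:
            self._cache[key] = self.kernel(self.sample[key[0]],
                                           self.sample[key[1]])
            self.num_evals += 1
        return self._cache[key]


def linear_kernel(x, y):
    """Plain dot product, used here only as an example kernel."""
    return sum(a * b for a, b in zip(x, y))


sample = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
km = LazyGram(linear_kernel, sample)
first = km[0, 1]   # computed on first access
again = km[1, 0]   # served from the cache via symmetry
```

Only one of the nine possible elements is ever evaluated here, which is the saving the on-demand design aims for on large samples.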
This library also provides the following convenience wrappers (see below for the API):
KernelMatrixPrecomputed turns a precomputed kernel matrix into a KernelMatrix class with all its attractive properties
ConstantKernelMatrix defines a KernelMatrix with a constant everywhere
custom exception KMAccessError to identify invalid access of the KM
API for KernelMatrix
- class kernelmethods.KernelMatrix(kernel, normalized=True, name='KernelMatrix')
KernelMatrix is a self-contained class for the Gram matrix induced by a kernel function on a sample.
KernelMatrix behaves just like numpy arrays in terms of accessing its elements:
KM[i,j] –> kernel function between samples i and j
KM[set_i,set_j], where len(set_i)=m and len(set_j)=n, returns a matrix of size m x n whose element (i,j) is the kernel between samples set_i[i] and set_j[j]
- Parameters
kernel (BaseKernelFunction) – kernel function that populates the kernel matrix
normalized (bool) – Flag to indicate whether to normalize the kernel matrix. Normalization is recommended, unless you have clear reasons not to.
name (str) – short name to describe the nature of the kernel function
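The set-based indexing KM[set_i, set_j] described for this class can be sketched in plain Python. The helper `sub_matrix` and the example `linear_kernel` are hypothetical names for illustration, not part of the library; the sketch only shows the shape and element convention of the returned block.

```python
def linear_kernel(x, y):
    """Plain dot product, used here only as an example kernel."""
    return sum(a * b for a, b in zip(x, y))


def sub_matrix(kernel, sample, rows, cols):
    """Return the len(rows) x len(cols) block of the Gram matrix.

    Element (a, b) of the block is the kernel between
    sample[rows[a]] and sample[cols[b]].
    """
    return [[kernel(sample[i], sample[j]) for j in cols] for i in rows]


sample = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
set_i = [0, 2]       # m = 2 row indices
set_j = [1, 2, 3]    # n = 3 column indices
block = sub_matrix(linear_kernel, sample, set_i, set_j)
```

The result has m rows and n columns, matching the m x n convention stated above.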
- attach_to(sample_one, name_one='sample', sample_two=None, name_two=None)
Attach this kernel to a given sample.
Any computations from previous samples and their results will be reset, along with all the previously set attributes.
- Parameters
sample_one (ndarray) – Input sample to operate on. Must be a 2D dataset of shape (num_samples, num_features), e.g. MLDataset or ndarray. When sample_two=None (e.g. during training), sample_two refers to sample_one.
name_one (str) – Name for the first sample.
sample_two (ndarray) – Second sample for the kernel matrix, i.e. Y in K(X,Y). Must be a 2D dataset of shape (num_samples, num_features), e.g. MLDataset or ndarray. The dimensionality of this sample (number of columns, sample_two.shape[1]) must match that of sample_one.
name_two (str) – Name for the second sample.
- attributes()
Returns all the attributes currently set.
- Returns
attributes – Dict of all the attributes currently set.
- Return type
dict
- center()
Method to center the kernel matrix.
- Returns
- Return type
None
- Raises
NotImplementedError – If the KM is attached to two separate samples. Centering a KM is possible only when it is attached to a single sample.
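As a rough illustration of what centering a single-sample kernel matrix means, the sketch below applies the standard double-centering formula: subtract the row mean and the column mean from each element, then add back the grand mean. This is an assumption about the general technique, not a copy of the library's code; `center_gram` is a hypothetical helper operating on a dense, square list of lists.

```python
def center_gram(K):
    """Double-center a square Gram matrix given as a list of lists.

    Kc[i][j] = K[i][j] - row_mean[i] - col_mean[j] + grand_mean
    """
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)]
            for i in range(n)]


K = [[2.0, 1.0],
     [1.0, 2.0]]
Kc = center_gram(K)
```

After centering, every row and every column of the result sums to zero, which corresponds to centering the data in the feature space. The formula relies on a square matrix built from one sample, consistent with the single-sample restriction stated above.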
- property centered
Exposes the centered version of the kernel matrix.
- diagonal()
Returns the diagonal of the kernel matrix, when attached to a single sample.
- Raises
ValueError – When this instance is attached to more than one sample
- property frob_norm
Returns the Frobenius norm of the current kernel matrix.
- property full
Fully populated kernel matrix in dense ndarray format.
- property full_sparse
Kernel matrix with only the upper triangle populated, in sparse array format.
- get_attr(attr_name, value_if_not_found=None)
Returns the value of the user-defined attribute.
- Parameters
attr_name (str or hashable) – Name of the attribute to look up.
value_if_not_found (object) – If attribute was not set previously, returns this value
- Returns
attr_value – Value of the attribute if found, or value_if_not_found otherwise.
- Return type
object
- normalize(method='cosine')
Normalize the kernel matrix to have unit diagonal.
Cosine normalization follows the definition in Section 5.1 of Shawe-Taylor and Cristianini, “Kernel Methods for Pattern Analysis”, 2004.
- Parameters
method (str) – Identifier of the method.
- Returns
- Return type
None
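Cosine normalization is commonly defined as dividing each element K[i][j] by sqrt(K[i][i] * K[j][j]), which yields a unit diagonal. The sketch below shows that formula on a small dense matrix. `cosine_normalize` is a hypothetical helper for illustration, not the library's method, and it assumes strictly positive diagonal entries.

```python
import math


def cosine_normalize(K):
    """Return K[i][j] / sqrt(K[i][i] * K[j][j]) for a square matrix K.

    Assumes every diagonal entry is strictly positive.
    """
    n = len(K)
    diag = [K[i][i] for i in range(n)]
    return [[K[i][j] / math.sqrt(diag[i] * diag[j]) for j in range(n)]
            for i in range(n)]


K = [[4.0, 2.0],
     [2.0, 9.0]]
Kn = cosine_normalize(K)
```

The off-diagonal entries of the result can be read as cosines of the angles between the samples in feature space, which is why the diagonal becomes 1.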
- property normed_km
Access to the normalized kernel matrix.
- property num_samples
Returns the number of samples in the sample this kernel is attached to.
This is a scalar when the current instance is attached to a single sample. When it is attached to two samples, i.e. K(X,Y) instead of K(X,X), it is an array of 2 scalars holding num_samples for each of the two samples.
- set_attr(name, value)
Sets user-defined attributes for the kernel matrix.
Useful to identify this kernel matrix in various aspects! You could think of them as tags or identifiers etc. As they are user-defined, they are ideal to represent user needs and applications.
- Parameters
name (str or hashable) – Name of the attribute.
value (object) – Value of the attribute
- property size
Returns the size of the KernelMatrix (total number of elements), as determined by the num_samples from which the kernel matrix is computed. In the single-sample case, it is the num_samples in the dataset. In the two-sample case, it is the product of num_samples from the two datasets.
Defined to correspond to the .size attribute of numpy arrays.
Special Kernel Matrices
- class kernelmethods.base.ConstantKernelMatrix(num_samples, value=0.0, name='Constant', dtype='float')
Bases: object
Custom KernelMatrix (KM) to efficiently represent a constant.
- Parameters
num_samples (int) – Number of samples (size) for this KM
value (float) – Constant value for all elements in this KM
name (str) – Identifier and name for this KM
dtype (dtype) – Data type for the constant value
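The idea of representing a constant matrix efficiently can be sketched as a class that stores only the constant and the number of samples, and materializes the dense matrix only on request. `ConstantGram` below is a hypothetical illustration, not the library's class; in particular, it raises the built-in IndexError on out-of-range access, which is an assumption made to keep the sketch self-contained.

```python
class ConstantGram:
    """Hypothetical constant-valued square matrix that stores one number."""

    def __init__(self, num_samples, value=0.0):
        self.num_samples = num_samples
        self.value = float(value)

    @property
    def shape(self):
        return (self.num_samples, self.num_samples)

    def __getitem__(self, index):
        i, j = index
        n = self.num_samples
        if not (0 <= i < n and 0 <= j < n):
            raise IndexError('index {} out of range for shape {}'
                             ''.format(index, self.shape))
        return self.value   # every valid element is the same constant

    @property
    def diag(self):
        return [self.value] * self.num_samples

    @property
    def full(self):
        # Dense matrix is built only when explicitly requested
        n = self.num_samples
        return [[self.value] * n for _ in range(n)]


ckm = ConstantGram(num_samples=3, value=0.5)
element = ckm[2, 1]
dense = ckm.full
```

Storage stays constant regardless of num_samples until the dense form is requested.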
- property diag
Returns the diagonal of the kernel matrix
- property full
Returns the full kernel matrix (in dense format)
- property shape
Shape of the kernel matrix
- property size
Size of the kernel matrix
- class kernelmethods.base.KernelMatrixPrecomputed(matrix, name=None)
Bases: object
Convenience decorator for kernel matrices in ndarray or simple matrix format.
- property diag
Returns the diagonal of the kernel matrix
- property full
Returns the full kernel matrix (in dense format, as it is already precomputed)
- property size
Size of the kernel matrix
-
property