Utilities
Here, we document several important utilities related to this library, including sampling and ranking.
Sampling
-
kernelmethods.sampling.correlation_km(k1, k2)
Computes the Pearson correlation coefficient between two kernel matrices.
- Parameters
k1, k2 – Two kernel matrices of the same size
- Returns
corr_coef – Correlation coefficient between the vectorized kernel matrices
- Return type
float
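The idea of correlating two vectorized kernel matrices can be illustrated with plain NumPy. This is only a minimal sketch under stated assumptions: `kernel_correlation` is a hypothetical helper written for this example, not the library's implementation, and it assumes both inputs are dense square arrays of the same shape.

```python
import numpy as np

def kernel_correlation(k1, k2):
    """Pearson correlation between two same-sized kernel matrices (hypothetical helper)."""
    k1 = np.asarray(k1, dtype=float)
    k2 = np.asarray(k2, dtype=float)
    if k1.shape != k2.shape:
        raise ValueError("kernel matrices must have the same shape")
    # Flatten each matrix into a vector, then read the off-diagonal entry
    # of the 2x2 correlation matrix returned by np.corrcoef.
    return float(np.corrcoef(k1.ravel(), k2.ravel())[0, 1])

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
linear = X @ X.T  # linear kernel on a random sample

same = kernel_correlation(linear, linear)            # a matrix compared with itself
scaled = kernel_correlation(linear, 2 * linear + 1)  # positive affine transform
flipped = kernel_correlation(linear, -linear)        # sign-flipped copy
print(same, scaled, flipped)
```

Because the Pearson coefficient is invariant to positive scaling and shifting, the first two values are numerically close to 1, while the sign-flipped copy gives a value close to -1.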
-
kernelmethods.sampling.ideal_kernel(targets)
Computes the kernel matrix from the given target labels.
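A common definition of a kernel built from target labels is the outer product of the label vector with itself. The sketch below assumes that definition; whether `ideal_kernel` uses exactly this form is an assumption, and `ideal_kernel_sketch` is a hypothetical name used only here.

```python
import numpy as np

def ideal_kernel_sketch(targets):
    """Outer product y y^T of the target vector (assumed definition, hypothetical helper)."""
    y = np.asarray(targets, dtype=float).reshape(-1, 1)  # column vector
    return y @ y.T

labels = [1, -1, 1, -1]
K = ideal_kernel_sketch(labels)
print(K)
```

With labels coded as +1 and -1, an entry of `K` is +1 when two samples share a class and -1 when they do not.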
-
kernelmethods.sampling.make_kernel_bucket(strategy='exhaustive', normalize_kernels=True, skip_input_checks=False)
Generates a bucket of candidate kernels based on user preferences.
- Parameters
strategy (str) – Name of the strategy for populating the kernel bucket. Options: ‘exhaustive’ and ‘light’. Default: ‘exhaustive’
normalize_kernels (bool) – Flag to indicate whether to normalize the kernel matrices
skip_input_checks (bool) – Flag to indicate whether checks on input data (type, format, etc.) can be skipped. This saves a small amount of runtime for expert users whose data types and formats are already managed carefully in numpy. Default: False. Skip the checks only when you know exactly what you’re doing!
- Returns
kb – Kernel bucket populated according to the requested strategy
- Return type
KernelBucket
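To make the notion of a collection of candidate kernels with optional normalization concrete, here is a NumPy-only sketch. It does not use or reproduce the library's KernelBucket: the dictionary container, the three kernels chosen, their parameters, and the cosine-style normalization K_ij / sqrt(K_ii * K_jj) are all assumptions made for illustration.

```python
import numpy as np

def normalize_kernel(K):
    """Scale K so that every diagonal entry becomes 1 (assumed normalization)."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def make_bucket_sketch(X, normalize=True):
    """Build a small dict of candidate kernel matrices (hypothetical stand-in for a bucket)."""
    gram = X @ X.T
    sq_norms = np.diag(gram)
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2 * gram, 0.0)
    bucket = {
        "linear": gram,
        "poly_degree2": (gram + 1.0) ** 2,
        "rbf_gamma0.5": np.exp(-0.5 * sq_dists),
    }
    if normalize:
        bucket = {name: normalize_kernel(K) for name, K in bucket.items()}
    return bucket

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
bucket = make_bucket_sketch(X)
for name, K in bucket.items():
    print(name, K.shape)
```

Normalizing puts all candidates on a comparable scale, which matters when their similarity or alignment is compared later.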
-
kernelmethods.sampling.pairwise_similarity(k_bucket, metric='corr')
Computes the similarity between all pairs of kernel matrices in a given bucket.
- Parameters
k_bucket (KernelBucket) – Container of length num_km, with each element an instance of KernelMatrix
metric (str) – Identifies the metric to be used. Options: corr (correlation coefficient) and align (centered alignment).
- Returns
pairwise_metric – A symmetric matrix containing the pairwise similarity between the various kernel matrices
- Return type
ndarray of shape (num_km, num_km)
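The two metrics named above can be sketched over a plain list of NumPy arrays. This is an illustration under assumptions, not the library's code: `pairwise_similarity_sketch` and `centered_alignment` are hypothetical helpers, and centered alignment is taken here to be the normalized Frobenius inner product of the two kernels after double centering.

```python
import numpy as np

def pearson(k1, k2):
    """Pearson correlation between the vectorized matrices."""
    return float(np.corrcoef(k1.ravel(), k2.ravel())[0, 1])

def centered_alignment(k1, k2):
    """Frobenius inner product of the centered kernels, normalized by their norms."""
    n = k1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k1c, k2c = H @ k1 @ H, H @ k2 @ H
    return float(np.sum(k1c * k2c) / (np.linalg.norm(k1c) * np.linalg.norm(k2c)))

def pairwise_similarity_sketch(kernels, metric="corr"):
    """Symmetric (num_km, num_km) matrix of similarities between kernel matrices."""
    metrics = {"corr": pearson, "align": centered_alignment}
    if metric not in metrics:
        raise ValueError(f"unknown metric: {metric!r}")
    sim = metrics[metric]
    num_km = len(kernels)
    out = np.empty((num_km, num_km))
    for i in range(num_km):
        for j in range(i, num_km):
            out[i, j] = out[j, i] = sim(kernels[i], kernels[j])
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 4))
gram = X @ X.T
kernels = [gram, (gram + 1.0) ** 2, np.tanh(0.1 * gram)]

corr_matrix = pairwise_similarity_sketch(kernels, metric="corr")
align_matrix = pairwise_similarity_sketch(kernels, metric="align")
print(np.round(corr_matrix, 3))
print(np.round(align_matrix, 3))
```

Only the upper triangle is computed and then mirrored, since both metrics are symmetric in their arguments.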
Ranking
Module gathering techniques and helpers to rank kernels using various methods and metrics, such as
their target alignment,
performance in cross-validation
-
kernelmethods.ranking.CV_ranking(kernel_bucket, targets, num_folds=3, estimator_name='SVM')
Ranks kernels by their performance measured via cross-validation (CV).
- Parameters
kernel_bucket (KernelBucket) –
targets (Iterable) – target values of the sample attached to the bucket
num_folds (int) – Number of folds for the CV to be employed
estimator_name (str) – Name of a valid Scikit-Learn estimator. Default: SVM
- Returns
scores – CV performance computed for the kernel matrices in the bucket
- Return type
ndarray
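Ranking kernels by cross-validated performance can be sketched directly with scikit-learn and precomputed kernel matrices. The example below is an assumption-laden illustration rather than the library's procedure: the toy data, the two candidate kernels, and the use of `cross_val_score` with `SVC(kernel="precomputed")` are choices made for this sketch.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy two-class sample: two Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(30, 2)),
               rng.normal(1.0, 1.0, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

# Candidate kernel matrices computed on the full sample.
gram = X @ X.T
sq_norms = np.diag(gram)
sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2 * gram, 0.0)
kernels = {"linear": gram, "rbf": np.exp(-0.5 * sq_dists)}

# Mean 3-fold accuracy per kernel; scikit-learn slices the precomputed
# matrix into train/test blocks for estimators that accept it.
scores = {}
for name, K in kernels.items():
    clf = SVC(kernel="precomputed")
    scores[name] = float(cross_val_score(clf, K, y, cv=3).mean())

ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

Higher mean accuracy ranks first here; for a regression estimator the scoring function and its direction would need to be chosen accordingly.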
-
kernelmethods.ranking.alignment_ranking(kernel_bucket, targets, **method_params)
Ranks kernels based on their target alignment.
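Target alignment is often measured as the normalized Frobenius inner product between a kernel matrix and the label outer product. The sketch below assumes that uncentered form and +1/-1 labels; the exact metric used by `alignment_ranking` may differ, and `target_alignment` is a hypothetical helper.

```python
import numpy as np

def target_alignment(K, targets):
    """Normalized Frobenius inner product between K and y y^T (assumed metric)."""
    y = np.asarray(targets, dtype=float).reshape(-1, 1)
    ideal = y @ y.T
    return float(np.sum(K * ideal) / (np.linalg.norm(K) * np.linalg.norm(ideal)))

y = np.array([1, 1, 1, -1, -1, -1])
kernels = {
    "identity": np.eye(6),                  # carries no class structure
    "constant": np.ones((6, 6)),            # treats all samples alike
    "ideal": np.outer(y, y).astype(float),  # matches the labels exactly
}

scores = {name: target_alignment(K, y) for name, K in kernels.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

The kernel that reproduces the label structure scores highest, and the constant kernel scores zero because the balanced labels sum to zero. Picking the first entry of `ranking` is the simplest form of selecting an optimal kernel from the candidates.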
-
kernelmethods.ranking.find_optimal_kernel(kernel_bucket, sample, targets, method='align/corr', **method_params)
Finds the optimal kernel for the current sample given its labels.
- Parameters
kernel_bucket (KernelBucket) – The collection of kernels to evaluate and rank
sample (ndarray) – The dataset on which the given kernel bucket is to be evaluated
targets (ndarray) – Target labels for each point in the sample dataset
method (str) – Identifier for the metric used to rank the kernels
- Returns
km – Instance of KernelMatrix with the optimal kernel function
- Return type
KernelMatrix
-
kernelmethods.ranking.get_estimator(learner_id='svm')
Returns a valid kernel machine to become the base learner of the MKL methods.
Base learner must be able to accept a precomputed kernel for fit/predict methods!
- Parameters
learner_id (str) – Identifier for the estimator to be chosen. Options: SVM and SVR. Default: SVM
- Returns
base_learner (Estimator) – An sklearn estimator
param_grid (dict) – Parameter grid (sklearn format) for the chosen estimator.
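The requirement that the base learner accept a precomputed kernel can be illustrated with scikit-learn's `SVC` and `SVR`. The following is a hedged sketch, not the library's function: `get_estimator_sketch` is a hypothetical name, and the parameter grids are arbitrary values picked for the example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC, SVR

def get_estimator_sketch(learner_id="svm"):
    """Return a precomputed-kernel estimator and a parameter grid (illustrative values)."""
    learner_id = learner_id.lower()
    if learner_id == "svm":
        base_learner = SVC(kernel="precomputed")
        param_grid = {"C": [0.1, 1.0, 10.0]}
    elif learner_id == "svr":
        base_learner = SVR(kernel="precomputed")
        param_grid = {"C": [0.1, 1.0, 10.0], "epsilon": [0.01, 0.1]}
    else:
        raise ValueError(f"unknown learner_id: {learner_id!r}")
    return base_learner, param_grid

# Usage: tune C on a precomputed linear kernel for a toy two-class sample.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(30, 2)),
               rng.normal(1.0, 1.0, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)
K = X @ X.T

learner, grid = get_estimator_sketch("SVM")
search = GridSearchCV(learner, grid, cv=3).fit(K, y)
print(search.best_params_)
```

Returning the grid in scikit-learn's dictionary format lets it be passed to `GridSearchCV` without conversion.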
-
kernelmethods.ranking.rank_kernels(kernel_bucket, targets, method='align/corr', **method_params)
Computes a given ranking metric for all the kernel matrices in the bucket.
Choices for the method include: “align/corr”, “cv_risk”
- Parameters
kernel_bucket (KernelBucket) –
targets (Iterable) – target values of the sample attached to the bucket
method (str) – Identifies one of the metrics: align/corr, cv_risk
method_params (dict) – Additional parameters to be passed on to the method chosen above.
- Returns
scores – Values of the ranking metrics computed for the kernel matrices in the bucket
- Return type
ndarray