Homepage of KerMet-Tools
Please send any comments, problem reports, suggestions, etc. to
KerMet-Tools V1.2 March 16, 2009
Kernel-Method MATLAB Tools
Copyright (C) 2005-2009 Bernard Haasdonk
This toolbox provides MATLAB routines for pattern analysis with
kernel methods. Excellent textbooks on this topic comprise
[Sc:Sm:02], [Sh:Cr:04] or [He:02]. The implemented kernel methods include
SVM, KPCA, kernel nearest-neighbour, kernel perceptron, enclosing
hypersphere algorithm, kernel fisher discriminant and kernel-nearest
Additionally it provides different kernel functions which incorporate
invariance knowledge as presented in [Ha:Vo:Bu:05], [Ha:Ke:02] and [Ha:Bu:07].
The use of the functions is exemplified by the main demo KerMet_Toy.
This is the release of KerMet-Tools version 1.2 on March 4, 2009.
The toolbox was tested on Linux and Windows XP using
MATLAB versions R2007a and R2008a.
This library is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
README.txt : This file
gpl.txt : Gnu Public License
BUGLIST.txt : list of known bugs
MATLAB distance matrices:
EDsq_matrix.m : Euclidean distance matrix
IDsq_matrix.m : Regularized invariant distance matrix, cf. [Ha:Bu:07]
TDsq_matrix.m : Tangent distance matrix, cf. [Ha:Ke:02]
MATLAB kernel matrices:
rbf_kernel_matrix.m : Gaussian rbf kernel matrix (pd)
tria_kernel_matrix.m : triangular kernel kernel matrix (cpd)
linear_kernel_matrix.m : linear kernel matrix (pd)
poly_kernel_matrix.m : polynomial kernel matrix (pd)
pseudo_euclidean_kernel_matrix.m : pseudo-Euclidean inner product(indefinite)
rbf_DS_kernel_matrix.m : Gaussian rbf Distance Substitution matrix
tria_DS_kernel_matrix.m : triangular kernel Distance Substitution
linear_DS_kernel_matrix.m : linear kernel Distance Substitution
poly_DS_kernel_matrix.m : polynomial kernel Distance Substitution
HI_kernel_matrix.m : Haar-Integration kernel matrix
virtual_samples_kernel_matrix.m : kernel matrix of multiplied dataset
MATLAB kernel methods:
predict_kernel_nearest_neighbour.m : prediction of kernel k-nearest-neighbour
predict_two_class_C_SVM.m : training and prediction of binary SVM
with C-penalty parameter
predict_kernel_PCA.m : principal component decomposition and
projection in kernel space
predict_kernel_perceptron.m : training and prediction with the kernel
perceptron, cf. [He:02]
predict_nu_hypersphere.m : Training and prediction of the minimum
enclosing hypersphere algorithm in the
"nu" penalty parameter version
predict_kernel_fisher_discriminant.m : Training and Prediction of the kernel
predict_nearest_mean_classifier.m : Training and Prediction of the
nearest mean classifier in kernel space,
KerMet_Toy.m : function starting the kernel method demo
KerMet_Toy.fig : GUI layout of the demo
KerMet_Toy.txt : Help-text invoked and displayed by the demo
compute_TD1sq.m : computation of one-sided tangent distance
compute_tangents.m : computation of tangents to transformations
transform_vectors.m : transformation of patterns
normalize.m : normalization of vectors
orthonormalize.m : orthonormalization of vectors
arrow.m : routine for plotting an arrow with patch-head
Detailed description of the functions can be obtained by invoking the
MATLAB help command on the single files. An illustrated guide for the
demo KerMet_Toy can be found at
Users are requested to kindly help us by providing the feedback about errors
occured while usage. Also suggestions for modifications are very appreciated!
Bernard Haasdonk, firstname.lastname@example.org
Institute of Applied Analysis and Numerical Simulation
University of Stuttgart
[Ha:Ke:02] Haasdonk, B., Keysers, D., Tangent Distance Kernels for Support
Vector Machines. ICPR 2002, International Conference on Pattern Recognition,
[Ha:Ba:Bu:04] Haasdonk, B., Bahlmann, C. Learning with Distance Substitution
Kernels. Pattern Recognition - Proc. of the 26th DAGM Symposium, Tï¿½bingen,
Germany, August/September 2004. Springer Berlin, 2004.
[Ha:Bu:07] Haasdonk, B. and Burkhardt, H. Invariant Kernels for Pattern
Analysis and Machine Learning. Machine Learning, 68:35-61, 2007.
[Ha:Vo:Bu:05] Haasdonk, B., Vossen, A. and Burkhardt, H., Invariance in
Kernel Methods by Haar-Integration Kernels. SCIA 2005, Scandinavian
Conference on Image Analysis, pp. 841-851, Springer-Verlag, 2005.
[He:02] Herbrich, R. Learning Kernel Classifiers. MIT Press, 2002.
[Mi:Ra:We:Sc:Mu:99] Mika, S., Raetsch, G., Weston, J., Schoelkopf, B.
and Mueller, K.-R., Fisher discriminant analysis with kernels. In Neural
Networks for Signal Processing, pp. 41-48, 1999
[Sc:Sm:02] Schoelkopf, B. and Smola, A. Learning with Kernels: Support Vector
Machines, Regularization, Optimization and Beyond. MIT Press, 2002.
[Sh:Cr:04] Shawe-Taylor, J. and Cristianini, N. Kernel Methods for Pattern
Analysis. Cambridge University Press, 2004.
This GUI enables interactive exploration of various kernel methods, kernel
functions and invariant kernel types. The kernels, kernel-methods etc. are
encapsulated in separate m-files for use as a general "kernel method
library". Therefore, see the corresponding files for detailed help on the
The main interactives in the GUI are grouped as follows:
Two-class data can be set by choosing the corresponding class ("Set Class
+1","Set Class -1") and klicking points in the plot frame. Generated point
distributions can be cleared ("Clear"), stored in files ("save") and
recovered ("load"). If the kernel method does not use class information,
the points are plotted in black, otherwise blue/red discrimination is
Here either 2D or 3D plot-mode can be choosen. The important parameter
"Grid Resolution" should be set adapted to the system speed. In particular
for expensive kernel methods (kernel-k-nn, KPCA) this should be lowered
during parameter tuning. Alternatively, the automatic replotting mode can
be switched off by unchecking "Immediate Plot" and pressing the appearing
"Replot" button if needed. The "Help" button displays this help text.
Base Kernel Type:
Here the base kernel and its parameters can be set. In particular these
comprise the standard linear, polynomial and Gaussian rbf-kernels. Also
non-pd kernels are included, e.g. the negative distance kernel, which is
conditionally pd (for 0 <= beta<= 2) and the pseudo-Euclidean inner
product, which is the prototype of a general indefinite inner-product.
Here the kernel method and its parameters can be set. Each method mainly
relies on two functions train_* and predict_*, where * replaces the
1. Single kernel:
The first data point is taken as fixed argument in the specified kernel
function. The second point is varying over the 2D-plane, the kernel values
are computed and color coded. This function is very well suited for simply
visualizing single kernels and their behaviour.
2. Kernel k-nearest-neighbour:
The k-nearest-neighbour algorithm for classification only relies on
distances. After defining a kernel function, the distances in the induced
feature-space can be computed from the kernel matrix. Correspondingy, the
kernelized version of distance based classifiers such as
k-nearest-neighbours can be performed reflecting certain
base-kernel-properties. The size of the neighbourhood "k" can be chosen.
The output values are the certainty, i.e. the ratio of the
class-frequencies in the k-neighbourhood of each point.
3. Support vector machine (SVM):
The two-class C-SVM for classification is implemented. Correspondingly,
the soft-margin penalization parameter "C" can be set. The output is the
decision value and plotting of the margin.
4. Kernel perceptron:
The well known perceptron for two-class classification only relies on
inner products, so a kernelized version is straightforward. The number of
update steps can be specified and the preliminary solution is plotted. By
the ">>" and "<
The usual PCA can only extract two linear features (= projections on the
main variance directions) on two-dimensional data. The kernel PCA is much
more powerful: For n data points, n features can be constructed, which are
arbitrary nonlinear or invariant, etc. depending on the chosen kernel. The
components are sorted with decreasing variance. The component on which to
project can be chosen by the "<>" buttons and the edit field. For
indefinite kernels, the valuably negative variance components will be
located at the end of the spectrum.
6. Enclosing hypersphere:
For clustering or one-class classification, an enclosing sphere can be
defined for given data. A kernelized version as given in Shawe-Taylor &
Cristianini is implemented here. The parameter "nu", which is related to
the ratio of support vectors and bounded support vectors must be chosen.
For details on these kernel methods, refer to
Schoelkopf, B. and Smola, A. Learning with Kernels: Support Vector
Machines, Regularization, Optimization and Beyond. MIT Press, 2002.
Shawe-Taylor, J. and Cristianini, N. Kernel Methods for Pattern Analysis.
Cambridge University Press, 2004.
This field enables defining the pattern-variations, which should be
incorporated into kernels as invariances. This means, simple 2D-point
transformations are defined here, which define a set of "equivalent"
points for each point in the plane. A popupmenu enables to define
continuous transformations comprising
* simple translation in x, y and a combined x/y diagonal direction
* centered operations as rotation and scaling. The corresponding
reference point can be set by the appropriate button "Set Trafo Center"
* highly nonlinear transformation of shift along a sinusoidal curve.
* combined transformations: x/y-shift and rotation.
A checkbox "reflection" enables independent combination with a discrete
transformation, the reflection along a vertical axis, which is defined to
pass through the transformation center. The checkbox "Plot
Transformations" overlays the kernel plot with the transformations of all
Here the type of invariant kernel can be set.
1. Transformation-Integration Kernels (TI-Kernels):
The technique of integration over transformation groups or subsets thereof
leads to the TI-kernels. In general the kernels are defined for infinite
sets of transformations. However for computational feasibility, the
kernels here are computed by finite sampling of the transformation sets.
The number of trafo-evaluations can be adjusted by "trafo-evals". More
details on these kernels can be found in
Haasdonk, B., Vossen, A. and Burkhardt, H., Invariance in Kernel Methods
by Haar-Integration Kernels. SCIA 2005, Scandinavian Conference on Image
Analysis, pp. 841-851, Springer-Verlag, 2005.
2. Invariant Distance Substitution Kernels (IDS-Kernels):
By computing the distance between the sets of transformed patterns, these
distances can be substituted in ordinary distance- and inner-product-based
kernels. As with the HI-kernels, the sampling of the transformation sets
can be specified. Additionally, a "origin" must be chosen in case of
inner-product base-kernels. If no precise invariance is wanted, the
parameter "Regularization lambda" interpolates between the precise
invariant case (lambda=0) and the ordinary non-invariant base-kernel
(lambda=Inf). More on these kernels can be found in
Haasdonk, B. and Burkhardt, H. Invariant Kernels for Pattern Analysis
and Machine Learning.
Machine Learning, 68:35-61, 2007.
3. Tangent Distance Kernels (TD-Kernels):
By approximating the continuous transformation manifolds (so discrete
transformations are excluded here) with linear spaces, the distance
computation can be performed very efficiently. Again the inner-product
based kernels require a choice of an "DS Origin" for the distance
substitution. Again the invariance degree can be adjusted by a
regularization parameter. 3 different symmetric versions of tangent
distance are implemented, the two-sided (2S), the mean of two one-sided
(MN) and the midpoint tangent distance (MP). Additionally the
non-symmetric one-sided distance is available, which however has to be
taken with care in kernel methods, as the kernel matrices are not
symmetric. The tangents can be "normalized" before tangent distance
computation, and plotting of the tangents can be switched on/off by "Plot
Tangents". More on these kernels can be found in
Haasdonk, B., Keysers, D., Tangent Distance Kernels for Support Vector
Machines. ICPR 2002, International Conference on Pattern Recognition,
(C) Bernard Haasdonk, 2005-2009
Last modified on Mon Mar 16 17:00 2009 by B. Haasdonk