|
Minimum Distance Classifiers
|
||
|
Notations and problem setting |
||
|
(For basic definitions, go to this page.) We have a dataset D containing n sample vectors of dimension d. Some of these sample vectors are labelled, some are not. We now want to classify (label) the unlabeled sample vectors, given the labelled ones. The method will be to compute the mean of each class and measure the distance to the unlabelled vector according to some metric. The class with a mean closest to the vector (with smallest distance) will be the label for that vector. For definitions of the different metrics, go to this page.
|
2D example: given sample vectors with red (circles) and blue (x) labels, what is the label of the green one (cross)? |
|
|
The minimum distance to the classes' means |
||
|
The first step is to compute, for each kind of label Li found in the dataset, the mean of all sample vectors that were labelled with that particular label. Then, the distance between each unlabelled sample vector x we want to classify, and the means of all labelled sample vectors is minimized to give the class. If we would use the Euclidean distance metric for instance, we would find the class L by:
The idea behind this is that the mean should be a representative value for the class, defining usually the centre of all the sample vectors that were labelled as that class in input space.
There are limitations in using the Euclidean distance metric, however, since it is linear in nature:
Using the Mahalanobis distance metric helps in most of these cases.
|
schematic overview of the minimum distance procedure, where m stands for the mean of a certain class
illustrations of the limitations 1 - 5 of linear classifiers |
|
|
The maximum correlation classifier |
||
|
The Euclidean formula can be rewritten as:
so minimizing the first term will be equivalent to maximizing the term between the square brackets, this will lead to a new function: A linear discriminant function is defined as:
we thus take the maximum g value instead of taking the minimum Euclidean distance. This is also called maximizing the correlation (as opposed to the minimizing the distance). An invariant linear discriminant function can be obtained from the Mahalanobis metric if the assumption is made that the covariance matrix is the same for all classes:
The invariant linear discriminant function is then:
Boundaries are now linear, but they are invariant to linear transformations, plus the calculations are simpler as well. |
schematic overview of the maximal correlation procedure, where m stands for the mean of a certain class g can be a linear or invariant linear discriminant function |
|
Compiled by Kristof Van Laerhoven.
Most came straight from the following web pages: