Visualize classifier decision boundaries in MATLAB

When I needed to plot classifier decision boundaries for my thesis, I decided to do it as simply as possible. Although the decision boundaries between classes can be derived analytically, plotting them for more than two classes gets complicated: you have to find the intersection of the regions assigned to a particular class and then find an expression for that class's boundary. If analytical boundaries are not necessary, a brute-force, computational approach can be used instead. This tutorial takes that approach: the feature space is divided into a grid, each grid cell is classified, and the classified map is shown as an image behind a scatter plot of the training data. This is an application of how to plot over an image background in MATLAB. The result will look something like this (for a city block classifier):

2D decision boundary plotted in MATLAB

Create Data

First, create some random training data for 3 classes. Here, I will use some random Gaussian-distributed points with a different mean for each class. After generating the samples, also compute the class means to use as the class prototype for the classifier distance measure. In this tutorial, city block distance from the class mean will be used as the distance measure, so only the class mean needs to be computed.

% number of training samples to generate.
nsamples = 20;
 
% create some training data for three classes.
training = cell(3,1);
training{1} = randn(nsamples,2) + repmat([-2 -2], [nsamples 1]);
training{2} = randn(nsamples,2) + repmat([2 3], [nsamples 1]);
training{3} = randn(nsamples,2) + repmat([-3 2], [nsamples 1]);
 
% sample mean
sample_means = cell(length(training),1);
 
% compute sample mean to use as the class prototype.
for i=1:length(training),
    sample_means{i} = mean(training{i});
end

Compute Decision Boundaries

The technique that will be used to plot the decision boundaries is to make an image, where each pixel represents a grid cell in the 2D feature space. The image defines a grid over the 2D feature space. The pixels of the image are then classified using the classifier, which will assign a class label to each grid cell. The classified image is then used as a background for a scatter plot that shows the data points of each class.

The advantage of this technique is that because it is classifying grid points in the 2D feature space, decision boundaries can be computed for any arbitrary distance measure. The disadvantage is that there is a computational cost for making very fine decision boundary maps, as you would have to make the grid finer and finer.

The code below sets this up.

% set up the domain over which you want to visualize the decision
% boundary
xrange = [-8 8];
yrange = [-8 8];
% step size for how finely you want to visualize the decision boundary.
inc = 0.1;
 
% generate grid coordinates. this will be the basis of the decision
% boundary visualization.
[x, y] = meshgrid(xrange(1):inc:xrange(2), yrange(1):inc:yrange(2));
 
% size of the (x, y) image, which will also be the size of the 
% decision boundary image that is used as the plot background.
image_size = size(x);
 
xy = [x(:) y(:)]; % make (x,y) pairs as a bunch of row vectors.

In the above code, the domain over which you want to display the decision boundaries is specified by xrange and yrange. The step size (inc) determines how fine the resolution of the decision boundary image is; the smaller you make it, the smoother the final result.

The meshgrid function generates two images, x and y:

The output of meshgrid

As can be seen above, the pixel values in the x image increase from left to right. Each row of x in fact has the value xrange(1):inc:xrange(2) and each column of y has the value yrange(1):inc:yrange(2). Together, the two images returned by meshgrid define a grid in the 2D feature space: taking corresponding pixels and writing them as a coordinate pair gives the location of a grid cell. When you rearrange each image into a column vector and concatenate them (as in the last line of the code above), you get a two-column matrix where each row is an (x, y) coordinate pair of a grid cell in the 2D feature space. Rearranging in this fashion is not strictly necessary, but it makes the following code a bit easier to follow. Also, instead of writing x(:), you can use reshape to do the same thing, but the code will be much longer:

xy = [reshape(x, image_size(1)*image_size(2),1) reshape(y, image_size(1)*image_size(2),1)]
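To make the meshgrid layout described above concrete, here is a tiny grid (the values are chosen only for illustration):

```matlab
% a small 2-by-3 grid makes the meshgrid layout easy to inspect.
[x, y] = meshgrid(0:2, 0:1);
% x = [0 1 2; 0 1 2]   -- each row is 0:2
% y = [0 0 0; 1 1 1]   -- each column is 0:1

% pairing corresponding pixels gives the grid cell coordinates.
xy = [x(:) y(:)]
% xy = [0 0; 0 1; 1 0; 1 1; 2 0; 2 1]
```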

Now all that needs to be done is classify each coordinate pair and then re-arrange the result back into an image. That image will indicate the class label for each pixel. But because this is all derived from the meshgrid result, the label for each image pixel is actually the label for that grid cell in the 2D feature space.

Classification of the xy data is accomplished like so:

numxypairs = size(xy, 1); % number of (x,y) pairs
 
% distance measure evaluations for each (x,y) pair.
dist = [];
 
% loop through each class and calculate distance measure for each (x,y)
% from the class prototype.
for i=1:length(training),
 
    % calculate the city block distance between every (x,y) pair and
    % the sample mean of the class.
    % the sum is over the columns to produce a distance for each (x,y)
    % pair.
    disttemp = sum(abs(xy - repmat(sample_means{i}, [numxypairs 1])), 2);
 
    % concatenate the calculated distances.
    dist = [dist disttemp];
 
end
 
% for each (x,y) pair, find the class that has the smallest distance.
% this will be the min along the 2nd dimension.
[m,idx] = min(dist, [], 2);

The loop calculates the distance measure (in this case, city block distance) from each class mean for each grid cell, and the results are concatenated into an array (dist). Note, though, that within the loop the distances for all coordinate pairs are calculated in one instruction (disttemp). Languages like MATLAB allow this sort of “vectorization”, and it is much faster than looping through each grid cell and calculating the value one at a time. I did not use vectorization for the first maximum likelihood classifier I had to code and it was about 100 times slower than the vectorized version.
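For comparison, a non-vectorized version of the same distance computation would look like the sketch below (the loop variable j and the preallocated disttemp are illustrative; the single vectorized disttemp line is what the tutorial actually uses):

```matlab
% slow, element-by-element equivalent of the vectorized disttemp line.
disttemp = zeros(numxypairs, 1);
for j = 1:numxypairs
    % city block (L1) distance from grid cell j to the class mean.
    disttemp(j) = abs(xy(j,1) - sample_means{i}(1)) + ...
                  abs(xy(j,2) - sample_means{i}(2));
end
```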

Once the distances are calculated for each class, the final line in the code above determines the class to assign to each grid cell by finding the class with the minimum distance. Here, the min function's second output (idx) is used because the returned column indices correspond directly to the class numbers.
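A small example of min's second output: for one grid cell's row of distances (values here are made up for illustration), idx picks the column, and therefore the class, with the smallest value.

```matlab
% distances from one grid cell to classes 1, 2 and 3.
dist_row = [4.2 1.5 3.0];

% min along the 2nd dimension (across columns).
[m, idx] = min(dist_row, [], 2);
% m = 1.5 (smallest distance), idx = 2 (the winning class)
```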

The idx variable contains the label information that is needed, but it is one long column vector. The important thing to note is that each element of idx corresponds exactly to a coordinate pair in xy, and each row of xy corresponds exactly to a pixel coordinate (or raster / image coordinate) in the x and y images, so undoing the rearranging done when xy was created produces an image showing the classifier decision for each grid cell. This is how:

% reshape the idx (which contains the class label) into an image.
decisionmap = reshape(idx, image_size);

Show Decision Boundaries

Once the image is created, it can simply be shown:

figure;
 
%show the image
imagesc(xrange,yrange,decisionmap);
hold on;
set(gca,'ydir','normal');
 
% colormap for the classes:
% class 1 = light red, 2 = light green, 3 = light blue
cmap = [1 0.8 0.8; 0.95 1 0.95; 0.9 0.9 1];
colormap(cmap);
 
% plot the class training data.
plot(training{1}(:,1),training{1}(:,2), 'r.');
plot(training{2}(:,1),training{2}(:,2), 'go');
plot(training{3}(:,1),training{3}(:,2), 'b*');
 
% include legend
legend('Class 1', 'Class 2', 'Class 3','Location','NorthOutside', ...
    'Orientation', 'horizontal');
 
% label the axes.
xlabel('x');
ylabel('y');

This will then get you the decision boundary plot:

2D decision boundary plotted in MATLAB

The three lines that show the image are an interesting example of how to plot over an image background. If you have read that tutorial, you will probably notice that it is actually one of the “wrong” examples (Example 2), in that the image is supposed to come out flipped. To understand why the image is not flipped in this case, take a look at these lines from the code above:

imagesc(xrange,yrange,decisionmap);
hold on;
set(gca,'ydir','normal');

According to the other post, this should produce a flipped image, and yet this decision boundary is shown correctly. The reason is that meshgrid generated x and y such that increasing raster row coordinates correspond to increasing Y coordinates of the grid: the first raster row holds the smallest Y value. By default, MATLAB displays images with the first row at the top, so all that's needed is to set the Y-axis direction to “normal” to put the smallest Y at the bottom of the plot. The decision boundary image then shows correctly without any of the flipping needed in the other tutorial. In that tutorial, the image has to be displayed with the first raster row at the top (since it depicts a meaningful scene); here, the first raster row simply needs to appear at the smallest Y value on the axes.

1D Decision Boundaries

This same technique adapts to 1D decision boundaries too: create a 1-pixel-high image, classify it, and show it as a background in a plot of your probability density function or histogram.
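As a sketch of the 1D case (the variable names xrange1d, dist1d and idx1d are illustrative, and the class means from earlier are reused, with only their first coordinate considered):

```matlab
% 1D grid over the x-axis only.
xrange1d = -8:0.1:8;

% classify each 1D grid cell by city block (absolute) distance to the
% first coordinate of each class mean.
dist1d = [];
for i = 1:length(sample_means)
    dist1d = [dist1d abs(xrange1d(:) - sample_means{i}(1))];
end
[m1d, idx1d] = min(dist1d, [], 2);

% show the labels as a 1-pixel-high background image.
figure;
imagesc(xrange1d([1 end]), [0 1], idx1d');
hold on;
set(gca, 'ydir', 'normal');
colormap([1 0.8 0.8; 0.95 1 0.95; 0.9 0.9 1]);
% plot your histogram or density estimate on top of this background.
```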

Discussion

ayman, 2011/12/05 02:59
please i have a problem with adaptive decision boundary using matlab

Write a matlab program to implement the adaptive decision boundary algorithm with c=k=1. Assume two classes, N training patterns from each class and M features for each pattern
Peter Yu, 2011/12/05 19:24
ayman - It would be best to ask your instructor for help on your problem.
ayman, 2011/12/06 17:36
i have no instructor
and i have self study education
thank u
i swear that i have no instructor
please help me if u know
Andrej, 2012/03/31 10:05
Very warm thanks for writing this up!
devsjee, 2014/09/21 10:59
Hello Sir..

This explanation is very logical. Thank you.

However, computing the classes for all grid points is a time-consuming operation, particularly when we choose a step size on the order of 0.1. I've been trying to find ways of plotting the decision boundaries among 4 classes. The above method is the first working method that helped me grasp the idea. I'm thinking about using linear regression to plot the discriminant functions as an alternative way to check if plotting time can be reduced. Kindly guide me on the other analytical methods that could possibly be explored in this regard.

Thanks again for the neat explanation.
Alaa Tharwat, 2014/11/05 03:48
Really, Many thanks.

This is very good code and explanation.

Thanks again and again.

Best regards,

Note: I want to see more explanation related to machine learning.

Alaa

About Peter Yu I am a research and development professional with expertise in the areas of image processing, remote sensing and computer vision. I received BASc and MASc degrees in Systems Design Engineering at the University of Waterloo. My working experience covers industries ranging from district energy to medical imaging to cinematic visual effects. I like to dabble in 3D artwork, I enjoy cycling recreationally and I am interested in sustainable technology. More about me...

Feel free to contact me with any questions about this site at [user]@[host] where [user]=web and [host]=peteryu.ca

Copyright © 1997 - 2019 Peter Yu