Lets Learn together... Happy Reading

" Two roads diverged in a wood, and I,
I took the one less traveled by,
And that has made all the difference "-Robert Frost

K means clustering on RGB image

                        I assume the readers of this post have enough knowledge on K means clustering method and it’s not going to take much of your time to revisit it again.


Let’s start with a simple example, consider a RGB image as shown below.


Let’s choose the number of clusters = 2. Basically we are going to separate the background (first cluster) and the flower (second cluster).

In the second step, let’s choose two random RGB pixel values.

Since the initial pixel values are completely random, we can clearly see that both the initial cluster points are on the flower and not on the background.
The pixel value of the cluster 1 (R1,G1,B1)=[255,191,59]
The pixel value of the cluster 2 (R2,G2,B2) = [255,245,11]

In the third step, find the Euclidean distance between the initial points and all the pixels in the image matrix.
We will see how to calculate the Euclidean distance between the initial points and the pixel value at (1,1) . The pixel value at (1,1) is [24,64,186]





Repeat the above procedure for all the pixel values.

The next step is finding the minimum value among the two clusters and fetch their corresponding cluster index. With reference to the pixel position at (1,1), the minimum value is 292.6072 and it belongs to the cluster 1. So update the matrix ‘ClusterMap’ with 1 at the position (1,1).
Similarly, find the index of the minimum value and update the ‘ClusterMap’ matrix for all the pixel positions.



The visual analysis of the first iteration shows that the clustering is not good and it still needs to be improved.
It’s time to look for new cluster points. Let’s see how to find new cluster points to perform better segmentation.
In order to obtain the new cluster points, compute the mean of the pixel values with respect to each cluster. Once we obtain the mean value for the Red, Green and Blue channels separately, find the difference between the new values and the current values.
In our chosen example,
The new values for the Red, Green and Blue channels are (R1_new,G1_new,B1_new)=[ 93.2942  106.6650  143.5914] for the first cluster
And (R2_new,G2_new,B2_new)=[ 252.1034  224.2628    3.8519] for the second cluster

The difference between the new and the current values with respect to different color channels are
[  161.7058   84.3350   84.5914] for first cluster (R1-R1_new,G1-G1_new,B1-B1_new)
 [  2.8966   20.7372    7.1481] for the second cluster (R2-R2_new,G2-G2_new,B2-B2_new)




If the difference is more than the threshold value then assign the new values to the current values and continue from the third step i.e calculating the Euclidean distance else stop the process and display the clustered image.

In our example, after the fifth iteration the difference between the new and the current values of the
  1st cluster [0.1615    0.1734    0.5675] and 2nd cluster [0.4701    0.5028    0.0364]  indicates that the values are less than 1 for each color channel individually.
The final segmented images are shown below.
%K means clustering
%Author: Aaron Angel

clear all
close all
clc
%READ A RGB IMAGE
A = imread('flower8.jpg');
figure,subplot(121),imagesc(A);title('RGB Image');hold on;

A = double(A);

%CARTESIAN GRID FOR THE 2D IMAGE
[Xp,Yp] = meshgrid(1:size(A,2),1:size(A,1));

%NUMBER OF CLUSTERS
num_clusters = 4;

%THRESHOLD VALUE FOR EACH CHANNEL
Tval = 1;

%GLOBAL THRESHOLD VALUE
Global_Tval = 3;

%RANDOM IMAGE POSITIONS
RX = randi(size(A,1),1,num_clusters);
RY = randi(size(A,2),1,num_clusters);



%PREALLOCATE THE VECTORS
RGB_val = zeros([num_clusters,3]);
RGB_new = zeros([num_clusters,3]);

%FETCH THE RGB PIXEL VALUE OF THE RANDOM IMAGE POSITIONS
for j = 1:num_clusters
   RGB_val(j,:) = A(RX(j),RY(j),:); 
end


myvox = zeros([size(A,1) size(A,2) num_clusters]);
flag = 1;

Rcomp = A(:,:,1); %RED CHANNEL
Gcomp = A(:,:,2); %GREEN CHANNEL
Bcomp = A(:,:,3); %BLUE CHANNEL
it=0;


while flag==1
 
 %EUCLIDEAN DISTANCE BETWEEN THE RANDOM RGB PIXEL VALUE
 %AND THE PIXEL VALUES IN THE IMAGE
 for j = 1: num_clusters
     myvox(:,:,j) = sqrt((A(:,:,1)-RGB_val(j,1)).^2+(A(:,:,2)-RGB_val(j,2)).^2+(A(:,:,3)-RGB_val(j,3)).^2);
 end

%FIND THE MINIMUM VALUE(Y) AND THE CORRESPONDING CLUSTER NUMBERS(ClusterMap)
[Y,ClusterMap] = min(myvox,[],3);


%MEAN VALUE PIXEL VALUES IN EACH CHANNEL WITH RESPECT TO THE CLUSTERS
for j = 1:num_clusters
  
   RGB_new(j,1) = mean(Rcomp(ClusterMap(:)==j));
   RGB_new(j,2) = mean(Gcomp(ClusterMap(:)==j));
   RGB_new(j,3) = mean(Bcomp(ClusterMap(:)==j));
end

%DIFFERENCE BETWEEN THE NEW VALUE AND THE OLD VALUE
DiffVal = abs(RGB_val-RGB_new);


%IF THE DIFFERENCE IS LESS,STOP THE ITERATION ELSE
%ASSIGN THE NEW VALUE AND CONTINUE
if(sum(DiffVal(:)) < Global_Tval)
    flag = 0;
else
  if(sum(DiffVal(:,1))>Tval)
    RGB_val(:,1) = RGB_new(:,1);
  end
  if(sum(DiffVal(:,2))>Tval)
    RGB_val(:,2) = RGB_new(:,2);
  end
  if(sum(DiffVal(:,3))>Tval)
     RGB_val(:,3) = RGB_new(:,3);
  end
end




%NUMBER OF ITERATIONS
it=it+1

end;
subplot(122),imagesc(ClusterMap);title('Clusters');colormap(jet);


%DISPLAY THE SEGMENTED IMAGE BASED ON COLORS
m=2;
n = round(num_clusters/m);
for k=1:num_clusters
F = repmat(logical(ClusterMap==k),1,1,3).*double(A);
figure(2),subplot(n,m,k),imagesc(uint8(F));hold on;
end



EXPLANATION:


In this example, the number of clusters are 4, so the yellow, pink and blue flowers are defined as clusters separately while the remaining details are combined as one cluster.



Hope you all enjoyed the post. Feel free to comment and join us in Facebook and twitter for more updates.
like button Like "IMAGE PROCESSING" page

K-means Clustering


       Clustering can be defined as the grouping of data points based on some commonality or similarity between the points.

                                                                              
         One of the simplest methods is K-means clustering. In this method, the number of clusters is initialized and the center of each of the cluster is randomly chosen. The Euclidean distance between each data point and all the center of the clusters is computed and based on the minimum distance each data point is assigned to certain cluster. The new center for the cluster is defined and the Euclidean distance is calculated. This procedure iterates till convergence is reached.
Let’s see how to generate some random data points and some random cluster points


%Generate sample data points and assign random centre for each cluster
%Number of data points
sz=100;
sz1=250;

X = random('unid',sz1,[sz 1]); %Value
Xp = random('unid',sz1,[sz 1]); %Position

%Number of clusters
c=6;
V=random('unid',sz1,[c 1]); %Value
Vp=random('unid',sz1,[c 1]); %Position


figure,plot(Xp,X,'*',Vp,V,'r+');title('Data points and the initial Cluster centers');


Explanation:
100 data points are generated and the number of clusters are assumed to be 6 and 6 random cluster points are generated.

Group the data points:
MATLAB CODE:

%Preallocate the vectors
V1=zeros(size(V));
Vp1=zeros(size(Vp));
flag=1;
while flag==1
%Find the euclidean distance between the data points and all the center of
%the clusters
J=sqrt(abs(repmat(Xp,[1 c])-repmat(Vp,[1 sz])').^2+abs(repmat(X,[1 c])-repmat(V,[1 sz])').^2);

%Find the minimum distance between the data point and the cluster
[mv,Gpos]=min(J,[],2);
CGroup=zeros([sz c]);
colr=colormap(jet(c));
figure(3),
for i = 1:c
    Temp = find(Gpos==i);
    CGroup(1:numel(Temp),i)=Temp;
   
    V1(i,:)=mean(X(Temp));
    Vp1(i,:)=mean(Xp(Temp));
    Pos=ones(numel(Temp)*2,1)*Vp1(i);
    Pos(2:2:end)=Xp(Temp);
    Value=ones(numel(Temp)*2,1)*V1(i);
    Value(2:2:end)=X(Temp);
    %Define the new centre for each cluster
    plot(Pos,Value,'Color',colr(i,:),'LineStyle','-','Marker','o');hold on;
    plot(Vp1(i),V1(i),'k+');
end
hold off;
Diffv=abs(V-V1);
DiffVp=abs(Vp-Vp1);

%Iterate the process till there is no change in the cluster position
if(Diffv < 1)
    flag=0;
else
    V=V1;
    Vp=Vp1;
end
end

figure,plot(Xp,X,'*',Vp,V,'g+');title('Data points and the Final Cluster centers');


Fig. The new position(red circle) of the clusters after the final iteration.

Explanation:


In the above figure, the data points are represented in blue color stars and the cluster centers are represented in red color cross shape.

Let’s consider one particular data point and all the cluster centers.



Data point position X = 13, Y = 20
Cluster 1 position  X = 8,  Y = 19
Cluster 2 position  X = 13, Y = 15
Step 1: Find the Euclidean Distance:
Find the Euclidean distance(D1) between data point and the cluster 1 similarly, find the Euclidean distance(D2) between data point and the cluster 2

Distance D1 = sqrt((13-8).^2+(20-19).^2)) = 5.0990
Distance D2 = sqrt((13-13).^2+(20-15).^2))= 5.0000

Step 2: Find the minimum and assign the data point to a cluster

Now the minimum distance among the two results is for the cluster 2.
So the data point with (X,Y)=(13,20) is assigned to the cluster/group 2.

Step 3: Perform the step 1 and step 2 for all the data points and assign group accordingly.

Step 4: Assign a new position for the clusters based on the clustering done.
       
        Find the average position of the newly assigned data points to a particular cluster and use that average as the new position for the cluster.       

Step 5: Iterate this procedure till the position of the clusters are unchanged.

Number of clusters = 3

 Number of clusters = 6

Number of clusters = 15


like button Like "IMAGE PROCESSING" page
Next Post Home
Google ping Hypersmash.com