
Consider the table below, where the $(i, j)^{\text{th}}$ element of the table is the distance between points $x_{i}$ and $x_{j}$. Single linkage clustering is performed on the data points $x_{1}, x_{2}, x_{3}, x_{4}, x_{5}$.
\begin{array}{|c|c|c|c|c|c|}
\hline & x_{1} & x_{2} & x_{3} & x_{4} & x_{5} \\
\hline x_{1} & 0 & 1 & 4 & 3 & 6 \\
\hline x_{2} & 1 & 0 & 3 & 5 & 3 \\
\hline x_{3} & 4 & 3 & 0 & 2 & 5 \\
\hline x_{4} & 3 & 5 & 2 & 0 & 1 \\
\hline x_{5} & 6 & 3 & 5 & 1 & 0 \\
\hline
\end{array}

Which ONE of the following is the correct representation of the clusters produced?

[Options A to D: four dendrogram images, not reproduced here. Source: GATE DS&AI 2024 | Question-136]

2 Answers

Option A is correct.

These tree-like structures are called dendrograms. A dendrogram is useful for measuring similarity in cluster analysis as well as for detecting outliers.

The dissimilarity between two objects in a dendrogram is represented by the height of the lowest internal node they share.

In the single linkage (nearest neighbor) method, the distance between two clusters is the distance between their two closest objects (nearest neighbors), one taken from each cluster.
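Written as a formula, this definition reads as follows. The cluster names $A$ and $B$ are notation introduced here for illustration only; $d(a,b)$ is the pointwise distance given in the table.

$$d(A,B)=\min_{a\in A,\ b\in B} d(a,b)$$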

Now, coming to the given problem.

Here, we are given $5$ data points $x_1,x_2,x_3,x_4,x_5$.

Now, the least distance between these data points is $1$, and it is attained by two pairs: $x_1$ and $x_2$, and $x_4$ and $x_5$.

It means these pairs form $2$ clusters at height $1$. The first cluster contains data points $x_1$ and $x_2$, and the second cluster contains data points $x_4$ and $x_5.$
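The first step can be checked with a few lines of Python. This is only a minimal sketch: the list `D` is the table from the question typed in by hand, and the helper name `closest_pairs` is made up for this illustration.

```python
from itertools import combinations

# Distance matrix from the question; row/column i corresponds to x_{i+1}.
D = [
    [0, 1, 4, 3, 6],
    [1, 0, 3, 5, 3],
    [4, 3, 0, 2, 5],
    [3, 5, 2, 0, 1],
    [6, 3, 5, 1, 0],
]

def closest_pairs(dist):
    """Return the smallest off-diagonal distance and every pair attaining it (1-based)."""
    pairs = list(combinations(range(len(dist)), 2))
    best = min(dist[i][j] for i, j in pairs)
    return best, [(i + 1, j + 1) for i, j in pairs if dist[i][j] == best]

best, pairs = closest_pairs(D)
print(best, pairs)  # 1 [(1, 2), (4, 5)]
```

Both pairs tie at distance $1$, so the order in which they are merged does not change the resulting tree.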

So, either $(A)$ is correct or $(B)$ is correct.

Now, due to single linkage, we have to find the distance from $x_3$ to each of the above two clusters.

(1) Distance between $x_3$ and $[x_1,x_2]:$

$d(x_3,[x_1,x_2])=\min(d(x_1,x_3),d(x_2,x_3))=\min(4,3)=3$    

(2) Distance between $x_3$ and $[x_4,x_5]:$

$d(x_3,[x_4,x_5])=\min(d(x_4,x_3),d(x_5,x_3))=\min(2,5)=2$

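Since $2 < 3$, $x_3$ will be merged with cluster $[x_4,x_5]$ at height $2$. The resulting cluster $[x_3,x_4,x_5]$ then joins $[x_1,x_2]$ at distance $\min(4,3,6,3,5,3)=3$.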

Therefore, the answer is A.
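The whole merge sequence can also be reproduced mechanically. The code below is a naive sketch of agglomerative clustering with single linkage, not an optimized or library implementation; the names `link` and `single_linkage` are chosen here for illustration, and ties are broken by whichever pair is found first.

```python
from itertools import combinations

# Distance matrix from the question; row/column i corresponds to x_{i+1}.
D = [
    [0, 1, 4, 3, 6],
    [1, 0, 3, 5, 3],
    [4, 3, 0, 2, 5],
    [3, 5, 2, 0, 1],
    [6, 3, 5, 1, 0],
]

def link(dist, a, b):
    """Single linkage: minimum pointwise distance between clusters a and b."""
    return min(dist[p - 1][q - 1] for p in a for q in b)

def single_linkage(dist):
    """Return the merges in order as (cluster_a, cluster_b, height).

    Clusters are sorted tuples of 1-based point indices.
    """
    clusters = [(i + 1,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters (first one found on ties).
        a, b = min(combinations(clusters, 2), key=lambda ab: link(dist, *ab))
        merges.append((a, b, link(dist, a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges

for a, b, height in single_linkage(D):
    print(a, b, height)
# (1,) (2,) 1
# (4,) (5,) 1
# (3,) (4, 5) 2
# (1, 2) (3, 4, 5) 3
```

The printed heights $1, 1, 2, 3$ are the levels at which the internal nodes of the dendrogram sit.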
