Chapter 4 Cluster Analysis
Section 3 Agglomerative Hierarchical Clustering
Page 4 Key Issues

Objectives

The objectives of this section are:
to define agglomerative hierarchical clustering
to explain its basic algorithm
to briefly mention key issues it presents

Outcomes

By the time you have completed this section you will be able to:
define agglomerative hierarchical clustering
describe the algorithm
list key issues that this method creates/resolves

Key Issues in Hierarchical Clustering

Lack of a Global Objective Function: agglomerative hierarchical clustering techniques decide locally, at each step, which clusters to merge, so there is no global objective function of the kind that the K-Means algorithm optimizes. This is actually an advantage of the technique, because optimizing a global objective function tends to be computationally very expensive. It also avoids two other difficulties that arise with a global objective function: choosing initial points and dealing with local minima.
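To make the idea of a local decision concrete, here is a minimal sketch in Python. It assumes one-dimensional points and single link (minimum pairwise distance) as the cluster proximity; both are choices made for this illustration, and the function names are hypothetical. At each step the code looks only at the current pairwise proximities and merges the closest pair. No overall score for the clustering is ever computed.

```python
from itertools import combinations


def single_link(points, a, b):
    """Proximity of clusters a and b: the smallest pairwise distance (assumed scheme)."""
    return min(abs(points[i] - points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge the closest pair of clusters until one remains; return the merge history."""
    clusters = [frozenset([i]) for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        # Local decision: only the current pairwise proximities are consulted.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: single_link(points, pair[0], pair[1]))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append((sorted(a), sorted(b)))
    return history


points = [0.0, 1.0, 5.0, 6.5]  # made-up sample data
history = agglomerate(points)
print(history)  # [([0], [1]), ([2], [3]), ([0, 1], [2, 3])]
```

Each entry in the history records the indices of the two clusters that were merged at that step.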

Ability to Handle Different Cluster Sizes: we have to decide how to treat the relative sizes of the clusters that are merged. This issue arises only with cluster proximity schemes that involve sums, such as centroid and group average. There are two main approaches. The first, called weighted, treats all clusters equally. The second, called unweighted, takes the number of points in each cluster into account. The terminology refers to the data points rather than the clusters: treating clusters of unequal size equally gives different weights to the points in different clusters, whereas taking cluster size into account gives every point the same weight.
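The difference between the two approaches can be sketched for the group average scheme. The function below is an illustration rather than a definitive implementation, and its name and parameters are hypothetical. It assumes clusters A and B have just been merged and that their group-average proximities to a third cluster Q are already known.

```python
def merged_proximity(d_aq, d_bq, size_a, size_b, scheme="unweighted"):
    """Group-average proximity between the merged cluster (A and B) and a cluster Q.

    d_aq, d_bq: proximities of A and B to Q before the merge.
    size_a, size_b: number of points in A and B.
    """
    if scheme == "unweighted":
        # Every point counts equally, so the larger cluster dominates.
        return (size_a * d_aq + size_b * d_bq) / (size_a + size_b)
    if scheme == "weighted":
        # Both clusters count equally, whatever their sizes.
        return (d_aq + d_bq) / 2
    raise ValueError(f"unknown scheme: {scheme}")


# Made-up example: A has 9 points and is close to Q; B has 1 point and is far away.
unweighted = merged_proximity(1.0, 5.0, size_a=9, size_b=1, scheme="unweighted")
weighted = merged_proximity(1.0, 5.0, size_a=9, size_b=1, scheme="weighted")
print(unweighted)  # 1.4
print(weighted)    # 3.0
```

In the weighted result the single point in B has as much influence as the nine points in A together, which is why the terminology is best read as describing the points.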

Merging Decisions Are Final: agglomerative hierarchical clustering algorithms usually make good local decisions about merging two clusters because they can use information about the pairwise proximities of all points. One downside of the technique, however, is that once two clusters have been merged the decision cannot be undone, so they cannot be split up at a later time for a more favorable union. This is an important characteristic because it prevents a local optimization criterion from becoming a global optimization criterion.
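One consequence of final merges is that the clusterings produced at successive steps are nested: every cluster at one level is contained in some cluster at the next level. The following sketch replays a merge history and checks this property. The merge history is hard-coded, made-up data for four points, and the function names are hypothetical.

```python
def replay(n_points, merges):
    """Return the partition after each merge, starting from singleton clusters."""
    partition = [frozenset([i]) for i in range(n_points)]
    levels = [set(partition)]
    for a, b in merges:
        a, b = frozenset(a), frozenset(b)
        # The two merged clusters are replaced by their union and never reappear.
        partition = [c for c in partition if c not in (a, b)] + [a | b]
        levels.append(set(partition))
    return levels


def is_nested(finer, coarser):
    """True if every cluster in finer lies inside some cluster in coarser."""
    return all(any(c <= d for d in coarser) for c in finer)


# Hypothetical merge history for four points, indexed 0 to 3.
merges = [([0], [1]), ([2], [3]), ([0, 1], [2, 3])]
levels = replay(4, merges)
nested = all(is_nested(levels[k], levels[k + 1]) for k in range(len(levels) - 1))
print(nested)  # True
```

Because a merged cluster is never split, points 0 and 1 stay together at every later level, even if a different grouping would have looked better once more merges had been made.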