
Clustering algorithms are unsupervised learning algorithms. The learning process does not involve an external teacher providing pre-classified data for training. This type of learning system organises unclassified objects into a hierarchy of classes by measuring the similarities between objects and gathering maximally similar objects into the same group or cluster. The process of determining clusters can be done in either a bottom-up or top-down fashion.
The bottom-up (hierarchical) method of clustering recursively combine single objects or groups of objects into larger groups until the original set of objects merge into one single category which located at the top of the hierarchy.
The top-down (non-hierarchical) method recursively splits the original set of objects into subcategories until each object is assigned to a subcategory.
There are a number of techniques for calculating similarities between objects. The use of any of these methods depends on the relation between objects and groups of objects. This relationship might be object-to-group or group-to-group. If the clustering system is built on numerical similarity then the system cannot give any conceptual interpretation of the obtained categories; that is, they cannot give any explanation about clusters in human terms. Conceptual clustering systems attempt to overcome this.
As with other learning algorithms, clustering algorithms can be subdivided into non-incremental algorithms that rely on all the object descriptions being available prior to the clustering process and incremental algorithms that process objects as they become available.
The following clustering algorithms will be described in the proceeding sections :
Authored by Serengul Smith
E-mail to:
serengul1@mdx.ac.uk
School of Computing Science Middlesex University
Revised: September 1998