The Classification algorithm

The basic classification algorithm is a non incremental concept learning method that produces a hypothesis in the form of a decision tree. The algorithm accepts a training set of positive and negative examples of a concept. Figure 3.06 is an example of a training set given by Quinlan (1983). In this training set people are described by three attributes which are height, hair colour and eye colour. Case 1 describes a short person with blond hair colour and brown eyes which is a negative example of the concept being learned. Case 3 describes a tall, blond haired person with blue eyes which is a positive example of the concept being learned.

Case

Height Hair Eye

Classification

1

short blond brown

-

2

tall dark brown

-

3

tall blond blue

+

4

tall dark blue

-

5

short dark blue

-

6

tall red blue

+

7

tall blond brown

-

8

short blond blue

+

Figure 3.06 : An example training set given in Quinlan (1983)

 

The following example applies the classification algorithm to the training set in figure 3.06. The decision tree and rules produced from applying the algorithm attempt to give a generalisation that can be used to classify objects not in the original training set.

One of the attributes is chosen, for example the 'height' attribute, which is made the root node of the tree. There are two possible values for height so two child nodes are created. These are 'short' and 'tall' and a branch is created for each of these as can be see on the tree shown in figure 3.07.

The input cases are assigned to the tree and if any of the nodes has all positive examples of the concept, or all negative examples of the concept on a branch, the branch is labelled '+' or '-' as appropriate. The numbers in brackets refer to the input cases in the training set.

 

Figure 3.07 : The decision tree after the first cycle of the classification algorithm

 

The process repeats for each branch that is not yet labelled '+' or '-'. The 'short' branch is next expanded and another attribute, not yet appearing on the path from the root, is picked randomly, for example 'hair'. The 'short' branch is labelled with the attribute name and legs grown for all the different values possible in the cases currently at the node.

In this case two branches are created for 'dark' and 'blond'. The input cases available on the 'short' branch in figure 3.07 are distributed over the newly added branches. There is one 'dark' case and the branch can be labelled '-'. The 'blond' branch, which now has cases 1 and 8 assigned to it - see figure 3.08, is next expanded and the 'eye' attribute is selected. Branches for 'blue' and 'brown' are grown which can be labelled '-' and '+'.

In a similar fashion the 'tall' branch in figure 3.07 is expanded to give the intermediate tree in figure 3.08 and the final tree in figure 3.09.

 

Figure 3.08 : The decision tree after the second cycle of the classification algorithm

 

Figure 3.09 : The completed classification algorithm decision tree

 

The classification algorithm can be summarised as follows. Firstly set a variable T to be the complete training set then apply the following 4 steps to T.

 

  • The inputs: A training set
  • The outputs: A decision tree
  • 1. If all elements in T are positive then create a 'yes' node and halt.
  • 2. If all elements in T are negative then create a 'no' node and halt.
  • 3. Otherwise select an attribute F with values v1, v2, v3, ... , vn. Partition T into subsets T1, T2, T3, ... Tn according to their values on F. Create a branch with F as parent and T1 etc. as child nodes.
  • 4. Apply the procedure recursively to each child node.
  •  

    The final tree in figure 3.09 is saying that anyone who is tall and has red hair is a positive example irrespective of the eye attribute values. Anybody who is tall and has blond hair and blue eyes is also a positive example. A short person with dark hair is a negative example. The rules for positive and negative classifications can be summarised as two rules, one for positive and one for negative. When forming these rules the levels of the tree are linked with 'and' and the breadth with 'or'.

    The negative rule extracted from this tree is :

    IF (height is short AND hair is dark)
    OR (height is tall AND hair is blond AND eyes are brown)
    OR (height is tall AND hair is dark)
    THEN - NO

    The positive rule extracted from the tree is :

    IF (height is short AND hair is blond AND eyes are blue)
    OR (height is tall AND hair is blond AND eyes are blue)
    OR (height is tall AND hair is red)
    THEN - YES

    In the example above there were eight training cases and the decision tree produced has seven leaf nodes. The decision tree produced is little more than a look up table for the data and the generalisation is quite poor.

    Table of Contents

    References



    Authored by Serengul Smith

    E-mail to: serengul1@mdx.ac.uk
    School of Computing Science Middlesex University
    Revised: September 1998