The ID4 algorithm

The ID4 algorithm was developed by Schlimmer and Fisher (1986). This algorithm builds decision trees incrementally. Many learning tasks are incremental as new instances or details become available over time. The ID4 algorithm works by building a tree and updating it as new instances become available. Note that the ID3 algorithm can be used to learn incrementally by adding each new instance to the training set as it becomes available and re-running ID3 against the enlarged training set. This is, however, computationally inefficient.

The basic ID4 algorithm tree-update procedure is given below.

Inputs: One instance, a decision tree which is kept as a global data structure.
Outputs: An updated decision tree.
  • If the instance is positive, increment the total number of positive instances. Otherwise increment the number of negative instances.
  • If all the instances are positive or negative then terminate the process and return the decision tree.
  • Else
  • a) For each attribute, for each value present in the instance, increment either the number of positive or negative instances.

    b) Compute entropy values for all attributes and select an attribute with the lowest entropy score. If the attribute is not at the root then make it so and discard any sub trees.

    c) Make a test link from the root for every value of the root attribute.

    d) Recursively update the sub tree found by following the link for the root attribute's value in this instance.

    This tree produced by ID4 on the data from figure 3.06 is the same as that produced by ID3, however, not every concept that ID3 can learn can be learned by ID4. ID4 can also produce inconsistent trees. For example the trees after the 5th and 6th training instances are presented contain nodes with positive and negative classifications and so cannot be effectively used for classification.

    Table of Contents

    References



    Authored by Serengul Smith

    E-mail to: serengul1@mdx.ac.uk
    School of Computing Science Middlesex University
    Revised: September 1998