
The focusing algorithm (Young, Plotkin and Linz, 1977) is considered by researchers to be one of the most powerful techniques to learn concepts. The algorithm aims to produce a definition that is consistent with all given positive data, but none of the negative. The focusing algorithm uses a version space search though the concept space. The concept space covers all the possible concept descriptions. The version space only covers concept descriptions which are consistent with the given training instances. The focusing algorithm is similar to Mitchells candidate elimination algorithm (Mitchell, Keller & Kedar-Cabelli, 1986)
In the focusing algorithm the concept being learned is represented by a set of trees. There is one tree for each attribute used to describe the concept.
Suppose we are concerned with the sizes of vehicles used to reach a particular destination. Attributes in this example are vehicle type and size. For each of these attributes a tree is required covering all possible instances that can occur. These may be as shown in figure 3.01. These two trees represent the concept space for the example.

Figure 3.01 : The concept space for the vehicle size example
An upper boundary (U) is initially placed above each tree and a lower boundary (L) is placed below. The version space lies between the U and L boundaries and constitutes the search space for that concept. Concepts are learned my moving the L and U boundaries upward and downwards respectively. Initially the version space covers the entire concept space.
The steps of the focussing algorithm are as follows :
|
|
|
|
|
|
|
Example:
Let a concept space be as described above and let the training instances be as follows :
| [med moped] | - |
positive example of the concept |
| [tiny car] | - |
positive example of the concept |
| [big jet] | - |
negative example of the concept |
| [med car] | - |
positive example of the concept |
| [med jet] | - |
negative example of the concept |
The first training example [med moped] is a positive example and so the lower bound moved to coincident with med and moped. The have been no previous training examples at this stage. The version space after the first training example is shown in figure 3.02.

Figure 3.02 : The version space after the first training example
The second training example [tiny car] is again a positive example of the concept. As the modes of transport mentioned in training examples so far are car and moped the lower bound moves to vehicle so as to cover both of these. Similarly the lower bound in the size tree moves to small. The resulting version space is shown in figure 3.03.

Figure 3.03 : The version space after the second training example
The third training example [big jet] is a negative example. A tree is selected in this case the size tree and U moved to small node. It cannot move to the large node as this is above the training instance. Figure 3.04 shows the version space after the third training example.

Figure 3.04 : The version space after the third training example
The fourth training example [med car] is again positive but does not result in any changes to the bounds as the lower bound already covers this instance.
The final training example [med jet] is a negative example. A tree is selected in this case the transport tree and the upper bound moved to vehicle. It cannot move to plane as this is directly above the instance. The version space after all the training examples have been processed is shown in figure 3.05.

Figure 3.05 : The version space after the final training example
There are no more training examples to be processed. As can seen in both trees the upper boundary U and the lower boundary L have converged at a node. Finally we can describe the concept being learned here as [small vehicle]. This description covers all the positive and none of the negative training data.
Authored by Serengul Smith
E-mail to:
serengul1@mdx.ac.uk
School of Computing Science Middlesex University
Revised: September 1998