Symmetric parallel class expression learning : School of Engineering and Advanced Technology, Massey University, New Zealand : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science
The growth of both size and complexity of learning problems in description logic
applications, such as the Semantic Web, requires fast and scalable description logic
learning algorithms. This thesis proposes such algorithms using several related ap-
proaches and compares them with existing algorithms. Particular measures used for
comparison include computation time and accuracy on a range of learning problems of
di erent sizes and complexities.
The rst step is to use parallelisation inspired by the map-reduce framework. The
top-down learning approach, coupled with an implicit divide-and-conquer strategy, also
facilitates the discovery of solutions for a certain class of complex learning problems. A
reduction step aggregates the partial solutions and also provides additional
to customise learnt results.
A symmetric class expression learning algorithm produces separate de nitions of
positive (true) examples and negative (false) examples (which can be computed in
parallel). By treating these two sets of de nitions `symmetrically', it is sometimes
possible to reduce the size of the search space signi cantly. The use of negative example
denotions enhances learning problems with exceptions, where the negative examples
(`exceptions') follow a few relatively simple patterns.
In general, correctness (true positives) and completeness (true negatives) of a learn-
ing algorithm are traded o against each other because these two criteria are normally
icting. Particular learning algorithms have an inherent bias towards either cor-
rectness or completeness. The use of negative de nitions enables an approach (called
forti cation in this thesis) to improve predictive correctness by applying an appropriate
level of over-specialisation to the prediction model, while avoiding over- tting.
The experiments presented in the thesis show that these algorithms have the po-
tential to improve both the computation time and predictive accuracy of description
logic learning when compared to existing algorithms.