Supporting format migration with ontology model comparison
Being able to successfully read the bits and bytes stored in a digital archive does not necessarily mean that we can extract meaningful information from an archived digital document. If information about the format of a stored document is not available, the contents of the document are essentially lost. One solution to this problem is format conversion, but because of the number of documents and formats involved, manual conversion of archived documents is usually impractical. It is thus an open research question which technologies are suitable for transforming existing documents into new document formats, and under which constraints these technologies can be applied successfully. In the present work, it is assumed that stored documents are represented as formal description logic ontologies. This makes it possible to view the translation of document formats as an application of ontology matching, an area for which many methods and algorithms have been developed in recent years. With very few exceptions, however, current ontology matchers are limited to element-level correspondences, which match concepts against concepts, roles against roles, and individuals against individuals. Such simple correspondences are insufficient to describe mappings between complex digital documents. This thesis presents a method that heuristically refines simple correspondences into more complex ones, using a modified form of description logic tableau reasoning. The refinement process uses a model-based representation of correspondences. Building on the formal semantics, the process also includes methods to avoid generating inconsistent or incoherent correspondences.
In its second part, the thesis uses the model-based representation to determine the best set of correspondences between two ontologies. The similarity measures developed for this purpose draw on semantic information from both description logic tableau reasoning and the refinement process. The result is a new method for semi-automatically deriving complex correspondences between description logic ontologies, tailored to, but not limited to, the context of format migration.