OPUS Würzburg | 004 Datenverarbeitung; Informatik

004 Datenverarbeitung; Informatik

Refine

Has Fulltext

yes (19)

19 search hits

10 to 10

Design and Implementation of Architectures for Interactive Textual Documents Collation Systems (2011)

One of the main purposes of textual documents collation is to identify a base text or closest witness to the base text, by analyzing and interpreting differences also known as types of changes that might exist between those documents. Based on this fact, it is reasonable to argue that, explicit identification of types of changes such as deletions, additions, transpositions, and mutations should be part of the collation process. The identification could be carried out by an interpretation module after alignment has taken place. Unfortunately existing collation software such as CollateX1 and Juxta2’s collation engine do not have interpretation modules. In fact they implement the Gothenburg model [1] for collation process which does not include an interpretation unit. Currently both CollateX and Juxta’s collation engine do not distinguish in their critical apparatus between the types of changes, and do not offer statistics about those changes. This paper presents a model for both integrated and distributed collation processes that improves the Gothenburg model. The model introduces an interpretation component for computing and distinguishing between the types of changes that documents could have undergone. Moreover two architectures implementing the model in order to solve the problem of interactive collation are discussed as well. Each architecture uses CollateX library, and provides on the one hand preprocessing functions for transforming input documents into CollateX input format, and on the other hand a post-processing module for enabling interactive collation. Finally simple algorithms for distinguishing between types of changes, and linking collated source documents with the collation results are also introduced.

10 to 10

004 Datenverarbeitung; Informatik

Refine

Has Fulltext

Is part of the Bibliography

Year of publication

Document Type

Language

Keywords

Author

Institute

19 search hits