• search hit 6 of 18
Back to Result List

Computer-based Textual Documents Collation System for Reconstructing the Original Text from Automatically Identified Base Text and Ranked Witnesses

Please always quote using this URN: urn:nbn:de:bvb:20-opus-65749
  • Given a collection of diverging documents about some lost original text, any person interested in the text would try reconstructing it from the diverging documents. Whether it is eclecticism, stemmatics, or copy-text, one is expected to explicitly or indirectly select one of the documents as a starting point or as a base text, which could be emended through comparison with remaining documents, so that a text that could be designated as the original document is generated. Unfortunately the process of giving priority to one of the documents alsoGiven a collection of diverging documents about some lost original text, any person interested in the text would try reconstructing it from the diverging documents. Whether it is eclecticism, stemmatics, or copy-text, one is expected to explicitly or indirectly select one of the documents as a starting point or as a base text, which could be emended through comparison with remaining documents, so that a text that could be designated as the original document is generated. Unfortunately the process of giving priority to one of the documents also known as witnesses is a subjective approach. In fact even Cladistics, which could be considered as a computer-based approach of implementing stemmatics, does not present or recommend users to select a certain witness as a starting point for the process of reconstructing the original document. In this study, a computational method using a rule-based Bayesian classifier is used, to assist text scholars in their attempts of reconstructing a non-existing document from some available witnesses. The method developed in this study consists of selecting a base text successively and collating it with remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, a witness selected more often by majority of base texts is considered as the probable base text of the collection. Witnesses’ scores are weighted using a weighting system, based on effects of types of textual modifications on the process of reconstructing original documents. Users have the possibility to select between baseless and base text collation. If a base text is selected, the task is reduced to ranking the witnesses with respect to the base text, otherwise a base text as well as ranking of the witnesses with respect to the base text are computed and displayed on a bar diagram. Additionally this study includes a recursive algorithm for automatically reconstructing the original text from the identified base text and ranked witnesses.show moreshow less

Download full text files

Export metadata

Additional Services

Share in Twitter Search Google Scholar Statistics
Metadaten
Author: Mohamadou Nassourou
URN:urn:nbn:de:bvb:20-opus-65749
Document Type:Preprint
Faculties:Philosophische Fakultät (Histor., philolog., Kultur- und geograph. Wissensch.) / Institut für deutsche Philologie
Language:English
Year of Completion:2011
Dewey Decimal Classification:0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
GND Keyword:Textvergleich; Text Mining
Tag:Base text; Bayesian classifier; Gothenburg model; Reconstruction of original text; Textual alterations weighting system; Textual document collation
Release Date:2011/11/11
Licence (German):License LogoDeutsches Urheberrecht