• Contact
    • Imprint
    • Sitemap
      • Deutsch

UNIWUE UBWUE Universitätsbibliothek

  • Home
  • Search
  • Browse
  • Publish
  • Help
Schließen
  • Dewey Decimal Classification
  • 0 Informatik, Informationswissenschaft, allgemeine...
  • 00 Informatik, Wissen, Systeme

004 Datenverarbeitung; Informatik

Refine

Has Fulltext

  • yes (2)

Is part of the Bibliography

  • yes (2)

Year of publication

  • 2011 (2)

Document Type

  • Preprint (2)

Language

  • English (2) (remove)

Keywords

  • Textual document collation (2) (remove)

Author

  • Nassourou, Mohamadou (2)

Institute

  • Institut für deutsche Philologie (2)

2 search hits

  • 1 to 2
  • BibTeX
  • CSV
  • RIS
  • XML
  • 10
  • 20
  • 50
  • 100

Sort by

  • Year
  • Year
  • Title
  • Title
  • Author
  • Author
Computer-based Textual Documents Collation System for Reconstructing the Original Text from Automatically Identified Base Text and Ranked Witnesses (2011)
Nassourou, Mohamadou
Given a collection of diverging documents about some lost original text, any person interested in the text would try reconstructing it from the diverging documents. Whether it is eclecticism, stemmatics, or copy-text, one is expected to explicitly or indirectly select one of the documents as a starting point or as a base text, which could be emended through comparison with remaining documents, so that a text that could be designated as the original document is generated. Unfortunately the process of giving priority to one of the documents also known as witnesses is a subjective approach. In fact even Cladistics, which could be considered as a computer-based approach of implementing stemmatics, does not present or recommend users to select a certain witness as a starting point for the process of reconstructing the original document. In this study, a computational method using a rule-based Bayesian classifier is used, to assist text scholars in their attempts of reconstructing a non-existing document from some available witnesses. The method developed in this study consists of selecting a base text successively and collating it with remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, a witness selected more often by majority of base texts is considered as the probable base text of the collection. Witnesses’ scores are weighted using a weighting system, based on effects of types of textual modifications on the process of reconstructing original documents. Users have the possibility to select between baseless and base text collation. If a base text is selected, the task is reduced to ranking the witnesses with respect to the base text, otherwise a base text as well as ranking of the witnesses with respect to the base text are computed and displayed on a bar diagram. Additionally this study includes a recursive algorithm for automatically reconstructing the original text from the identified base text and ranked witnesses.
A Rule-based Statistical Classifier for Determining a Base Text and Ranking Witnesses In Textual Documents Collation Process (2011)
Nassourou, Mohamadou
Given a collection of diverging documents about some lost original text, any person interested in the text would try reconstructing it from the diverging documents. Whether it is eclecticism, stemmatics, or copy-text, one is expected to explicitly or indirectly select one of the documents as a starting point or as a base text, which could be emended through comparison with remaining documents, so that a text that could be designated as the original document is generated. Unfortunately the process of giving priority to one of the documents also known as witnesses is a subjective approach. In fact even Cladistics, which could be considered as a computer-based approach of implementing stemmatics, does not present or recommend users to select a certain witness as a starting point for the process of reconstructing the original document. In this study, a computational method using a rule-based Bayesian classifier is used, to assist text scholars in their attempts of reconstructing a non-existing document from some available witnesses. The method developed in this study consists of selecting a base text successively and collating it with remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, a witness selected more often by majority of base texts is considered as the probable base text of the collection. Witnesses’ scores are weighted using a weighting system, based on effects of types of textual modifications on the process of reconstructing original documents. Users have the possibility to select between baseless and base text collation. If a base text is selected, the task is reduced to ranking the witnesses with respect to the base text, otherwise a base text as well as ranking of the witnesses with respect to the base text are computed and displayed on a histogram.
  • 1 to 2

DINI-Zertifikat     OPUS4 Logo