004 Data processing; Computer science
Given a collection of diverging documents derived from a lost original text, anyone interested in that text will try to reconstruct it from the surviving documents. Whether the approach is eclecticism, stemmatics, or copy-text editing, the editor is expected to select one of the documents, explicitly or implicitly, as a starting point or base text. This base text is then emended through comparison with the remaining documents until a text that can be designated as the original is produced. Unfortunately, giving priority to one of the documents (also known as witnesses) is a subjective decision. Even cladistics, which can be regarded as a computer-based implementation of stemmatics, neither identifies nor recommends a particular witness as the starting point for reconstructing the original document. In this study, a computational method based on a rule-based Bayesian classifier is used to assist textual scholars in reconstructing a lost document from its available witnesses. The method selects each witness in turn as the base text and collates it with the remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, the witness chosen as closest by the majority of base texts is considered the probable base text of the collection. Witness scores are weighted using a weighting system based on the effect that each type of textual modification has on the process of reconstructing original documents. Users can choose between collation with and without a preselected base text. If a base text is selected, the task is reduced to ranking the witnesses with respect to that base text; otherwise, both the base text and the ranking of the witnesses with respect to it are computed and displayed in a bar chart.
Additionally, this study includes a recursive algorithm that automatically reconstructs the original text from the identified base text and the ranked witnesses.
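The base-text selection procedure summarized in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the rule-based Bayesian classifier is replaced here by a plain token-level diff from Python's standard `difflib`, and the `WEIGHTS` table, the function names, and the sample witnesses are all hypothetical stand-ins for the study's weighting system and data. The sketch only shows the voting idea: each witness serves once as base text, its closest witness receives a vote, and the witness with the most votes is taken as the probable base text.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical weights per alteration type (assumption for illustration only;
# the study's actual weighting system is not given in the abstract).
WEIGHTS = {"replace": 1.0, "delete": 0.75, "insert": 0.5}


def weighted_distance(base, witness):
    """Sum the weighted token-level alterations that turn `base` into `witness`."""
    a, b = base.split(), witness.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    total = 0.0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count the longer side of the altered span, scaled by its weight.
            total += WEIGHTS[tag] * max(i2 - i1, j2 - j1)
    return total


def rank_witnesses(base_name, witnesses):
    """Rank all other witnesses by weighted distance to the given base text."""
    base_text = witnesses[base_name]
    scores = [(name, weighted_distance(base_text, text))
              for name, text in witnesses.items() if name != base_name]
    return sorted(scores, key=lambda pair: pair[1])


def probable_base_text(witnesses):
    """Use each witness once as base text and vote for its closest witness."""
    votes = Counter()
    cycles = []  # one (base, closest witness, score) record per collation cycle
    for base_name in witnesses:
        closest, score = rank_witnesses(base_name, witnesses)[0]
        cycles.append((base_name, closest, score))
        votes[closest] += 1
    winner, _ = votes.most_common(1)[0]
    return winner, votes, cycles


# Invented sample witnesses, used only to exercise the functions.
witnesses = {
    "A": "in the beginning was the word",
    "B": "in the beginning was a word",
    "C": "at the beginning was the word",
    "D": "in the beginning was the words",
}
winner, votes, cycles = probable_base_text(witnesses)
print(winner, dict(votes))
print(rank_witnesses(winner, witnesses))
```

Ties are resolved here by input order, which is an arbitrary choice of the sketch. When a base text is preselected, only `rank_witnesses` is needed; its scores are the kind of values that could be shown in a bar chart.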