004 Data processing; Computer science
Design and Implementation of Architectures for Interactive Textual Documents Collation Systems
(2011)
One of the main purposes of collating textual documents is to identify a base text, or the witness closest to the base text, by analyzing and interpreting the differences (also known as types of changes) that exist between those documents. It is therefore reasonable to argue that explicit identification of types of changes, such as deletions, additions, transpositions, and mutations, should be part of the collation process. This identification could be carried out by an interpretation module after alignment has taken place. Unfortunately, existing collation software such as CollateX and Juxta's collation engine has no interpretation module: both implement the Gothenburg model [1] of the collation process, which does not include an interpretation unit. As a result, neither CollateX nor Juxta's collation engine currently distinguishes between the types of changes in its critical apparatus or offers statistics about those changes. This paper presents a model for both integrated and distributed collation processes that improves on the Gothenburg model. The model introduces an interpretation component for computing and distinguishing between the types of changes that documents could have undergone. Two architectures that implement the model in order to solve the problem of interactive collation are also discussed. Each architecture uses the CollateX library and provides, on the one hand, preprocessing functions for transforming input documents into the CollateX input format and, on the other hand, a post-processing module for enabling interactive collation. Finally, simple algorithms for distinguishing between types of changes and for linking the collated source documents with the collation results are introduced.
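The idea of an interpretation step that runs after alignment can be illustrated with a small sketch. It is not the algorithm from the paper and it does not use CollateX: Python's standard `difflib.SequenceMatcher` stands in for the alignment stage, and the function names `interpret_changes` and `change_statistics` are hypothetical. It also assumes, for simplicity, that a token deleted in one place and added in another counts as a transposition, and that any substituted span counts as a mutation.

```python
from collections import Counter
from difflib import SequenceMatcher


def interpret_changes(base, witness):
    """Align two token lists, then label every difference with a change type."""
    # Alignment stage (stand-in for a real collation engine).
    matcher = SequenceMatcher(a=base, b=witness, autojunk=False)
    deleted, added, changes = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            deleted.extend(base[i1:i2])
        elif tag == "insert":
            added.extend(witness[j1:j2])
        elif tag == "replace":
            # Assumption: every substituted span is reported as one mutation.
            changes.append(("mutation", tuple(base[i1:i2]), tuple(witness[j1:j2])))

    # Assumption: a token deleted in one place and added in another was moved.
    moved = Counter(deleted) & Counter(added)
    for token, count in moved.items():
        changes.extend([("transposition", (token,), (token,))] * count)
    for token, count in (Counter(deleted) - moved).items():
        changes.extend([("deletion", (token,), ())] * count)
    for token, count in (Counter(added) - moved).items():
        changes.extend([("addition", (), (token,))] * count)
    return changes


def change_statistics(changes):
    """Count how often each change type occurs."""
    return Counter(kind for kind, _, _ in changes)


if __name__ == "__main__":
    base = "the quick brown fox jumps".split()
    witness = "the brown quick fox leaps".split()
    print(change_statistics(interpret_changes(base, witness)))
```

Keeping the interpretation separate from the alignment means the alignment stage could be swapped for a real collation engine without touching the classification or the statistics.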