004 Data processing; Computer science
Year of publication
- 2011 (17)
Document Type
- Preprint (10)
- Doctoral Thesis (5)
- Journal article (2)
Keywords
- Quran (7)
- Koran (6)
- Text Mining (6)
- Bayesian classifier (3)
- Textvergleich (3)
- Base text (2)
- Content Management (2)
- Gothenburg model (2)
- Knowledge Management (2)
- Maschinelles Lernen (2)
- Meta-model (2)
- Text mining (2)
- Textual alterations weighting system (2)
- Textual document collation (2)
- Visualisierung (2)
- Wissensmanagement (2)
- 26S RDNA Data (1)
- Approximationsalgorithmus (1)
- Aufwandsanalyse (1)
- Automatisierte Prüfungskorrektur (1)
- Barcodes (1)
- Bayes-Klassifikator (1)
- Bit Parallelität (1)
- Bodenstation (1)
- Boolean Grammar (1)
- Boolesche Grammatik (1)
- Causes of revelation (1)
- Chapters arrangement (1)
- Chronology of revelation (1)
- Clustering (1)
- Colonial volvocales chlorophyta (1)
- Cost Analysis (1)
- DNA (1)
- Dasycladales chlorophyta (1)
- Dienstgüte (1)
- Distributed Space Systems (1)
- Drahtloses Sensorsystem (1)
- Drahtloses vermaschtes Netz (1)
- Educational Measurement (I2.399) (1)
- Energieeffizienz (1)
- Energy efficiency (1)
- Fairness (1)
- Frames (1)
- Gothenburg Modell (1)
- Gothenburg model of collation process (1)
- Ground Station Networks (1)
- IEEE 802.11 (1)
- IEEE 802.15.4 (1)
- Information Retrieval (1)
- Information Visualization (1)
- Information-Retrieval-System (1)
- Invertierte Liste (1)
- Kleinsatellit (1)
- Knowledge Modeling (1)
- Knowledge representation (1)
- Komplexitätstheorie (1)
- Konzeptsuche (1)
- Land plants (1)
- Lawhul-Mahfuz (1)
- Link rate adaptation (1)
- Linkratenanpassung (1)
- Mehrkriterielle Optimierung (1)
- Modellierung (1)
- Molecular systematics (1)
- Multiple-Choice Examination (1)
- Multiple-Choice Prüfungen (1)
- Naïve Bayesian (1)
- Network Management (1)
- Network Virtualization (1)
- Netzwerkmanagement (1)
- Netzwerkvirtualisierung (1)
- Nuclear RDNA (1)
- Optimierung (1)
- Place of revelation (1)
- Profile distances (1)
- QoE (1)
- QoS (1)
- Quality of Experience (1)
- Quality of Service (1)
- RBCL Gene-sequences (1)
- Reconstruction of original text (1)
- Scatter Plot (1)
- Scheduling (1)
- Secondary structure (1)
- Self-Evaluation Programs (I2.399.780) (1)
- Small Satellites (1)
- Softwarearchitektur (1)
- Stages of Prophet Mohammad’s messengership (1)
- Statistical classifiers (1)
- Suchverfahren (1)
- Support Vector Machine (1)
- Text categorization (1)
- Text segmentation (1)
- Theoretische Informatik (1)
- Travelling-salesman-Problem (1)
- Verteiltes System (1)
- Visual Text Mining (1)
- Visualization (1)
- Volltextsuche (1)
- Wissensrepräsentation (1)
- XML (1)
- XML model (1)
- bit-parallel (1)
- concept search (1)
- distance-based classifier (1)
- full-text search (1)
- interactive collation of textual variants (1)
- n-Gramm (1)
- n-gram (1)
- q-Gramm (1)
- q-gram (1)
- service based software architecture (1)
- service brokerage (1)
- text categorization (1)
A Knowledge-based Hybrid Statistical Classifier for Reconstructing the Chronology of the Quran
(2011)
Computational categorization of the Quran's chapters has mainly been confined to determining their places of revelation. However, this broad classification is not sufficient to understand and interpret the Quran effectively and thoroughly. Knowing the chronology of revelation would not only improve comprehension of the philosophy of Islam, but also make its laws and recommendations easier to implement and memorize. This paper attempts to estimate chapters' possible dates of revelation from their lexical frequency profiles. A hybrid statistical classifier, consisting of stemming and clustering algorithms for comparing the lexical frequency profiles of chapters and deriving dates of revelation, has been developed. The classifier is trained on chapters with known dates of revelation; it then classifies chapters with uncertain dates of revelation by computing their proximity to the training ones. The results reported here indicate that the proposed methodology yields usable estimates of the revelation dates of the Quran's chapters based on their lexical contents.
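A minimal sketch of the proximity step described above, assuming plain whitespace tokenization in place of the paper's actual Arabic stemming and clustering; the function names and the training pairs are illustrative, not the authors' code.

```python
# Sketch: date a chapter by proximity of its lexical frequency profile
# to profiles of chapters with known dates of revelation.
from collections import Counter
import math

def profile(text):
    """Lexical frequency profile: relative frequency of each token."""
    tokens = text.lower().split()      # stand-in for real Arabic stemming
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def estimate_period(chapter_text, training):
    """training: list of (known_period, chapter_text) pairs."""
    p = profile(chapter_text)
    # Assign the period of the closest training chapter.
    return max(training, key=lambda t: cosine(p, profile(t[1])))[0]
```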
Given a collection of diverging documents about some lost original text, anyone interested in the text would try to reconstruct it from the diverging documents. Whether one follows eclecticism, stemmatics, or the copy-text method, one is expected to explicitly or implicitly select one of the documents as a starting point, or base text, which is then emended through comparison with the remaining documents, so that a text that can be designated as the original is generated. Unfortunately, giving priority to one of the documents, also known as witnesses, is a subjective approach. In fact, even cladistics, which can be considered a computer-based implementation of stemmatics, does not recommend a particular witness as a starting point for reconstructing the original document. In this study, a computational method based on a rule-based Bayesian classifier is used to assist text scholars in reconstructing a non-extant document from the available witnesses. The method consists of successively selecting each document as base text and collating it with the remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, the witness selected most often by the majority of base texts is considered the probable base text of the collection. Witnesses' scores are weighted using a weighting system based on the effects that types of textual modifications have on the reconstruction of original documents. Users can choose between baseless and base-text collation. If a base text is selected, the task reduces to ranking the witnesses with respect to the base text; otherwise, a base text as well as a ranking of the witnesses with respect to it are computed and displayed on a histogram.
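A rough sketch of the successive base-text selection and vote tally described above. The paper's weighting system for types of textual modifications is not reproduced; difflib's similarity ratio stands in for it, which is purely an assumption.

```python
# Sketch: rotate the base-text role over all witnesses, record each
# cycle's closest witness, and vote for the probable base text.
from collections import Counter
from difflib import SequenceMatcher

def weighted_score(a, b):
    # Placeholder for the paper's weighting of similarities/differences.
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def probable_base_text(witnesses):
    """witnesses: dict mapping witness id -> text."""
    votes = Counter()
    for base_id, base in witnesses.items():
        closest = max((w for w in witnesses if w != base_id),
                      key=lambda w: weighted_score(base, witnesses[w]))
        votes[closest] += 1           # each cycle records its closest witness
    return votes.most_common(1)[0][0]  # witness chosen most often wins
```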
The Quran is the holy book of Islam, consisting of 6236 verses divided into 114 chapters called suras. Many verses are similar and even identical. Searching for similar texts (e.g. verses) can return thousands of verses which, when displayed completely or partly as a textual list, make analysis and understanding difficult and confusing. Moreover, it would be visually impossible to grasp at a glance the overall distribution of the retrieved verses in the Quran. As a consequence, reading and analyzing the verses would be tedious and unintuitive. In this study, a combination of interactive scatter plots and tables has been developed to assist analysis and understanding of the search results. Retrieved verses are clustered by chapter, and a weight is assigned to each cluster according to the number of verses it contains, so that users can visually identify the most relevant areas and determine the places of revelation of the verses. Users visualize the complete result set, can select a region of the plot to zoom in, and can click on a marker to display a table containing the verses with their English translation side by side.
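The weighted scatter view can be illustrated in a few lines of matplotlib; the `hits` structure, the coordinate scheme, and the size scaling below are assumptions made for the sketch, not details from the paper.

```python
# Sketch: plot retrieved verses with marker size encoding cluster weight
# (number of retrieved verses per chapter).
from collections import Counter
import matplotlib.pyplot as plt

def plot_hits(hits):
    """hits: list of (chapter, verse) pairs from a similarity search."""
    weights = Counter(ch for ch, _ in hits)        # verses retrieved per chapter
    xs = [ch for ch, _ in hits]
    ys = [v for _, v in hits]
    sizes = [20 * weights[ch] for ch, _ in hits]   # heavier cluster -> bigger marker
    plt.scatter(xs, ys, s=sizes, alpha=0.5, picker=True)  # picker enables click events
    plt.xlabel("Chapter (sura)")
    plt.ylabel("Verse number")
    plt.show()
```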
Learning a book generally involves reading it, underlining important words, adding comments, summarizing some passages, and marking up text or concepts. Once deeper understanding is achieved, one would like to organize and manage this knowledge in such a way that it can be easily remembered and efficiently transmitted to others. In this paper, books organized into chapters consisting of verses are considered the source of knowledge to be modeled. The knowledge model consists of verses with their metadata and semantic annotations. The metadata represent the multiple perspectives of knowledge modeling. Verses with their metadata and annotations form a meta-model, which is published on a web mashup. The meta-model, with links between its elements, constitutes a knowledge base. An XML-based annotation system that breaks the learning process down into specific tasks helps construct the desired meta-model. The system is made up of user interfaces for creating metadata, annotating chapters' contents according to user-selected semantics, and templates for publishing the generated knowledge on the Internet. The proposed software system improves comprehension and retention of the knowledge contained in religious texts through modeling and visualization. It has been applied to the Quran, and the results show that multiple perspectives of information modeling can be successfully applied to religious texts. It is hoped that this short ongoing study will motivate others to devise and offer software systems for cross-religion learning.
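A minimal sketch of what such an XML meta-model of a verse might look like; the element and attribute names are illustrative guesses, not the system's actual schema.

```python
# Sketch: build a verse element carrying metadata and semantic annotations.
import xml.etree.ElementTree as ET

def verse_element(chapter, number, text, metadata, annotations):
    verse = ET.Element("verse", chapter=str(chapter), number=str(number))
    ET.SubElement(verse, "text").text = text
    meta = ET.SubElement(verse, "metadata")
    for key, value in metadata.items():        # multiple modeling perspectives
        ET.SubElement(meta, key).text = value
    for semantic, span in annotations:         # user-selected semantics
        ET.SubElement(verse, "annotation", type=semantic).text = span
    return verse

root = ET.Element("knowledgebase")
root.append(verse_element(1, 1, "In the name of God...",
                          {"theme": "opening"}, [("concept", "mercy")]))
print(ET.tostring(root, encoding="unicode"))
```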
Introduction:
Multiple-choice examinations still play a prominent role in faculty-internal medical assessment. Apart from questions of content, the issue arises of how their technical handling can be optimized. Lecturers in medicine increasingly have three options for conducting MC examinations: paper-based exams with or without computer support, or fully electronic exams. Critical factors are the effort of formatting the exam, the logistical effort of administering it, the quality, speed, and effort of grading, the provision of documents for post-exam review, and the statistical analysis of the exam results.
Methods:
At the Universität Würzburg, a computer program for entering and formatting the MC questions of medical and other paper-based exams has been in use and under refinement for three semesters. With it, eleven medical exams were produced in the winter semester (WS) 2009/2010, twelve in the summer semester (SS) 2010, and thirteen in WS 2010/11, and the scanned answer sheets were subsequently evaluated automatically. In the last two semesters, the effort involved was logged.
Results:
For an average exam with about 140 participants and about 35 questions, the effort for formatting and evaluation, including subsequent adjustments of the evaluation, fell from 5-7 hours for exams without complications in WS 2009/2010 to about 2 hours in SS 2010 and about 1.5 hours in WS 2010/11. Including exams with complications during evaluation, the average time per exam was about 3 hours in SS 2010 and about 2.67 hours in WS 10/11.
Discussion:
For conventional multiple-choice examinations, computer-assisted formatting and evaluation of paper-based exams offers lecturers a considerable time advantage over manual grading, and compared with fully electronic exams it requires a much simpler technical infrastructure and less staff during the examination itself.
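Once the scanned answer sheets have been recognized into per-student answer vectors, the automatic evaluation described above reduces to comparing each vector against an answer key. The following toy sketch, with invented names and data layout, shows that reduction; it is not the Würzburg program.

```python
# Sketch: score recognized multiple-choice answer sheets against a key.
def score_sheets(answer_key, sheets):
    """answer_key: list of correct options; sheets: dict student -> answers."""
    results = {}
    for student, answers in sheets.items():
        correct = sum(a == k for a, k in zip(answers, answer_key))
        results[student] = correct
    return results

key = ["B", "D", "A"]                              # 3-question toy exam
sheets = {"s01": ["B", "D", "C"], "s02": ["B", "D", "A"]}
print(score_sheets(key, sheets))                   # {'s01': 2, 's02': 3}
```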
Given a collection of diverging documents about some lost original text, anyone interested in the text would try to reconstruct it from the diverging documents. Whether one follows eclecticism, stemmatics, or the copy-text method, one is expected to explicitly or implicitly select one of the documents as a starting point, or base text, which is then emended through comparison with the remaining documents, so that a text that can be designated as the original is generated. Unfortunately, giving priority to one of the documents, also known as witnesses, is a subjective approach. In fact, even cladistics, which can be considered a computer-based implementation of stemmatics, does not recommend a particular witness as a starting point for reconstructing the original document. In this study, a computational method based on a rule-based Bayesian classifier is used to assist text scholars in reconstructing a non-extant document from the available witnesses. The method consists of successively selecting each document as base text and collating it with the remaining documents. Each completed collation cycle stores the selected base text and its closest witness, along with a weighted score of their similarities and differences. At the end of the collation process, the witness selected most often by the majority of base texts is considered the probable base text of the collection. Witnesses' scores are weighted using a weighting system based on the effects that types of textual modifications have on the reconstruction of original documents. Users can choose between baseless and base-text collation. If a base text is selected, the task reduces to ranking the witnesses with respect to the base text; otherwise, a base text as well as a ranking of the witnesses with respect to it are computed and displayed on a bar diagram. Additionally, this study includes a recursive algorithm for automatically reconstructing the original text from the identified base text and the ranked witnesses.
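The abstract does not spell out the recursive reconstruction algorithm, so the following is only a speculative illustration of one recursive merge pass, under the (naive) assumption of positional token alignment; a real collation would align tokens properly first.

```python
# Speculative sketch: recursively merge the base text with its ranked
# witnesses, resolving disagreements by support among remaining witnesses.
from collections import Counter

def reconstruct(base, witnesses):
    """base: token list; witnesses: token lists ordered closest first."""
    if not witnesses:
        return base
    closest, rest = witnesses[0], witnesses[1:]
    merged = []
    for i, token in enumerate(base):
        rival = closest[i] if i < len(closest) else token
        if rival == token:
            merged.append(token)
        else:
            # Keep the reading that more of the remaining witnesses support.
            support = Counter(w[i] for w in rest if i < len(w))
            merged.append(token if support[token] >= support[rival] else rival)
    return reconstruct(merged, rest)   # recurse until all witnesses consumed
```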
Computing Generic Causes of Revelation of the Quranic Verses Using Machine Learning Techniques
(2011)
Because many verses of the holy Quran are similar, there is a high probability that similar verses addressing the same issues share the same generic causes of revelation. In this study, machine learning techniques are employed to automatically derive the causes of revelation of Quranic verses. The derivation of the causes of revelation is viewed as a classification problem. Initially, the categories are based on the verses with known causes of revelation, and the test set consists of the remaining verses. Based on a computed threshold value, a naïve Bayesian classifier is used to categorize some verses. After that, a decision tree classifier separates the remaining uncategorized verses into those that contain indicators (resultative connectors, causative expressions…) and those that do not. Each verse containing indicators is segmented into its constituent clauses by identifying the linking indicators; a dominant clause is then extracted and either taken as the cause of revelation or post-processed by adding or removing some terms to form a causal clause that constitutes the cause of revelation. For the remaining unclassified verses without indicators, a naïve Bayesian classifier is again used to assign each of them to one of the existing classes based on feature and topic similarity. Verses that could still not be classified were classified manually, treating each verse as a category of its own. The results obtained in this study are encouraging and show that automatic derivation of the generic causes of revelation of Quranic verses is achievable and reasonably reliable for understanding and implementing the teachings of the Quran.
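A hedged sketch of the first, thresholded naïve-Bayes stage, using scikit-learn; the English indicator words, threshold value, and routing labels are placeholders for the Arabic markers and downstream stages described above.

```python
# Sketch: categorize verses whose naive-Bayes confidence clears a threshold;
# defer the rest either to clause segmentation (if they carry indicators)
# or to a second classification pass.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

INDICATORS = ["therefore", "because", "so that"]   # stand-ins for Arabic markers

def classify_verses(train_texts, train_causes, test_texts, threshold=0.8):
    vec = CountVectorizer()
    nb = MultinomialNB().fit(vec.fit_transform(train_texts), train_causes)
    probs = nb.predict_proba(vec.transform(test_texts))
    accepted, deferred = {}, []
    for verse, p in zip(test_texts, probs):
        if p.max() >= threshold:                     # confident: categorize now
            accepted[verse] = nb.classes_[p.argmax()]
        elif any(ind in verse for ind in INDICATORS):
            deferred.append(("segment", verse))      # clause-segmentation stage
        else:
            deferred.append(("re-classify", verse))  # second naive-Bayes pass
    return accepted, deferred
```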
Learning a book generally involves reading it, underlining important words, adding comments, summarizing some passages, and marking up text or concepts. Once deeper understanding is achieved, one would like to organize and manage this knowledge in such a way that it can be easily remembered and efficiently transmitted to others. This paper discusses modeling religious texts with semantic XML markup based on frame-based knowledge representation, with the purpose of assisting understanding, retention, and sharing of the knowledge they contain. In this study, books organized into chapters made up of verses are considered the source of knowledge to model. Metadata representing the multiple perspectives of knowledge modeling are assigned to each chapter and verse. Chapters and verses with their metadata form a meta-model, which is represented using frames and published on a web mashup. An XML-based annotation and visualization system has been developed, equipped with user interfaces for creating static and dynamic metadata, annotating chapters' contents according to user-selected semantics, and templates for publishing the generated knowledge on the Internet. The system has been applied to the Quran, and the results show that multiple perspectives of information modeling can be successfully applied to religious texts in order to support analysis, understanding, and retention of the texts.
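A toy rendering of the frame-based representation: each verse frame carries slots for static metadata, dynamic metadata, and annotations. The slot names are illustrative guesses, not the paper's actual frame vocabulary.

```python
# Sketch: a verse frame with slots for the modeling perspectives above.
from dataclasses import dataclass, field

@dataclass
class VerseFrame:
    chapter: int
    number: int
    text: str
    static_metadata: dict = field(default_factory=dict)   # e.g. place of revelation
    dynamic_metadata: dict = field(default_factory=dict)  # user-supplied perspectives
    annotations: list = field(default_factory=list)       # (semantic, text span) pairs

frame = VerseFrame(1, 1, "In the name of God...",
                   static_metadata={"revelation": "Mecca"},
                   annotations=[("concept", "mercy")])
```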
Design and Implementation of Architectures for Interactive Textual Documents Collation Systems
(2011)
One of the main purposes of collating textual documents is to identify a base text, or the witness closest to the base text, by analyzing and interpreting the differences, also known as types of changes, that may exist between those documents. It is therefore reasonable to argue that explicit identification of types of changes such as deletions, additions, transpositions, and mutations should be part of the collation process. This identification can be carried out by an interpretation module after alignment has taken place. Unfortunately, existing collation software such as CollateX and Juxta's collation engine has no interpretation module; both implement the Gothenburg model [1] of the collation process, which does not include an interpretation unit. Currently, neither CollateX nor Juxta's collation engine distinguishes between the types of changes in its critical apparatus, and neither offers statistics about those changes. This paper presents a model for both integrated and distributed collation processes that improves on the Gothenburg model by introducing an interpretation component for computing and distinguishing between the types of changes that documents may have undergone. Two architectures implementing the model to solve the problem of interactive collation are discussed as well. Each architecture uses the CollateX library and provides, on the one hand, preprocessing functions for transforming input documents into CollateX's input format and, on the other hand, a post-processing module enabling interactive collation. Finally, simple algorithms for distinguishing between types of changes and for linking the collated source documents with the collation results are introduced.
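An interpretation module of the kind proposed can be illustrated on top of any token alignment. In this sketch, difflib stands in for the alignment CollateX would produce, and transposition detection is left to a second pass over the collected changes.

```python
# Sketch: classify aligned differences into types of changes.
from difflib import SequenceMatcher

def classify_changes(base_tokens, witness_tokens):
    changes = []
    sm = SequenceMatcher(None, base_tokens, witness_tokens)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op == "delete":
            changes.append(("deletion", base_tokens[i1:i2]))
        elif op == "insert":
            changes.append(("addition", witness_tokens[j1:j2]))
        elif op == "replace":
            changes.append(("mutation", base_tokens[i1:i2], witness_tokens[j1:j2]))
    # A deletion/addition pair with identical content would indicate a
    # transposition; detecting that requires a second pass over `changes`.
    return changes

print(classify_changes("a b c d".split(), "a c d e".split()))
```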
The field of small-satellite formations and constellations has attracted growing attention, based on recent advances in small-satellite engineering. The utilization of distributed space systems allows the realization of innovative applications and will enable improved temporal and spatial resolution in observation scenarios. On the other hand, this new paradigm imposes a variety of research challenges. In this monograph, new networking concepts for space missions using networks of ground stations are presented. The developed approaches combine ground station resources in a coordinated way to achieve more robust and efficient communication links. Within this thesis, the following topics were elaborated to improve the performance of distributed space missions: Appropriate scheduling of contact windows in a distributed ground system is necessary to avoid low utilization of ground stations. The theoretical basis for the novel concept of redundant scheduling was elaborated in detail. In addition to the presented algorithm, a scheduling system was implemented, and its performance was tested extensively on real-world scheduling problems. In the scope of data management, a system was developed that autonomously synchronizes data frames in ground station networks and uses this information to detect and correct transmission errors. The system was validated with hardware-in-the-loop experiments, demonstrating the benefits of the developed approach.
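The frame-synchronization idea can be illustrated with a toy stand-in for the thesis's system: several ground stations receive the same downlink frame with independent bit errors, and a byte-wise majority vote across the synchronized copies recovers the original frame.

```python
# Sketch: byte-wise majority vote over synchronized frame copies from
# multiple ground stations to correct independent transmission errors.
from collections import Counter

def majority_frame(received_frames):
    """received_frames: list of equal-length byte strings, one per station."""
    return bytes(
        Counter(frame[i] for frame in received_frames).most_common(1)[0][0]
        for i in range(len(received_frames[0]))
    )

copies = [b"\x10\x2a\xff", b"\x10\x2b\xff", b"\x10\x2a\xfe"]
print(majority_frame(copies).hex())   # "102aff": single-station errors voted out
```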