OPUS Würzburg

Refine

Has Fulltext

yes (7)

7 search hits

1 to 7

Sort by

Privacy aware social information retrieval and spam filtering using folksonomies (2015)

Social interactions as introduced by Web 2.0 applications during the last decade have changed the way the Internet is used. Today, it is part of our daily lives to maintain contacts through social networks, to comment on the latest developments in microblogging services or to save and share information snippets such as photos or bookmarks online. Social bookmarking systems are part of this development. Users can share links to interesting web pages by publishing bookmarks and providing descriptive keywords for them. The structure which evolves from the collection of annotated bookmarks is called a folksonomy. The sharing of interesting and relevant posts enables new ways of retrieving information from the Web. Users can search or browse the folksonomy looking at resources related to specific tags or users. Ranking methods known from search engines have been adjusted to facilitate retrieval in social bookmarking systems. Hence, social bookmarking systems have become an alternative or addendum to search engines. In order to better understand the commonalities and differences of social bookmarking systems and search engines, this thesis compares several aspects of the two systems' structure, usage behaviour and content. This includes the use of tags and query terms, the composition of the document collections and the rankings of bookmarks and search engine URLs. Searchers (recorded via session ids), their search terms and the clicked on URLs can be extracted from a search engine query logfile. They form similar links as can be found in folksonomies where a user annotates a resource with tags. We use this analogy to build a tripartite hypergraph from query logfiles (a logsonomy), and compare structural and semantic properties of log- and folksonomies. Overall, we have found similar behavioural, structural and semantic characteristics in both systems. Driven by this insight, we investigate, if folksonomy data can be of use in web information retrieval in a similar way to query log data: we construct training data from query logs and a folksonomy to build models for a learning-to-rank algorithm. First experiments show a positive correlation of ranking results generated from the ranking models of both systems. The research is based on various data collections from the social bookmarking systems BibSonomy and Delicious, Microsoft's search engine MSN (now Bing) and Google data. To maintain social bookmarking systems as a good source for information retrieval, providers need to fight spam. This thesis introduces and analyses different features derived from the specific characteristics of social bookmarking systems to be used in spam detection classification algorithms. Best results can be derived from a combination of profile, activity, semantic and location-based features. Based on the experiments, a spam detection framework which identifies and eliminates spam activities for the social bookmarking system BibSonomy has been developed. The storing and publication of user-related bookmarks and profile information raises questions about user data privacy. What kinds of personal information is collected and how do systems handle user-related items? In order to answer these questions, the thesis looks into the handling of data privacy in the social bookmarking system BibSonomy. Legal guidelines about how to deal with the private data collected and processed in social bookmarking systems are also presented. Experiments will show that the consideration of user data privacy in the process of feature design can be a first step towards strengthening data privacy.

Unified Monitoring of Spacecrafts (2015)

Dannemann, Frank

Within this thesis a new philosophy in monitoring spacecrafts is presented: the unification of the various kinds of monitoring techniques used during the different lifecylce phases of a spacecraft. The challenging requirements being set for this monitoring framework are: - "separation of concerns" as a design principle (dividing the steps of logging from registered sources, sending to connected sinks and displaying of information), - usage during all mission phases, - usage by all actors (EGSE engineers, groundstation operators, etc.), - configurable at runtime, especially regarding the level of detail of logging information, and - very low resource consumption. First a prototype of the monitoring framework was developed as a support library for the real-time operating system RODOS. This prototype was tested on dedicated hardware platforms relevant for space, and also on a satellite demonstrator used for educational purposes. As a second step, the results and lessons learned from the development and usage of this prototype were transfered to a real space mission: the first satellite of the DLR compact satellite series - a space based platform for DLR's own research activities. Within this project, the software of the avionic subsystem was supplemented by a powerful logging component, which enhances the traditional housekeeping capabilities and offers extensive filtering and debugging techniques for monitoring and FDIR needs. This logging component is the major part of the flight version of the monitoring framework. It is completed by counterparts running on the development computers and as well as the EGSE hardware in the integration room, making it most valuable already in the earliest stages of traditional spacecraft development. Future plans in terms of adding support from the groundstation as well will lead to a seamless integration of the monitoring framework not only into to the spacecraft itself, but into the whole space system.

Dynamic Label Placement in Practice (2015)

Schwartges, Nadine

The general map-labeling problem is as follows: given a set of geometric objects to be labeled, or features, in the plane, and for each feature a set of label positions, maximize the number of placed labels such that there is at most one label per feature and no two labels overlap. There are three types of features in a map: point, line, and area features. Unfortunately, one cannot expect to find efficient algorithms that solve the labeling problem optimally. Interactive maps are digital maps that only show a small part of the entire map whereas the user can manipulate the shown part, the view, by continuously panning, zooming, rotating, and tilting (that is, changing the perspective between a top and a bird view). An example for the application of interactive maps is in navigational devices. Interactive maps are challenging in that the labeling must be updated whenever labels leave the view and, while zooming, the label size must be constant on the screen (which either makes space for further labels or makes labels overlap when zooming in or out, respectively). These updates must be computed in real time, that is, the computation must be so fast that the user does not notice that we spend time on the computation. Additionally, labels must not jump or flicker, that is, labels must not suddenly change their positions or, while zooming out, a vanished label must not appear again. In this thesis, we present efficient algorithms that dynamically label point and line features in interactive maps. We try to label as many features as possible while we prohibit labels that overlap, jump, and flicker. We have implemented all our approaches and tested them on real-world data. We conclude that our algorithms are indeed real-time capable.

UI-, User-, & Usability-Oriented Engineering of Participative Knowledge-Based Systems (2015)

Freiberg, Martina

Knowledge-based systems (KBS) face an ever-increasing interest in various disciplines and contexts. Yet, the former aim to construct the ’perfect intelligent software’ continuously shifts to user-centered, participative solutions. Such systems enable users to contribute their personal knowledge to the problem solving process for increased efficiency and an ameliorated user experience. More precisely, we define non-functional key requirements of participative KBS as: Transparency (encompassing KBS status mediation), configurability (user adaptability, degree of user control/exploration), quality of the KB and UI, and evolvability (enabling the KBS to grow mature with their users). Many of those requirements depend on the respective target users, thus calling for a more user-centered development. Often, also highly expertise domains are targeted — inducing highly complex KBs — which requires a more careful and considerate UI/interaction design. Still, current KBS engineering (KBSE) approaches mostly focus on knowledge acquisition (KA) This often leads to non-optimal, little reusable, and non/little evaluated KBS front-end solutions. In this thesis we propose a more encompassing KBSE approach. Due to the strong mutual influences between KB and UI, we suggest a novel form of intertwined UI and KB development. We base the approach on three core components for encompassing KBSE: (1) Extensible prototyping, a tailored form of evolutionary prototyping; this builds on mature UI prototypes and offers two extension steps for the anytime creation of core KBS prototypes (KB + core UI) and fully productive KBS (core KBS prototype + common framing functionality). (2) KBS UI patterns, that define reusable solutions for the core KBS UI/interaction; we provide a basic collection of such patterns in this work. (3) Suitable usability instruments for the assessment of the KBS artifacts. Therewith, we do not strive for ’yet another’ self-contained KBS engineering methodology. Rather, we motivate to extend existing approaches by the proposed key components. We demonstrate this based on an agile KBSE model. For practical support, we introduce the tailored KBSE tool ProKEt. ProKEt offers a basic selection of KBS core UI patterns and corresponding configuration options out of the box; their further adaption/extension is possible on various levels of expertise. For practical usability support, ProKEt offers facilities for quantitative and qualitative data collection. ProKEt explicitly fosters the suggested, intertwined development of UI and KB. For seamlessly integrating KA activities, it provides extension points for two selected external KA tools: For KnowOF, a standard office based KA environment. And for KnowWE, a semantic wiki for collaborative KA. Therewith, ProKEt offers powerful support for encompassing, user-centered KBSE. Finally, based on the approach and the tool, we also developed a novel KBS type: Clarification KBS as a mashup of consultation and justification KBS modules. Those denote a specifically suitable realization for participative KBS in highly expertise contexts and consequently require a specific design. In this thesis, apart from more common UI solutions, we particularly also introduce KBS UI patterns especially tailored towards Clarification KBS.

Optimization and Design of Network Architectures for Future Internet Routing (2015)

Hartmann, Matthias

At the center of the Internet’s protocol stack stands the Internet Protocol (IP) as a common denominator that enables all communication. To make routing efficient, resilient, and scalable, several aspects must be considered. Care must be taken that traffic is well balanced to make efficient use of the existing network resources, both in failure free operation and in failure scenarios. Finding the optimal routing in a network is an NP-complete problem. Therefore, routing optimization is usually performed using heuristics. This dissertation shows that a routing optimized with one objective function is often not good when looking at other objective functions. It can even be worse than unoptimized routing with respect to that objective function. After looking at failure-free routing and traffic distribution in different failure scenarios, the analysis is extended to include the loop-free alternate (LFA) IP fast reroute mechanism. Different application scenarios of LFAs are examined and a special focus is set on the fact that LFAs usually cannot protect all traffic in a network even against single link failures. Thus, the routing optimization for LFAs is targeted on both link utilization and failure coverage. Finally, the pre-congestion notification mechanism PCN for network admission control and overload protection is analyzed and optimized. Different design options for implementing the protocol are compared, before algorithms are developed for the calculation and optimization of protocol parameters and PCN-based routing. The second part of the thesis tackles a routing problem that can only be resolved on a global scale. The scalability of the Internet is at risk since a major and intensifying growth of the interdomain routing tables has been observed. Several protocols and architectures are analyzed that can be used to make interdomain routing more scalable. The most promising approach is the locator/identifier (Loc/ID) split architecture which separates routing from host identification. This way, changes in connectivity, mobility of end hosts, or traffic-engineering activities are hidden from the routing in the core of the Internet and the routing tables can be kept much smaller. All of the currently proposed Loc/ID split approaches have their downsides. In particular, the fact that most architectures use the ID for routing outside the Internet’s core is a poor design, which inhibits many of the possible features of a new routing architecture. To better understand the problems and to provide a solution for a scalable routing design that implements a true Loc/ID split, the new GLI-Split protocol is developed in this thesis, which provides separation of global and local routing and uses an ID that is independent from any routing decisions. Besides GLI-Split, several other new routing architectures implementing Loc/ID split have been proposed for the Internet. Most of them assume that a mapping system is queried for EID-to-RLOC mappings by an intermediate node at the border of an edge network. When the mapping system is queried by an intermediate node, packets are already on their way towards their destination, and therefore, the mapping system must be fast, scalable, secure, resilient, and should be able to relay packets without locators to nodes that can forward them to the correct destination. The dissertation develops a classification for all proposed mapping system architectures and shows their similarities and differences. Finally, the fast two-level mapping system FIRMS is developed. It includes security and resilience features as well as a relay service for initial packets of a flow when intermediate nodes encounter a cache miss for the EID-to-RLOC mapping.

Performance Assessment of Resource Management Strategies for Cellular and Wireless Mesh Networks (2015)

Wamser, Florian

The rapid growth in the field of communication networks has been truly amazing in the last decades. We are currently experiencing a continuation thereof with an increase in traffic and the emergence of new fields of application. In particular, the latter is interesting since due to advances in the networks and new devices, such as smartphones, tablet PCs, and all kinds of Internet-connected devices, new additional applications arise from different areas. What applies for all these services is that they come from very different directions and belong to different user groups. This results in a very heterogeneous application mix with different requirements and needs on the access networks. The applications within these networks typically use the network technology as a matter of course, and expect that it works in all situations and for all sorts of purposes without any further intervention. Mobile TV, for example, assumes that the cellular networks support the streaming of video data. Likewise, mobile-connected electricity meters rely on the timely transmission of accounting data for electricity billing. From the perspective of the communication networks, this requires not only the technical realization for the individual case, but a broad consideration of all circumstances and all requirements of special devices and applications of the users. Such a comprehensive consideration of all eventualities can only be achieved by a dynamic, customized, and intelligent management of the transmission resources. This management requires to exploit the theoretical capacity as much as possible while also taking system and network architecture as well as user and application demands into account. Hence, for a high level of customer satisfaction, all requirements of the customers and the applications need to be considered, which requires a multi-faceted resource management. The prerequisite for supporting all devices and applications is consequently a holistic resource management at different levels. At the physical level, the technical possibilities provided by different access technologies, e.g., more transmission antennas, modulation and coding of data, possible cooperation between network elements, etc., need to be exploited on the one hand. On the other hand, interference and changing network conditions have to be counteracted at physical level. On the application and user level, the focus should be on the customer demands due to the currently increasing amount of different devices and diverse applications (medical, hobby, entertainment, business, civil protection, etc.). The intention of this thesis is the development, investigation, and evaluation of a holistic resource management with respect to new application use cases and requirements for the networks. Therefore, different communication layers are investigated and corresponding approaches are developed using simulative methods as well as practical emulation in testbeds. The new approaches are designed with respect to different complexity and implementation levels in order to cover the design space of resource management in a systematic way. Since the approaches cannot be evaluated generally for all types of access networks, network-specific use cases and evaluations are finally carried out in addition to the conceptual design and the modeling of the scenario. The first part is concerned with management of resources at physical layer. We study distributed resource allocation approaches under different settings. Due to the ambiguous performance objectives, a high spectrum reuse is conducted in current cellular networks. This results in possible interference between cells that transmit on the same frequencies. The focus is on the identification of approaches that are able to mitigate such interference. Due to the heterogeneity of the applications in the networks, increasingly different application-specific requirements are experienced by the networks. Consequently, the focus is shifted in the second part from optimization of network parameters to consideration and integration of the application and user needs by adjusting network parameters. Therefore, application-aware resource management is introduced to enable efficient and customized access networks. As indicated before, approaches cannot be evaluated generally for all types of access networks. Consequently, the third contribution is the definition and realization of the application-aware paradigm in different access networks. First, we address multi-hop wireless mesh networks. Finally, we focus with the fourth contribution on cellular networks. Application-aware resource management is applied here to the air interface between user device and the base station. Especially in cellular networks, the intensive cost-driven competition among the different operators facilitates the usage of such a resource management to provide cost-efficient and customized networks with respect to the running applications.

Context-specific Consistencies in Information Extraction: Rule-based and Probabilistic Approaches (2015)

Klügl, Peter

Large amounts of communication, documentation as well as knowledge and information are stored in textual documents. Most often, these texts like webpages, books, tweets or reports are only available in an unstructured representation since they are created and interpreted by humans. In order to take advantage of this huge amount of concealed information and to include it in analytic processes, it needs to be transformed into a structured representation. Information extraction considers exactly this task. It tries to identify well-defined entities and relations in unstructured data and especially in textual documents. Interesting entities are often consistently structured within a certain context, especially in semi-structured texts. However, their actual composition varies and is possibly inconsistent among different contexts. Information extraction models stay behind their potential and return inferior results if they do not consider these consistencies during processing. This work presents a selection of practical and novel approaches for exploiting these context-specific consistencies in information extraction tasks. The approaches direct their attention not only to one technique, but are based on handcrafted rules as well as probabilistic models. A new rule-based system called UIMA Ruta has been developed in order to provide optimal conditions for rule engineers. This system consists of a compact rule language with a high expressiveness and strong development support. Both elements facilitate rapid development of information extraction applications and improve the general engineering experience, which reduces the necessary efforts and costs when specifying rules. The advantages and applicability of UIMA Ruta for exploiting context-specific consistencies are illustrated in three case studies. They utilize different engineering approaches for including the consistencies in the information extraction task. Either the recall is increased by finding additional entities with similar composition, or the precision is improved by filtering inconsistent entities. Furthermore, another case study highlights how transformation-based approaches are able to correct preliminary entities using the knowledge about the occurring consistencies. The approaches of this work based on machine learning rely on Conditional Random Fields, popular probabilistic graphical models for sequence labeling. They take advantage of a consistency model, which is automatically induced during processing the document. The approach based on stacked graphical models utilizes the learnt descriptions as feature functions that have a static meaning for the model, but change their actual function for each document. The other two models extend the graph structure with additional factors dependent on the learnt model of consistency. They include feature functions for consistent and inconsistent entities as well as for additional positions that fulfill the consistencies. The presented approaches are evaluated in three real-world domains: segmentation of scientific references, template extraction in curricula vitae, and identification and categorization of sections in clinical discharge letters. They are able to achieve remarkable results and provide an error reduction of up to 30% compared to usually applied techniques.

1 to 7

Refine

Has Fulltext

Is part of the Bibliography

Year of publication

Document Type

Language

Keywords

Author

Institute

Sonstige beteiligte Institutionen

7 search hits