## 004 Datenverarbeitung; Informatik

### Refine

#### Year of publication

#### Document Type

- Doctoral Thesis (61)
- Journal article (33)
- Preprint (18)
- Conference Proceeding (2)
- Report (1)

#### Language

- English (115) (remove)

#### Keywords

- Leistungsbewertung (12)
- Quran (8)
- Robotik (8)
- Koran (7)
- Text Mining (7)
- Mobiler Roboter (6)
- Autonomer Roboter (5)
- Komplexitätstheorie (5)
- Netzwerk (5)
- Theoretische Informatik (5)

#### Institute

- Institut für Informatik (74)
- Theodor-Boveri-Institut für Biowissenschaften (20)
- Institut für deutsche Philologie (17)
- Graduate School of Science and Technology (2)
- Institut für Geographie und Geologie (1)
- Institut für Humangenetik (1)
- Institut für Mathematik (1)
- Institut für Pharmazie und Lebensmittelchemie (1)
- Institut für Theoretische Physik und Astrophysik (1)
- Lehrstuhl für Chemische Technologie der Materialsynthese (1)

This thesis is devoted to the study of computational complexity theory, a branch of theoretical computer science. Computational complexity theory investigates the inherent difficulty in designing efficient algorithms for computational problems. By doing so, it analyses the scalability of computational problems and algorithms and places practical limits on what computers can actually accomplish. Computational problems are categorised into complexity classes. Among the most important complexity classes are the class NP and the subclass of NP-complete problems, which comprises many important optimisation problems in the field of operations research. Moreover, with the P-NP-problem, the class NP represents the most important unsolved question in computer science. The first part of this thesis is devoted to the study of NP-complete-, and more generally, NP-hard problems. It aims at improving our understanding of this important complexity class by systematically studying how altering NP-hard sets affects their NP-hardness. This research is related to longstanding open questions concerning the complexity of unions of disjoint NP-complete sets, and the existence of sparse NP-hard sets. The second part of the thesis is also dedicated to complexity classes but takes a different perspective: In a sense, after investigating the interior of complexity classes in the first part, the focus shifts to the description of complexity classes and thereby to the exterior in the second part. It deals with the description of complexity classes through leaf languages, a uniform framework which allows us to characterise a great variety of important complexity classes. The known concepts are complemented by a new leaf-language model. To a certain extent, this new approach combines the advantages of the known models. The presented results give evidence that the connection between the theory of formal languages and computational complexity theory might be closer than formerly known.

Overlay networks establish logical connections between users on top of the physical network. While randomly connected overlay networks provide only a best effort service, a new generation of structured overlay systems based on Distributed Hash Tables (DHTs) was proposed by the research community. However, there is still a lack of understanding the performance of such DHTs. Additionally, those architectures are highly distributed and therefore appear as a black box to the operator. Yet an operator does not want to lose control over his system and needs to be able to continuously observe and examine its current state at runtime. This work addresses both problems and shows how the solutions can be combined into a more self-organizing overlay concept. At first, we evaluate the performance of structured overlay networks under different aspects and thereby illuminate in how far such architectures are able to support carrier-grade applications. Secondly, to enable operators to monitor and understand their deployed system in more detail, we introduce both active as well as passive methods to gather information about the current state of the overlay network.

Performance Evaluation of Efficient Resource Management Concepts for Next Generation IP Networks
(2007)

Next generation networks (NGNs) must integrate the services of current circuit-switched telephone networks and packet-switched data networks. This convergence towards a unified communication infrastructure necessitates from the high capital expenditures (CAPEX) and operational expenditures (OPEX) due to the coexistence of separate networks for voice and data. In the end, NGNs must offer the same services as these legacy networks and, therefore, they must provide a low-cost packet-switched solution with real-time transport capabilities for telephony and multimedia applications. In addition, NGNs must be fault-tolerant to guarantee user satisfaction and to support business-critical processes also in case of network failures. A key technology for the operation of NGNs is the Internet Protocol (IP) which evolved to a common and well accepted standard for networking in the Internet during the last 25 years. There are two basically different approaches to achieve QoS in IP networks. With capacity overprovisioning (CO), an IP network is equipped with sufficient bandwidth such that network congestion becomes very unlikely and QoS is maintained most of the time. The second option to achieve QoS in IP networks is admission control (AC). AC represents a network-inherent intelligence that admits real-time traffic flows to a single link or an entire network only if enough resources are available such that the requirements on packet loss and delay can be met. Otherwise, the request of a new flow is blocked. This work focuses on resource management and control mechanisms for NGNs, in particular on AC and associated bandwidth allocation methods. The first contribution consists of a new link-oriented AC method called experience-based admission control (EBAC) which is a hybrid approach dealing with the problems inherent to conventional AC mechanisms like parameter-based or measurement-based AC (PBAC/MBAC). PBAC provides good QoS but suffers from poor resource utilization and, vice versa, MBAC uses resources efficiently but is susceptible to QoS violations. Hence, EBAC aims at increasing the resource efficiency while maintaining the QoS which increases the revenues of ISPs and postpones their CAPEX for infrastructure upgrades. To show the advantages of EBAC, we first review today’s AC approaches and then develop the concept of EBAC. EBAC is a simple mechanism that safely overbooks the capacity of a single link to increase its resource utilization. We evaluate the performance of EBAC by its simulation under various traffic conditions. The second contribution concerns dynamic resource allocation in transport networks which implement a specific network admission control (NAC) architecture. In general, the performance of different NAC systems may be evaluated by conventional methods such as call blocking analysis which has often been applied in the context of multi-service asynchronous transfer mode (ATM) networks. However, to yield more practical results than abstract blocking probabilities, we propose a new method to compare different AC approaches by their respective bandwidth requirements. To present our new method for comparing different AC systems, we first give an overview of network resource management (NRM) in general. Then we present the concept of adaptive bandwidth allocation (ABA) in capacity tunnels and illustrate the analytical performance evaluation framework to compare different AC systems by their capacity requirements. Different network characteristics influence the performance of ABA. Therefore, the impact of various traffic demand models and tunnel implementations, and the influence of resilience requirements is investigated. In conclusion, the resources in NGNs must be exclusively dedicated to admitted traffic to guarantee QoS. For that purpose, robust and efficient concepts for NRM are required to control the requested bandwidth with regard to the available transmission capacity. Sophisticated AC will be a key function for NRM in NGNs and, therefore, efficient resource management concepts like experience-based admission control and adaptive bandwidth allocation for admission-controlled capacity tunnels, as presented in this work are appealing for NGN solutions.

Data mining has proved its significance in various domains and applications. As an important subfield of the general data mining task, subgroup mining can be used, e.g., for marketing purposes in business domains, or for quality profiling and analysis in medical domains. The goal is to efficiently discover novel, potentially useful and ultimately interesting knowledge. However, in real-world situations these requirements often cannot be fulfilled, e.g., if the applied methods do not scale for large data sets, if too many results are presented to the user, or if many of the discovered patterns are already known to the user. This thesis proposes a combination of several techniques in order to cope with the sketched problems: We discuss automatic methods, including heuristic and exhaustive approaches, and especially present the novel SD-Map algorithm for exhaustive subgroup discovery that is fast and effective. For an interactive approach we describe techniques for subgroup introspection and analysis, and we present advanced visualization methods, e.g., the zoomtable that directly shows the most important parameters of a subgroup and that can be used for optimization and exploration. We also describe various visualizations for subgroup comparison and evaluation in order to support the user during these essential steps. Furthermore, we propose to include possibly available background knowledge that is easy to formalize into the mining process. We can utilize the knowledge in many ways: To focus the search process, to restrict the search space, and ultimately to increase the efficiency of the discovery method. We especially present background knowledge to be applied for filtering the elements of the problem domain, for constructing abstractions, for aggregating values of attributes, and for the post-processing of the discovered set of patterns. Finally, the techniques are combined into a knowledge-intensive process supporting both automatic and interactive methods for subgroup mining. The practical significance of the proposed approach strongly depends on the available tools. We introduce the VIKAMINE system as a highly-integrated environment for knowledge-intensive active subgroup mining. Also, we present an evaluation consisting of two parts: With respect to objective evaluation criteria, i.e., comparing the efficiency and the effectiveness of the subgroup discovery methods, we provide an experimental evaluation using generated data. For that task we present a novel data generator that allows a simple and intuitive specification of the data characteristics. The results of the experimental evaluation indicate that the novel SD-Map method outperforms the other described algorithms using data sets similar to the intended application concerning the efficiency, and also with respect to precision and recall for the heuristic methods. Subjective evaluation criteria include the user acceptance, the benefit of the approach, and the interestingness of the results. We present five case studies utilizing the presented techniques: The approach has been successfully implemented in medical and technical applications using real-world data sets. The method was very well accepted by the users that were able to discover novel, useful, and interesting knowledge.

Diagnostic Case Based Training Systems (D-CBT) provide learners with a means to learn and exercise knowledge in a realistic context. In medical education, D-CBT Systems present virtual patients to the learners who are asked to examine, diagnose and state therapies for these patients. Due a number of conflicting and changing requirements, e.g. time for learning, authoring effort, several systems were developed so far. These systems range from simple, easy-to-use presentation systems to highly complex knowledge based systems supporting explorative learning. This thesis presents an approach and tools to create D-CBT systems from existing sources (documents, e.g. dismissal records) using existing tools (word processors): Authors annotate and extend the documents to model the knowledge. A scalable knowledge representation is able to capture the content on multiple levels, from simple to highly structured knowledge. Thus, authoring of D-CBT systems requires less prerequisites and pre-knowledge and is faster than approaches using specialized authoring environments. Also, authors can iteratively add and structure more knowledge to adapt training cases to their learners needs. The theses also discusses the application of the same approach to other domains, especially to knowledge acquisition for the Semantic Web.

We use algebraic closures and structures which are derived from these in complexity theory. We classify problems with Boolean circuits and Boolean constraints according to their complexity. We transfer algebraic structures to structural complexity. We use the generation problem to classify important complexity classes.

Wireless communication is nothing new. The first data transmissions based on electromagnetic waves have been successfully performed at the end of the 19th century. However, it took almost another century until the technology was ripe for mass market. The first mobile communication systems based on the transmission of digital data were introduced in the late 1980s. Within just a couple of years they have caused a revolution in the way people communicate. The number of cellular phones started to outnumber the fixed telephone lines in many countries and is still rising. New technologies in 3G systems, such as UMTS, allow higher data rates and support various kinds of multimedia services. Nevertheless, the end of the road in wireless communication is far from being reached. In the near future, the Internet and cellular phone systems are expected to be integrated to a new form of wireless system. Bandwidth requirements for a rich set of wireless services, e.g.\ video telephony, video streaming, online gaming, will be easily met. The transmission of voice data will just be another IP based service. On the other hand, building such a system is by far not an easy task. The problems in the development of the UMTS system showed the high complexity of wireless systems with support for bandwidth-hungry, IP-based services. But the technological challenges are just one difficulty. Telecommunication systems are planned on a world-wide basis, such that standard bodies, governments, institutions, hardware vendors, and service providers have to find agreements and compromises on a number of different topics. In this work, we provide the reader with a discussion of many of the topics involved in the planning of a Wireless LAN system that is capable of being integrated into the 4th generation mobile networks (4G) that is being discussed nowadays. Therefore, it has to be able to cope with interactive voice and video traffic while still offering high data rates for best effort traffic. Let us assume a scenario where a huge office complex is completely covered with Wireless LAN access points. Different antenna systems are applied in order to reduce the number of access points that are needed on the one hand, while optimizing the coverage on the other. No additional infrastructure is implemented. Our goal is to evaluate whether the Wireless LAN technology is capable of dealing with the various demands of such a scenario. First, each single access point has to be capable of supporting best-effort and Quality of Service (QoS) demanding applications simultaneously. The IT infrastructure in our scenario consists solely of Wireless LAN, such that it has to allow users surfing the Web, while others are involved in voice calls or video conferences. Then, there is the problem of overlapping cells. Users attached to one access point produce interference for others. However, the QoS support has to be maintained, which is not an easy task. Finally, there are nomadic users, which roam from one Wireless LAN cell to another even during a voice call. There are mechanisms in the standard that allow for mobility, but their capabilities for QoS support are yet to be studied. This shows the large number of unresolved issues when it comes to Wireless LAN in the context of 4G networks. In this work we want to tackle some of the problems.

In the last years, visual methods have been introduced in industrial software production and teaching of software engineering. In particular, the international standardization of a graphical software engineering language, the Unified Modeling Language (UML) was a reason for this tendency. Unfortunately, various problems exist in concrete realizations of tools, e.g. due to a missing compliance to the standard. One problem is the automatic layout, which is required for a consistent automatic software design. The thesis derives reasons and criteria for an automatic layout method, which produces drawings of UML class diagrams according to the UML specification and issues of human computer interaction, e.g. readability. A unique set of aesthetic criteria is combined from four different disciplines involved in this topic. Based on these aethetic rules, a hierarchical layout algorithm is developed, analyzed, measured by specialized measuring techniques and compared to related work. Then, the realization of the algorithm as a Java framework is given as an architectural description. Finally, adaptions to anticipated future changes of the UML, improvements of the framework and example drawings of the implementation are given.

This work is subdivided into two main areas: resilient admission control and resilient routing. The work gives an overview of the state of the art of quality of service mechanisms in communication networks and proposes a categorization of admission control (AC) methods. These approaches are investigated regarding performance, more precisely, regarding the potential resource utilization by dimensioning the capacity for a network with a given topology, traffic matrix, and a required flow blocking probability. In case of a failure, the affected traffic is rerouted over backup paths which increases the traffic rate on the respective links. To guarantee the effectiveness of admission control also in failure scenarios, the increased traffic rate must be taken into account for capacity dimensioning and leads to resilient AC. Capacity dimensioning is not feasible for existing networks with already given link capacities. For the application of resilient NAC in this case, the size of distributed AC budgets must be adapted according to the traffic matrix in such a way that the maximum blocking probability for all flows is minimized and that the capacity of all links is not exceeded by the admissible traffic rate in any failure scenario. Several algorithms for the solution of that problem are presented and compared regarding their efficiency and fairness. A prototype for resilient AC was implemented in the laboratories of Siemens AG in Munich within the scope of the project KING. Resilience requires additional capacity on the backup paths for failure scenarios. The amount of this backup capacity depends on the routing and can be minimized by routing optimization. New protection switching mechanisms are presented that deviate the traffic quickly around outage locations. They are simple and can be implemented, e.g, by MPLS technology. The Self-Protecting Multi-Path (SPM) is a multi-path consisting of disjoint partial paths. The traffic is distributed over all faultless partial paths according to an optimized load balancing function both in the working case and in failure scenarios. Performance studies show that the network topology and the traffic matrix also influence the amount of required backup capacity significantly. The example of the COST-239 network illustrates that conventional shortest path routing may need 50% more capacity than the optimized SPM if all single link and node failures are protected.

Nowadays, robotics plays an important role in increasing fields of application. There exist many environments or situations where mobile robots instead of human beings are used, since the tasks are too hazardous, uncomfortable, repetitive, or costly for humans to perform. The autonomy and the mobility of the robot are often essential for a good solution of these problems. Thus, such a robot should at least be able to answer the question "Where am I?". This thesis investigates the problem of self-localizing a robot in an indoor environment using range measurements. That is, a robot equipped with a range sensor wakes up inside a building and has to determine its position using only its sensor data and a map of its environment. We examine this problem from an idealizing point of view (reducing it into a pure geometric one) and further investigate a method of Guibas, Motwani, and Raghavan from the field of computational geometry to solving it. Here, so-called visibility skeletons, which can be seen as coarsened representations of visibility polygons, play a decisive role. In the major part of this thesis we analyze the structures and the occurring complexities in the framework of this scheme. It turns out that the main source of complication are so-called overlapping embeddings of skeletons into the map polygon, for which we derive some restrictive visibility constraints. Based on these results we are able to improve one of the occurring complexity bounds in the sense that we can formulate it with respect to the number of reflex vertices instead of the total number of map vertices. This also affects the worst-case bound on the preprocessing complexity of the method. The second part of this thesis compares the previous idealizing assumptions with the properties of real-world environments and discusses the occurring problems. In order to circumvent these problems, we use the concept of distance functions, which model the resemblance between the sensor data and the map, and appropriately adapt the above method to the needs of realistic scenarios. In particular, we introduce a distance function, namely the polar coordinate metric, which seems to be well suited to the localization problem. Finally, we present the RoLoPro software where most of the discussed algorithms are implemented (including the polar coordinate metric).

Network planning has come to great importance during the past decades. Today's telecommunication, traffic systems, and logistics would not have been evolved to the current state without careful analysis of the underlying network problems and precise implementation of the results obtained from those examinations. Graphs with node and arc attributes are a very useful tool to model realistic applications, while on the other hand they are well understood in theory. We investigate network design problems which are motivated particularly from applications in communication networks and logistics. Those problems include the search for homogeneous subgraphs in edge labeled graphs where either the total number of labels or the reload cost are subject to optimize. Further, we investigate some variants of the dial a ride problem. On the other hand, we use node and edge upgrade models to deal with the fact that in many cases one prefers to change existing networks rather than implementing a newly computed solution from scratch. We investigate the construction of bottleneck constrained forests under a node upgrade model, as well as several flow cost problems under a edge based upgrade model. All problems are examined within a framework of multi-criteria optimization. Many of the problems can be shown to be NP-hard, with the consequence that, under the widely accepted assumption that P is not equal to NP, there cannot exist efficient algorithms for solving the problems. This motivates the development of approximation algorithms which compute near-optimal solutions with provable performance guarantee in polynomial time.

The thesis looks at the question asking for the computability of the dot-depth of star-free regular languages. Here one has to determine for a given star-free regular language the minimal number of alternations between concatenation on one hand, and intersection, union, complement on the other hand. This question was first raised in 1971 (Brzozowski/Cohen) and besides the extended star-heights problem usually refered to as one of the most difficult open questions on regular languages. The dot-depth problem can be captured formally by hierarchies of classes of star-free regular languages B(0), B(1/2), B(1), B(3/2),... and L(0), L(1/2), L(1), L(3/2),.... which are defined via alternating the closure under concatenation and Boolean operations, beginning with single alphabet letters. Now the question of dot-depth is the question whether these hierarchy classes have decidable membership problems. The thesis makes progress on this question using the so-called forbidden pattern approach: Classes of regular languages are characterized in terms of patterns in finite automata (subgraphs in the transition graph) that are not allowed. Such a characterization immediately implies the decidability of the respective class, since the absence of a certain pattern in a given automaton can be effectively verified. Before this work, the decidability of B(0), B(1/2), B(1) and L(0), L(1/2), L(1), L(3/2) were known. Here a detailed study of these classes with help of forbidden patterns is given which leads to new insights into their inner structure. Furthermore, the decidability of B(3/2) is proven. Based on these results a theory of pattern iteration is developed which leads to the introduction of two new hierarchies of star-free regular languages. These hierarchies are decidable on one hand, on the other hand they are in close connection to the classes B(n) and L(n). It remains an open question here whether they may in fact coincide. Some evidence is given in favour of this conjecture which opens a new way to attack the dot-depth problem. Moreover, it is shown that the class L(5/2) is decidable in the restricted case of a two-letter alphabet.

Complexity and Partitions
(2001)

Computational complexity theory usually investigates the complexity of sets, i.e., the complexity of partitions into two parts. But often it is more appropriate to represent natural problems by partitions into more than two parts. A particularly interesting class of such problems consists of classification problems for relations. For instance, a binary relation R typically defines a partitioning of the set of all pairs (x,y) into four parts, classifiable according to the cases where R(x,y) and R(y,x) hold, only R(x,y) or only R(y,x) holds or even neither R(x,y) nor R(y,x) is true. By means of concrete classification problems such as Graph Embedding or Entailment (for propositional logic), this thesis systematically develops tools, in shape of the boolean hierarchy of NP-partitions and its refinements, for the qualitative analysis of the complexity of partitions generated by NP-relations. The Boolean hierarchy of NP-partitions is introduced as a generalization of the well-known and well-studied Boolean hierarchy (of sets) over NP. Whereas the latter hierarchy has a very simple structure, the situation is much more complicated for the case of partitions into at least three parts. To get an idea of this hierarchy, alternative descriptions of the partition classes are given in terms of finite, labeled lattices. Based on these characterizations the Embedding Conjecture is established providing the complete information on the structure of the hierarchy. This conjecture is supported by several results. A natural extension of the Boolean hierarchy of NP-partitions emerges from the lattice-characterization of its classes by considering partition classes generated by finite, labeled posets. It turns out that all significant ideas translate from the case of lattices. The induced refined Boolean hierarchy of NP-partitions enables us more accuratly capturing the complexity of certain relations (such as Graph Embedding) and a description of projectively closed partition classes.

Starfree regular languages can be build up from alphabet letters by using only Boolean operations and concatenation. The complexity of these languages can be measured with the so-called dot-depth. This measure leads to concatenation hierarchies like the dot-depth hierarchy (DDH) and the closely related Straubing-Thérien hierarchy (STH). The question whether the single levels of these hierarchies are decidable is still open and is known as the dot-depth problem. In this thesis we prove/reprove the decidability of some lower levels of both hierarchies. More precisely, we characterize these levels in terms of patterns in finite automata (subgraphs in the transition graph) that are not allowed. Therefore, such characterizations are called forbidden-pattern characterizations. The main results of the thesis are as follows: forbidden-pattern characterization for level 3/2 of the DDH (this implies the decidability of this level) decidability of the Boolean hierarchy over level 1/2 of the DDH definition of decidable hierarchies having close relations to the DDH and STH Moreover, we prove/reprove the decidability of the levels 1/2 and 3/2 of both hierarchies in terms of forbidden-pattern characterizations. We show the decidability of the Boolean hierarchies over level 1/2 of the DDH and over level 1/2 of the STH. A technique which uses word extensions plays the central role in the proofs of these results. With this technique it is possible to treat the levels 1/2 and 3/2 of both hierarchies in a uniform way. Furthermore, it can be used to prove the decidability of the mentioned Boolean hierarchies. Among other things we provide a combinatorial tool that allows to partition words of arbitrary length into factors of bounded length such that every second factor u leads to a loop with label u in a given finite automaton.

In the last 40 years, complexity theory has grown to a rich and powerful field in theoretical computer science. The main task of complexity theory is the classification of problems with respect to their consumption of resources (e.g., running time or required memory). To study the computational complexity (i.e., consumption of resources) of problems, similar problems are grouped into so called complexity classes. During the systematic study of numerous problems of practical relevance, no efficient algorithm for a great number of studied problems was found. Moreover, it was unclear whether such algorithms exist. A major breakthrough in this situation was the introduction of the complexity classes P and NP and the identification of hardest problems in NP. These hardest problems of NP are nowadays known as NP-complete problems. One prominent example of an NP-complete problem is the satisfiability problem of propositional formulas (SAT). Here we get a propositional formula as an input and it must be decided whether an assignment for the propositional variables exists, such that this assignment satisfies the given formula. The intensive study of NP led to numerous related classes, e.g., the classes of the polynomial-time hierarchy PH, P, #P, PP, NL, L and #L. During the study of these classes, problems related to propositional formulas were often identified to be complete problems for these classes. Hence some questions arise: Why is SAT so hard to solve? Are there modifications of SAT which are complete for other well-known complexity classes? In the context of these questions a result by E. Post is extremely useful. He identified and characterized all classes of Boolean functions being closed under superposition. It is possible to study problems which are connected to generalized propositional logic by using this result, which was done in this thesis. Hence, many different problems connected to propositional logic were studied and classified with respect to their computational complexity, clearing the borderline between easy and hard problems.