## 004 Datenverarbeitung; Informatik

### Refine

#### Year of publication

#### Document Type

- Doctoral Thesis (73)
- Journal article (35)
- Preprint (18)
- Jahresbericht (5)
- Conference Proceeding (3)
- Master Thesis (3)
- Report (2)
- Other (1)

#### Language

- English (115)
- German (24)
- Multiple languages (1)

#### Keywords

- Leistungsbewertung (12)
- Quran (8)
- Robotik (8)
- Koran (7)
- Mobiler Roboter (7)
- Text Mining (7)
- Autonomer Roboter (6)
- Computer Center University of Wuerzburg (5)
- Jahresbericht (5)
- Komplexitätstheorie (5)

#### Institute

- Institut für Informatik (92)
- Theodor-Boveri-Institut für Biowissenschaften (20)
- Institut für deutsche Philologie (17)
- Rechenzentrum (7)
- Graduate School of Science and Technology (2)
- Institut für Geographie und Geologie (1)
- Institut für Humangenetik (1)
- Institut für Mathematik (1)
- Institut für Pharmazie und Lebensmittelchemie (1)
- Institut für Theoretische Physik und Astrophysik (1)

BACKGROUND: In the face of growing resistance in malaria parasites to drugs, pharmacological combination therapies are important. There is accumulating evidence that methylene blue (MB) is an effective drug against malaria. Here we explore the biological effects of both MB alone and in combination therapy using modeling and experimental data.
RESULTS: We built a model of the central metabolic pathways in P. falciparum. Metabolic flux modes and their changes under MB were calculated by integrating experimental data (RT-PCR data on mRNAs for redox enzymes) as constraints and results from the YANA software package for metabolic pathway calculations. Several different lines of MB attack on Plasmodium redox defense were identified by analysis of the network effects. Next, chloroquine resistance based on pfmdr/and pfcrt transporters, as well as pyrimethamine/sulfadoxine resistance (by mutations in DHF/DHPS), were modeled in silico. Further modeling shows that MB has a favorable synergism on antimalarial network effects with these commonly used antimalarial drugs.
CONCLUSIONS: Theoretical and experimental results support that methylene blue should, because of its resistance-breaking potential, be further tested as a key component in drug combination therapy efforts in holoendemic areas.

Internet applications are becoming more and more flexible to support diverge user demands and network conditions. This is reflected by technical concepts, which provide new adaptation mechanisms to allow fine grained adjustment of the application quality and the corresponding bandwidth requirements. For the case of video streaming, the scalable video codec H.264/SVC allows the flexible adaptation of frame rate, video resolution and image quality with respect to the available network resources. In order to guarantee a good user-perceived quality (Quality of Experience, QoE) it is necessary to adjust and optimize the video quality accurately. But not only have the applications of the current Internet changed. Within network and transport, new technologies evolved during the last years providing a more flexible and efficient usage of data transport and network resources. One of the most promising technologies is Network Virtualization (NV) which is seen as an enabler to overcome the ossification of the Internet stack. It provides means to simultaneously operate multiple logical networks which allow for example application-specific addressing, naming and routing, or their individual resource management. New transport mechanisms like multipath transmission on the network and transport layer aim at an efficient usage of available transport resources. However, the simultaneous transmission of data via heterogeneous transport paths and communication technologies inevitably introduces packet reordering. Additional mechanisms and buffers are required to restore the correct packet order and thus to prevent a disturbance of the data transport. A proper buffer dimensioning as well as the classification of the impact of varying path characteristics like bandwidth and delay require appropriate evaluation methods. Additionally, for a path selection mechanism real time evaluation mechanisms are needed. A better application-network interaction and the corresponding exchange of information enable an efficient adaptation of the application to the network conditions and vice versa. This PhD thesis analyzes a video streaming architecture utilizing multipath transmission and scalable video coding and develops the following optimization possibilities and results: Analysis and dimensioning methods for multipath transmission, quantification of the adaptation possibilities to the current network conditions with respect to the QoE for H.264/SVC, and evaluation and optimization of a future video streaming architecture, which allows a better interaction of application and network.

This work is composed of three main parts: remote control of mobile systems via Internet, ad-hoc networks of mobile robots, and remote control of mobile robots via 3G telecommunication technologies. The first part gives a detailed state of the art and a discussion of the problems to be solved in order to teleoperate mobile robots via the Internet. The focus of the application to be realized is set on a distributed tele-laboratory with remote experiments on mobile robots which can be accessed world-wide via the Internet. Therefore, analyses of the communication link are used in order to realize a robust system. The developed and implemented architecture of this distributed tele-laboratory allows for a smooth access also with a variable or low link quality. The second part covers the application of ad-hoc networks for mobile robots. The networking of mobile robots via mobile ad-hoc networks is a very promising approach to realize integrated telematic systems without relying on preexisting communication infrastructure. Relevant civilian application scenarios are for example in the area of search and rescue operations where first responders are supported by multi-robot systems. Here, mobile robots, humans, and also existing stationary sensors can be connected very fast and efficient. Therefore, this work investigates and analyses the performance of different ad-hoc routing protocols for IEEE 802.11 based wireless networks in relevant scenarios. The analysis of the different protocols allows for an optimization of the parameter settings in order to use these ad-hoc routing protocols for mobile robot teleoperation. Also guidelines for the realization of such telematics systems are given. Also traffic shaping mechanisms of application layer are presented which allow for a more efficient use of the communication link. An additional application scenario, the integration of a small size helicopter into an IP based ad-hoc network, is presented. The teleoperation of mobile robots via 3G telecommunication technologies is addressed in the third part of this work. The high availability, high mobility, and the high bandwidth provide a very interesting opportunity to realize scenarios for the teleoperation of mobile robots or industrial remote maintenance. This work analyses important parameters of the UMTS communication link and investigates also the characteristics for different data streams. These analyses are used to give guidelines which are necessary for the realization of or industrial remote maintenance or mobile robot teleoperation scenarios. All the results and guidelines for the design of telematic systems in this work were derived from analyses and experiments with real hardware.

This article is about a measurement analysis based approach to help software practitioners in managing the additional level complexities and variabilities in software product line applications. The architecture of the proposed approach i.e. ZAC is designed and implemented to perform preprocessesed source code analysis, calculate traditional and product line metrics and visualize results in two and three dimensional diagrams. Experiments using real time data sets are performed which concluded with the results that the ZAC can be very helpful for the software practitioners in understanding the overall structure and complexity of product line applications. Moreover the obtained results prove strong positive correlation between calculated traditional and product line measures.

Gegenstand der Arbeit stellt eine erstmalig unternommene, architekturübergreifende Studie über feldprogrammierbare Logikbausteine zur Implementierung synchroner Schaltkreise dar. Zunächst wird ein Modell für allgemeine feldprogrammiebare Architekturen basierend auf periodischen Graphen definiert. Schließlich werden Bewertungsmaße für Architekturen und Schaltkreislayouts angegeben zur Charakterisierung struktureller Eigenschaften hinsichtlich des Verhaltens in Chipflächenverbrauch und Signalverzögerung. Ferner wird ein generisches Layout-Werkzeug entwickelt, das für beliebige Architekturen und Schaltkreise Implementierungen berechnen und bewerten kann. Abschließend werden neun ressourcenminimalistische Architekturen mit Maschen- und mit Inselstruktur einander gegenübergestellt.

In this paper we study connectivity augmentation problems. Given a connected graph G with some desirable property, we want to make G 2-vertex connected (or 2-edge connected) by adding edges such that the resulting graph keeps the property. The aim is to add as few edges as possible. The property that we consider is planarity, both in an abstract graph-theoretic and in a geometric setting, where vertices correspond to points in the plane and edges to straight-line segments.
We show that it is NP-hard to nd a minimum-cardinality augmentation that makes a planar graph 2-edge connected. For making a planar graph 2-vertex connected this was known. We further show that both problems are hard in the geometric setting, even when restricted to trees. The problems remain hard for higher degrees of connectivity. On the other hand we give polynomial-time algorithms for the special case of convex geometric graphs.
We also study the following related problem. Given a planar (plane geometric) graph G, two vertices s and t of G, and an integer c, how many edges have to be added to G such that G is still planar (plane geometric) and contains c edge- (or vertex-) disjoint s{t paths? For the planar case we give a linear-time algorithm for c = 2. For the plane geometric case we give optimal worst-case bounds for c = 2; for c = 3 we characterize the cases that have a solution.

Der Betrieb von Satelliten wird sich in Zukunft gravierend ändern. Die bisher ausgeübte konventionelle Vorgehensweise, bei der die Planung der vom Satelliten auszuführenden Aktivitäten sowie die Kontrolle hierüber ausschließlich vom Boden aus erfolgen, stößt bei heutigen Anwendungen an ihre Grenzen. Im schlimmsten Fall verhindert dieser Umstand sogar die Erschließung bisher ungenutzter Möglichkeiten. Der Gewinn eines Satelliten, sei es in Form wissenschaftlicher Daten oder der Vermarktung satellitengestützter Dienste, wird daher nicht optimal ausgeschöpft.
Die Ursache für dieses Problem lässt sich im Grunde auf eine ausschlaggebende Tatsache zurückführen: Konventionelle Satelliten können ihr Verhalten, d.h. die Folge ihrer Tätigkeiten, nicht eigenständig anpassen. Stattdessen erstellt das Bedienpersonal am Boden - vor allem die Operatoren - mit Hilfe von Planungssoftware feste Ablaufpläne, die dann in Form von Kommandosequenzen von den Bodenstationen aus an die jeweiligen Satelliten hochgeladen werden. Dort werden die Befehle lediglich überprüft, interpretiert und strikt ausgeführt. Die Abarbeitung erfolgt linear. Situationsbedingte Änderungen, wie sie vergleichsweise bei der Codeausführung von Softwareprogrammen durch Kontrollkonstrukte, zum Beispiel Schleifen und Verzweigungen, üblich sind, sind typischerweise nicht vorgesehen. Der Operator ist daher die einzige Instanz, die das Verhalten des Satelliten mittels Kommandierung, per Upload, beeinflussen kann, und auch nur dann, wenn ein direkter Funkkontakt zwischen Satellit und Bodenstation besteht. Die dadurch möglichen Reaktionszeiten des Satelliten liegen bestenfalls bei einigen Sekunden, falls er sich im Wirkungsbereich der Bodenstation befindet. Außerhalb des Kontaktfensters kann sich die Zeitschranke, gegeben durch den Orbit und die aktuelle Position des Satelliten, von einigen Minuten bis hin zu einigen Stunden erstrecken. Die Signallaufzeiten der Funkübertragung verlängern die Reaktionszeiten um weitere Sekunden im erdnahen Bereich. Im interplanetaren Raum erstrecken sich die Zeitspannen aufgrund der immensen Entfernungen sogar auf mehrere Minuten. Dadurch bedingt liegt die derzeit technologisch mögliche, bodengestützte, Reaktionszeit von Satelliten bestenfalls im Bereich von einigen Sekunden.
Diese Einschränkung stellt ein schweres Hindernis für neuartige Satellitenmissionen, bei denen insbesondere nichtdeterministische und kurzzeitige Phänomene (z.B. Blitze und Meteoreintritte in die Erdatmosphäre) Gegenstand der Beobachtungen sind, dar. Die langen Reaktionszeiten des konventionellen Satellitenbetriebs verhindern die Realisierung solcher Missionen, da die verzögerte Reaktion erst erfolgt, nachdem das zu beobachtende Ereignis bereits abgeschlossen ist.
Die vorliegende Dissertation zeigt eine Möglichkeit, das durch die langen Reaktionszeiten entstandene Problem zu lösen, auf. Im Zentrum des Lösungsansatzes steht dabei die Autonomie. Im Wesentlichen geht es dabei darum, den Satelliten mit der Fähigkeit auszustatten, sein Verhalten, d.h. die Folge seiner Tätigkeiten, eigenständig zu bestimmen bzw. zu ändern. Dadurch wird die direkte Abhängigkeit des Satelliten vom Operator bei Reaktionen aufgehoben. Im Grunde wird der Satellit in die Lage versetzt, sich selbst zu kommandieren.
Die Idee der Autonomie wurde im Rahmen der zugrunde liegenden Forschungsarbeiten umgesetzt. Das Ergebnis ist ein autonomes Planungssystem. Dabei handelt es sich um ein Softwaresystem, mit dem sich autonomes Verhalten im Satelliten realisieren lässt. Es kann an unterschiedliche Satellitenmissionen angepasst werden. Ferner deckt es verschiedene Aspekte des autonomen Satellitenbetriebs, angefangen bei der generellen Entscheidungsfindung der Tätigkeiten, über die zeitliche Ablaufplanung unter Einbeziehung von Randbedingungen (z.B. Ressourcen) bis hin zur eigentlichen Ausführung, d.h. Kommandierung, ab. Das Planungssystem kommt als Anwendung in ASAP, einer autonomen Sensorplattform, zum Einsatz. Es ist ein optisches System und dient der Detektion von kurzzeitigen Phänomenen und Ereignissen in der Erdatmosphäre.
Die Forschungsarbeiten an dem autonomen Planungssystem, an ASAP sowie an anderen zu diesen in Bezug stehenden Systemen wurden an der Professur für Raumfahrttechnik des Lehrstuhls Informatik VIII der Julius-Maximilians-Universität Würzburg durchgeführt.

Network planning has come to great importance during the past decades. Today's telecommunication, traffic systems, and logistics would not have been evolved to the current state without careful analysis of the underlying network problems and precise implementation of the results obtained from those examinations. Graphs with node and arc attributes are a very useful tool to model realistic applications, while on the other hand they are well understood in theory. We investigate network design problems which are motivated particularly from applications in communication networks and logistics. Those problems include the search for homogeneous subgraphs in edge labeled graphs where either the total number of labels or the reload cost are subject to optimize. Further, we investigate some variants of the dial a ride problem. On the other hand, we use node and edge upgrade models to deal with the fact that in many cases one prefers to change existing networks rather than implementing a newly computed solution from scratch. We investigate the construction of bottleneck constrained forests under a node upgrade model, as well as several flow cost problems under a edge based upgrade model. All problems are examined within a framework of multi-criteria optimization. Many of the problems can be shown to be NP-hard, with the consequence that, under the widely accepted assumption that P is not equal to NP, there cannot exist efficient algorithms for solving the problems. This motivates the development of approximation algorithms which compute near-optimal solutions with provable performance guarantee in polynomial time.

In the course of the growth of the Internet and due to increasing availability of data, over the last two decades, the field of network science has established itself as an own area of research. With quantitative scientists from computer science, mathematics, and physics working on datasets from biology, economics, sociology, political sciences, and many others, network science serves as a paradigm for interdisciplinary research.
One of the major goals in network science is to unravel the relationship between topological graph structure and a network’s function. As evidence suggests, systems from the same fields, i.e. with similar function, tend to exhibit similar structure. However, it is still vague whether a similar graph structure automatically implies likewise function. This dissertation aims at helping to bridge this gap, while particularly focusing on the role of triadic structures.
After a general introduction to the main concepts of network science, existing work devoted to the relevance of triadic substructures is reviewed. A major challenge in modeling triadic structure is the fact that not all three-node subgraphs can be specified independently
of each other, as pairs of nodes may participate in multiple of those triadic subgraphs.
In order to overcome this obstacle, we suggest a novel class of generative network models based on so called Steiner triple systems. The latter are partitions of a graph’s vertices into pair-disjoint triples (Steiner triples). Thus, the configurations on Steiner triples can be specified independently of each other without overdetermining the network’s link
structure.
Subsequently, we investigate the most basic realization of this new class of models. We call it the triadic random graph model (TRGM). The TRGM is parametrized by a probability distribution over all possible triadic subgraph patterns. In order to generate a network instantiation of the model, for all Steiner triples in the system, a pattern is drawn from the distribution and adjusted randomly on the Steiner triple. We calculate the degree distribution of the TRGM analytically and find it to be similar to a Poissonian distribution. Furthermore, it is shown that TRGMs possess non-trivial triadic structure. We discover inevitable correlations in the abundance of certain triadic subgraph
patterns which should be taken into account when attributing functional relevance to particular motifs – patterns which occur significantly more frequently than expected at random. Beyond, the strong impact of the probability distributions on the Steiner triples on the occurrence of triadic subgraphs over the whole network is demonstrated. This interdependence allows us to design ensembles of networks with predefined triadic substructure. Hence, TRGMs help to overcome the lack of generative models needed for assessing the relevance of triadic structure.
We further investigate whether motifs occur homogeneously or heterogeneously distributed over a graph. Therefore, we study triadic subgraph structures in each node’s neighborhood individually. In order to quantitatively measure structure from an individual node’s perspective, we introduce an algorithm for node-specific pattern mining for both directed unsigned, and undirected signed networks. Analyzing real-world datasets, we find that there are networks in which motifs are distributed highly heterogeneously, bound to the proximity of only very few nodes. Moreover, we observe indication for the potential sensitivity of biological systems to a targeted removal of these critical vertices. In addition, we study whole graphs with respect to the homogeneity and homophily of their node-specific triadic structure. The former describes the similarity of subgraph distributions in the neighborhoods of individual vertices. The latter quantifies whether connected vertices
are structurally more similar than non-connected ones. We discover these features to be characteristic for the networks’ origins. Moreover, clustering the vertices of graphs regarding their triadic structure, we investigate structural groups in the neural network of C. elegans, the international airport-connection network, and the global network of diplomatic sentiments between countries. For the latter we find evidence for the instability of triangles considered socially unbalanced according to sociological theories.
Finally, we utilize our TRGM to explore ensembles of networks with similar triadic substructure in terms of the evolution of dynamical processes acting on their nodes. Focusing on oscillators, coupled along the graphs’ edges, we observe that certain triad motifs impose a clear signature on the systems’ dynamics, even when embedded in a larger
network structure.

Background
The actin cytoskeleton is a hallmark of eukaryotic cells. Its regulation as well as its interaction with other proteins is carefully orchestrated by actin interaction domains. One of the key players is the WH2 motif, which enables binding to actin monomers and filaments and is involved in the regulation of actin nucleation. Contrasting conserved domains, the identification of this motif in protein sequences is challenging, as it is short and poorly conserved.
Findings
To identify divergent members, we combined Hidden-Markov-Model (HMM) to HMM alignments with orthology predictions. Thereby, we identified nearly 500 proteins containing so far not annotated WH2 motifs. This included shootin-1, an actin binding protein involved in neuron polarization. Among others, WH2 motifs of ‘proximal to raf’ (ptr)-orthologs, which are described in the literature, but not annotated in genome databases, were identified.
Conclusion
In summary, we increased the number of WH2 motif containing proteins substantially. This identification of candidate regions for actin interaction could steer their experimental characterization. Furthermore, the approach outlined here can easily be adapted to the identification of divergent members of further domain families.

The IronChip evaluation package: a package of perl modules for robust analysis of custom microarrays
(2010)

Background: Gene expression studies greatly contribute to our understanding of complex relationships in gene regulatory networks. However, the complexity of array design, production and manipulations are limiting factors, affecting data quality. The use of customized DNA microarrays improves overall data quality in many situations, however, only if for these specifically designed microarrays analysis tools are available. Results: The IronChip Evaluation Package (ICEP) is a collection of Perl utilities and an easy to use data evaluation pipeline for the analysis of microarray data with a focus on data quality of custom-designed microarrays. The package has been developed for the statistical and bioinformatical analysis of the custom cDNA microarray IronChip but can be easily adapted for other cDNA or oligonucleotide-based designed microarray platforms. ICEP uses decision tree-based algorithms to assign quality flags and performs robust analysis based on chip design properties regarding multiple repetitions, ratio cut-off, background and negative controls. Conclusions: ICEP is a stand-alone Windows application to obtain optimal data quality from custom-designed microarrays and is freely available here (see “Additional Files” section) and at: http://www.alice-dsl.net/evgeniy. vainshtein/ICEP/

The ecosystem of the high northern latitudes is affected by the recently changing environmental conditions. The Arctic has undergone a significant climatic change over the last decades. The land coverage is changing and a phenological response to the warming is apparent. Remotely sensed data can assist the monitoring and quantification of these changes. The remote sensing of the Arctic was predominantly carried out by the usage of optical sensors but these encounter problems in the Arctic environment, e.g. the frequent cloud cover or the solar geometry. In contrast, the imaging of Synthetic Aperture Radar is not affected by the cloud cover and the acquisition of radar imagery is independent of the solar illumination. The objective of this work was to explore how polarimetric Synthetic Aperture Radar (PolSAR) data of TerraSAR-X, TanDEM-X, Radarsat-2 and ALOS PALSAR and interferometric-derived digital elevation model data of the TanDEM-X Mission can contribute to collect meaningful information on the actual state of the Arctic Environment. The study was conducted for Canadian sites of the Mackenzie Delta Region and Banks Island and in situ reference data were available for the assessment. The up-to-date analysis of the PolSAR data made the application of the Non-Local Means filtering and of the decomposition of co-polarized data necessary.
The Non-Local Means filter showed a high capability to preserve the image values, to keep the edges and to reduce the speckle. This supported not only the suitability for the interpretation but also for the classification. The classification accuracies of Non-Local Means filtered data were in average +10% higher compared to unfiltered images. The correlation of the co- and quad-polarized decomposition features was high for classes with distinct surface or double bounce scattering and a usage of the co-polarized data is beneficial for regions of natural land coverage and for low vegetation formations with little volume scattering. The evaluation further revealed that the X- and C-Band were most sensitive to the generalized land cover classes. It was found that the X-Band data were sensitive to low vegetation formations with low shrub density, the C-Band data were sensitive to the shrub density and the shrub dominated tundra. In contrast, the L-Band data were less sensitive to the land cover. Among the different dual-polarized data the HH/VV-polarized data were identified to be most meaningful for the characterization and classification, followed by the HH/HV-polarized and the VV/VH-polarized data. The quad-polarized data showed highest sensitivity to the land cover but differences to the co-polarized data were small. The accuracy assessment showed that spectral information was required for accurate land cover classification. The best results were obtained when spectral and radar information was combined. The benefit of including radar data in the classification was up to +15% accuracy and most significant for the classes wetland and sparse vegetated tundra. The best classifications were realized with quad-polarized C-Band and multispectral data and with co-polarized X-Band and multispectral data. The overall accuracy was up to 80% for unsupervised and up to 90% for supervised classifications. The results indicated that the shortwave co-polarized data show promise for the classification of tundra land cover since the polarimetric information is sensitive to low vegetation and the wetlands. Furthermore, co-polarized data provide a higher spatial resolution than the quad-polarized data.
The analysis of the intermediate digital elevation model data of the TanDEM-X showed a high potential for the characterization of the surface morphology. The basic and relative topographic features were shown to be of high relevance for the quantification of the surface morphology and an area-wide application is feasible. In addition, these data were of value for the classification and delineation of landforms. Such classifications will assist the delineation of geomorphological units and have potential to identify locations of actual and future morphologic activity.

This thesis is devoted to the study of computational complexity theory, a branch of theoretical computer science. Computational complexity theory investigates the inherent difficulty in designing efficient algorithms for computational problems. By doing so, it analyses the scalability of computational problems and algorithms and places practical limits on what computers can actually accomplish. Computational problems are categorised into complexity classes. Among the most important complexity classes are the class NP and the subclass of NP-complete problems, which comprises many important optimisation problems in the field of operations research. Moreover, with the P-NP-problem, the class NP represents the most important unsolved question in computer science. The first part of this thesis is devoted to the study of NP-complete-, and more generally, NP-hard problems. It aims at improving our understanding of this important complexity class by systematically studying how altering NP-hard sets affects their NP-hardness. This research is related to longstanding open questions concerning the complexity of unions of disjoint NP-complete sets, and the existence of sparse NP-hard sets. The second part of the thesis is also dedicated to complexity classes but takes a different perspective: In a sense, after investigating the interior of complexity classes in the first part, the focus shifts to the description of complexity classes and thereby to the exterior in the second part. It deals with the description of complexity classes through leaf languages, a uniform framework which allows us to characterise a great variety of important complexity classes. The known concepts are complemented by a new leaf-language model. To a certain extent, this new approach combines the advantages of the known models. The presented results give evidence that the connection between the theory of formal languages and computational complexity theory might be closer than formerly known.

Background
Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite of their importance for clinical trials. This work therefore aimed at development and evaluation of an information extraction component with a fine-grained terminology that enables to recognize almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg.
Methods
A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated according to documents with different layouts.
Results
The final system achieved state-of-the-art precision (micro average.996) and recall (micro average.961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with f 1=.989 (micro average) and f 1=.963 (macro average). As a result of keyword matching and restraint concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and documents with uncommon layout.
Conclusions
The developed terminology and the proposed information extraction system allow to extract fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.

Parametric weighted finite automata (PWFA) are a multi-dimensional generalization of weighted finite automata. The expressiveness of PWFA contains the expressiveness of weighted finite automata as well as the expressiveness of affine iterated function system. The thesis discusses theory and applications of PWFA. The properties of PWFA definable sets are studied and it is shown that some fractal generator systems can be simulated using PWFA and that various real and complex functions can be represented by PWFA. Furthermore, the decoding of PWFA and the interpretation of PWFA definable sets is discussed.

The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.

Object six Degrees of Freedom (6DOF) pose estimation is a fundamental problem in many practical robotic applications, where the target or an obstacle with a simple or complex shape can move fast in cluttered environments. In this thesis, a 6DOF pose estimation algorithm is developed based on the fused data from a time-of-flight camera and a color camera. The algorithm is divided into two stages, an annealed particle filter based coarse pose estimation stage and a gradient decent based accurate pose optimization stage. In the first stage, each particle is evaluated with sparse representation. In this stage, the large inter-frame motion of the target can be well handled. In the second stage, the range data based conventional Iterative Closest Point is extended by incorporating the target appearance information and used for calculating the accurate pose by refining the coarse estimate from the first stage. For dealing with significant illumination variations during the tracking, spherical harmonic illumination modeling is investigated and integrated into both stages. The robustness and accuracy of the proposed algorithm are demonstrated through experiments on various objects in both indoor and outdoor environments. Moreover, real-time performance can be achieved with graphics processing unit acceleration.

Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single genes classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single genes classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single genes classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single genes sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single genes classifiers for predicting outcome in breast cancer.

In the future Internet, the people-centric communication paradigm will be complemented by a ubiquitous communication among people and devices, or even a communication between devices. This comes along with the need for a more flexible, cheap, widely available Internet access. Two types of wireless networks are considered most appropriate for attaining those goals. While wireless sensor networks (WSNs) enhance the Internet’s reach by providing data about the properties of the environment, wireless mesh networks (WMNs) extend the Internet access possibilities beyond the wired backbone. This monograph contains four chapters which present modeling and optimization methods for WSNs and WMNs. Minimizing energy consumptions is the most important goal of WSN optimization and the literature consequently provides countless energy consumption models. The first part of the monograph studies to what extent the used energy consumption model influences the outcome of analytical WSN optimizations. These considerations enable the second contribution, namely overcoming the problems on the way to a standardized energy-efficient WSN communication stack based on IEEE 802.15.4 and ZigBee. For WMNs both problems are of minor interest whereas the network performance has a higher weight. The third part of the work, therefore, presents algorithms for calculating the max-min fair network throughput in WMNs with multiple link rates and Internet gateway. The last contribution of the monograph investigates the impact of the LRA concept which proposes to systematically assign more robust link rates than actually necessary, thereby allowing to exploit the trade-off between spatial reuse and per-link throughput. A systematical study shows that a network-wide slightly more conservative LRA than necessary increases the throughput of a WMN where max-min fairness is guaranteed. It moreover turns out that LRA is suitable for increasing the performance of a contention-based WMN and is a valuable optimization tool.

We consider competitive location problems where two competing providers place their facilities sequentially and users can decide between the competitors. We assume that both competitors act non-cooperatively and aim at maximizing their own benefits. We investigate the complexity and approximability of such problems on graphs, in particular on simple graph classes such as trees and paths. We also develop fast algorithms for single competitive location problems where each provider places a single facilty. Voting location, in contrast, aims at identifying locations that meet social criteria. The provider wants to satisfy the users (customers) of the facility to be opened. In general, there is no location that is favored by all users. Therefore, a satisfactory compromise has to be found. To this end, criteria arising from voting theory are considered. The solution of the location problem is understood as the winner of a virtual election among the users of the facilities, in which the potential locations play the role of the candidates and the users represent the voters. Competitive and voting location problems turn out to be closely related.