Refine
Has Fulltext
- yes (16)
Is part of the Bibliography
- yes (16)
Year of publication
Document Type
- Doctoral Thesis (14)
- Journal article (2)
Language
- English (16) (remove)
Keywords
- Netzwerk (16) (remove)
In this thesis we study various aspects of chaos synchronization of time-delayed coupled chaotic maps. A network of identical nonlinear units interacting by time-delayed couplings can synchronize to a common chaotic trajectory. Even for large delay times the system can completely synchronize without any time shift. In the first part we study chaotic systems with multiple time delays that range over several orders of magnitude. We show that these time scales emerge in the Lyapunov spectrum: Different parts of the spectrum scale with the different delays. We define various types of chaos depending on the scaling of the maximum exponent. The type of chaos determines the synchronization ability of coupled networks. This is, in particular, relevant for the synchronization properties of networks of networks where time delays within a subnetwork are shorter than the corresponding time delays between the different subnetworks. If the maximum Lyapunov exponent scales with the short intra-network delay, only the elements within a subnetwork can synchronize. If, however, the maximum Lyapunov exponent scales with the long inter-network connection, complete synchronization of all elements is possible. The results are illustrated analytically for Bernoulli maps and numerically for tent maps. In the second part the attractor dimension at the transition to complete chaos synchronization is investigated. In particular, we determine the Kaplan-Yorke dimension from the spectrum of Lyapunov exponents for iterated maps. We argue that the Kaplan-Yorke dimension must be discontinuous at the transition and compare it to the correlation dimension. For a system of Bernoulli maps we indeed find a jump in the correlation dimension. The magnitude of the discontinuity in the Kaplan-Yorke dimension is calculated for networks of Bernoulli units as a function of the network size. Furthermore the scaling of the Kaplan-Yorke dimension as well as of the Kolmogorov entropy with system size and time delay is investigated. Finally, we study the change in the attractor dimension for systems with parameter mismatch. In the third and last part the linear response of synchronized chaotic systems to small external perturbations is studied. The distribution of the distances from the synchronization manifold, i.e., the deviations between two synchronized chaotic units due to external perturbations on the transmitted signal, is used as a measure of the linear response. It is calculated numerically and, for some special cases, analytically. Depending on the model parameters this distribution has power law tails in the region of synchronization leading to diverging moments. The linear response is also quantified by means of the bit error rate of a transmitted binary message which perturbs the synchronized system. The bit error rate is given by an integral over the distribution of distances and is studied numerically for Bernoulli, tent and logistic maps. It displays a complex nonmonotonic behavior in the region of synchronization. For special cases the distribution of distances has a fractal structure leading to a devil's staircase for the bit error rate as a function of coupling strength. The response to small harmonic perturbations shows resonances related to coupling and feedback delay times. A bi-directionally coupled chain of three units can completely filter out the perturbation. Thus the second moment and the bit error rate become zero.
In the course of the growth of the Internet and due to increasing availability of data, over the last two decades, the field of network science has established itself as an own area of research. With quantitative scientists from computer science, mathematics, and physics working on datasets from biology, economics, sociology, political sciences, and many others, network science serves as a paradigm for interdisciplinary research.
One of the major goals in network science is to unravel the relationship between topological graph structure and a network’s function. As evidence suggests, systems from the same fields, i.e. with similar function, tend to exhibit similar structure. However, it is still vague whether a similar graph structure automatically implies likewise function. This dissertation aims at helping to bridge this gap, while particularly focusing on the role of triadic structures.
After a general introduction to the main concepts of network science, existing work devoted to the relevance of triadic substructures is reviewed. A major challenge in modeling triadic structure is the fact that not all three-node subgraphs can be specified independently
of each other, as pairs of nodes may participate in multiple of those triadic subgraphs.
In order to overcome this obstacle, we suggest a novel class of generative network models based on so called Steiner triple systems. The latter are partitions of a graph’s vertices into pair-disjoint triples (Steiner triples). Thus, the configurations on Steiner triples can be specified independently of each other without overdetermining the network’s link
structure.
Subsequently, we investigate the most basic realization of this new class of models. We call it the triadic random graph model (TRGM). The TRGM is parametrized by a probability distribution over all possible triadic subgraph patterns. In order to generate a network instantiation of the model, for all Steiner triples in the system, a pattern is drawn from the distribution and adjusted randomly on the Steiner triple. We calculate the degree distribution of the TRGM analytically and find it to be similar to a Poissonian distribution. Furthermore, it is shown that TRGMs possess non-trivial triadic structure. We discover inevitable correlations in the abundance of certain triadic subgraph
patterns which should be taken into account when attributing functional relevance to particular motifs – patterns which occur significantly more frequently than expected at random. Beyond, the strong impact of the probability distributions on the Steiner triples on the occurrence of triadic subgraphs over the whole network is demonstrated. This interdependence allows us to design ensembles of networks with predefined triadic substructure. Hence, TRGMs help to overcome the lack of generative models needed for assessing the relevance of triadic structure.
We further investigate whether motifs occur homogeneously or heterogeneously distributed over a graph. Therefore, we study triadic subgraph structures in each node’s neighborhood individually. In order to quantitatively measure structure from an individual node’s perspective, we introduce an algorithm for node-specific pattern mining for both directed unsigned, and undirected signed networks. Analyzing real-world datasets, we find that there are networks in which motifs are distributed highly heterogeneously, bound to the proximity of only very few nodes. Moreover, we observe indication for the potential sensitivity of biological systems to a targeted removal of these critical vertices. In addition, we study whole graphs with respect to the homogeneity and homophily of their node-specific triadic structure. The former describes the similarity of subgraph distributions in the neighborhoods of individual vertices. The latter quantifies whether connected vertices
are structurally more similar than non-connected ones. We discover these features to be characteristic for the networks’ origins. Moreover, clustering the vertices of graphs regarding their triadic structure, we investigate structural groups in the neural network of C. elegans, the international airport-connection network, and the global network of diplomatic sentiments between countries. For the latter we find evidence for the instability of triangles considered socially unbalanced according to sociological theories.
Finally, we utilize our TRGM to explore ensembles of networks with similar triadic substructure in terms of the evolution of dynamical processes acting on their nodes. Focusing on oscillators, coupled along the graphs’ edges, we observe that certain triad motifs impose a clear signature on the systems’ dynamics, even when embedded in a larger
network structure.
The subject of this thesis is the controllability of interconnected linear systems, where the interconnection parameter are the control variables. The study of accessibility and controllability of bilinear systems is closely related to their system Lie algebra. In 1976, Brockett classified all possible system Lie algebras of linear single-input, single-output (SISO) systems under time-varying output feedback. Here, Brockett's results are generalized to networks of linear systems, where time-varying output feedback is applied according to the interconnection structure of the network. First, networks of linear SISO systems are studied and it is assumed that all interconnections are independently controllable. By calculating the system Lie algebra it is shown that accessibility of the controlled network is equivalent to the strong connectedness of the underlying interconnection graph in case the network has at least three subsystems. Networks with two subsystems are not captured by these proofs. Thus, we give results for this particular case under additional assumption either on the graph structure or on the dynamics of the node systems, which are both not necessary. Additionally, the system Lie algebra is studied in case the interconnection graph is not strongly connected. Then, we show how to adapt the ideas of proof to networks of multi-input, multi-output (MIMO) systems. We generalize results for the system Lie algebra on networks of MIMO systems both under output feedback and under restricted output feedback. Moreover, the case with generalized interconnections is studied, i.e. parallel edges and linear dependencies in the interconnection controls are allowed. The new setting demands to distinguish between homogeneous and heterogeneous networks. With this new setting only sufficient conditions can be found to guarantee accessibility of the controlled network. As an example, networks with Toeplitz interconnection structure are studied.
Nowadays, data centers are becoming increasingly dynamic due to the common adoption of virtualization technologies. Systems can scale their capacity on demand by growing and shrinking their resources dynamically based on the current load. However, the complexity and performance of modern data centers is influenced not only by the software architecture, middleware, and computing resources, but also by network virtualization, network protocols, network services, and configuration. The field of network virtualization is not as mature as server virtualization and there are multiple competing approaches and technologies. Performance modeling and prediction techniques provide a powerful tool to analyze the performance of modern data centers. However, given the wide variety of network virtualization approaches, no common approach exists for modeling and evaluating the performance of virtualized networks.
The performance community has proposed multiple formalisms and models for evaluating the performance of infrastructures based on different network virtualization technologies. The existing performance models can be divided into two main categories: coarse-grained analytical models and highly-detailed simulation models. Analytical performance models are normally defined at a high level of abstraction and thus they abstract many details of the real network and therefore have limited predictive power. On the other hand, simulation models are normally focused on a selected networking technology and take into account many specific performance influencing factors, resulting in detailed models that are tightly bound to a given technology, infrastructure setup, or to a given protocol stack.
Existing models are inflexible, that means, they provide a single solution method without providing means for the user to influence the solution accuracy and solution overhead. To allow for flexibility in the performance prediction, the user is required to build multiple different performance models obtaining multiple performance predictions. Each performance prediction may then have different focus, different performance metrics, prediction accuracy, and solving time.
The goal of this thesis is to develop a modeling approach that does not require the user to have experience in any of the applied performance modeling formalisms. The approach offers the flexibility in the modeling and analysis by balancing between: (a) generic character and low overhead of coarse-grained analytical models, and (b) the more detailed simulation models with higher prediction accuracy.
The contributions of this thesis intersect with technologies and research areas, such as: software engineering, model-driven software development, domain-specific modeling, performance modeling and prediction, networking and data center networks, network virtualization, Software-Defined Networking (SDN), Network Function Virtualization (NFV). The main contributions of this thesis compose the Descartes Network Infrastructure (DNI) approach and include:
• Novel modeling abstractions for virtualized network infrastructures. This includes two meta-models that define modeling languages for modeling data center network performance. The DNI and miniDNI meta-models provide means for representing network infrastructures at two different abstraction levels. Regardless of which variant of the DNI meta-model is used, the modeling language provides generic modeling elements allowing to describe the majority of existing and future network technologies, while at the same time abstracting factors that have low influence on the overall performance. I focus on SDN and NFV as examples of modern virtualization technologies.
• Network deployment meta-model—an interface between DNI and other meta- models that allows to define mapping between DNI and other descriptive models. The integration with other domain-specific models allows capturing behaviors that are not reflected in the DNI model, for example, software bottlenecks, server virtualization, and middleware overheads.
• Flexible model solving with model transformations. The transformations enable solving a DNI model by transforming it into a predictive model. The model transformations vary in size and complexity depending on the amount of data abstracted in the transformation process and provided to the solver. In this thesis, I contribute six transformations that transform DNI models into various predictive models based on the following modeling formalisms: (a) OMNeT++ simulation, (b) Queueing Petri Nets (QPNs), (c) Layered Queueing Networks (LQNs). For each of these formalisms, multiple predictive models are generated (e.g., models with different level of detail): (a) two for OMNeT++, (b) two for QPNs, (c) two for LQNs. Some predictive models can be solved using multiple alternative solvers resulting in up to ten different automated solving methods for a single DNI model.
• A model extraction method that supports the modeler in the modeling process by automatically prefilling the DNI model with the network traffic data. The contributed traffic profile abstraction and optimization method provides a trade-off by balancing between the size and the level of detail of the extracted profiles.
• A method for selecting feasible solving methods for a DNI model. The method proposes a set of solvers based on trade-off analysis characterizing each transformation with respect to various parameters such as its specific limitations, expected prediction accuracy, expected run-time, required resources in terms of CPU and memory consumption, and scalability.
• An evaluation of the approach in the context of two realistic systems. I evaluate the approach with focus on such factors like: prediction of network capacity and interface throughput, applicability, flexibility in trading-off between prediction accuracy and solving time. Despite not focusing on the maximization of the prediction accuracy, I demonstrate that in the majority of cases, the prediction error is low—up to 20% for uncalibrated models and up to 10% for calibrated models depending on the solving technique.
In summary, this thesis presents the first approach to flexible run-time performance prediction in data center networks, including network based on SDN. It provides ability to flexibly balance between performance prediction accuracy and solving overhead. The approach provides the following key benefits:
• It is possible to predict the impact of changes in the data center network on the performance. The changes include: changes in network topology, hardware configuration, traffic load, and applications deployment.
• DNI can successfully model and predict the performance of multiple different of network infrastructures including proactive SDN scenarios.
• The prediction process is flexible, that is, it provides balance between the granularity of the predictive models and the solving time. The decreased prediction accuracy is usually rewarded with savings of the solving time and consumption of resources required for solving.
• The users are enabled to conduct performance analysis using multiple different prediction methods without requiring the expertise and experience in each of the modeling formalisms.
The components of the DNI approach can be also applied to scenarios that are not considered in this thesis. The approach is generalizable and applicable for the following examples: (a) networks outside of data centers may be analyzed with DNI as long as the background traffic profile is known; (b) uncalibrated DNI models may serve as a basis for design-time performance analysis; (c) the method for extracting and compacting of traffic profiles may be used for other, non-network workloads as well.
Understanding a complex network’s structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. It sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database.
It is widely believed that the modular organization of cellular function is reflected in a modular structure of molecular networks. A common view is that a ‘‘module’’ in a network is a cohesively linked group of nodes, densely connected internally and sparsely interacting with the rest of the network. Many algorithms try to identify functional modules in protein-interaction networks (PIN) by searching for such cohesive groups of proteins. Here, we present an alternative approach independent of any prior definition of what actually constitutes a ‘‘module’’. In a self-consistent manner, proteins are grouped into ‘‘functional roles’’ if they interact in similar ways with other proteins according to their functional roles. Such grouping may well result in cohesive modules again, but only if the network structure actually supports this. We applied our method to the PIN from the Human Protein Reference Database (HPRD) and found that a representation of the network in terms of cohesive modules, at least on a global scale, does not optimally represent the network’s structure because it focuses on finding independent groups of proteins. In contrast, a decomposition into functional roles is able to depict the structure much better as it also takes into account the interdependencies between roles and even allows groupings based on the absence of interactions between proteins in the same functional role. This, for example, is the case for transmembrane proteins, which could never be recognized as a cohesive group of nodes in a PIN. When mapping experimental methods onto the groups, we identified profound differences in the coverage suggesting that our method is able to capture experimental bias in the data, too. For example yeast-two-hybrid data were highly overrepresented in one particular group. Thus, there is more structure in protein-interaction networks than cohesive modules alone and we believe this finding can significantly improve automated function prediction algorithms.
This dissertation focuses on the performance evaluation of all components of Software Defined Networking (SDN) networks and covers whole their architecture. First, the isolation between virtual networks sharing the same physical resources is investigated with SDN switches of several vendors. Then, influence factors on the isolation are identified and evaluated. Second, the impact of control mechanisms on the performance of the data plane is examined through the flow rule installation time of SDN switches with different controllers. It is shown that both hardware-specific and controller instance have a specific influence on the installation time. Finally, several traffic flow monitoring methods of an SDN controller are investigated and a new monitoring approach is developed and evaluated. It is confirmed that the proposed method allows monitoring of particular flows as well as consumes fewer resources than the standard approach. Based on findings in this thesis, on the one hand, controller developers can refer to the work related to the control plane, such as flow monitoring or flow rule installation, to improve the performance of their applications. On the other hand, network administrators can apply the presented methods to select a suitable combination of controller and switches in their SDN networks, based on their performance requirements
With the introduction of Software-defined Networking (SDN) in the late 2000s, not only a new research field has been created, but a paradigm shift was initiated in the broad field of networking. The programmable network control by SDN is a big step, but also a stumbling block for many of the established network operators and vendors. As with any new technology the question about the maturity and the productionreadiness of it arises. Therefore, this thesis picks specific features of SDN and analyzes its performance, reliability, and availability in scenarios that can be expected in production deployments.
The first SDN topic is the performance impact of application traffic in the data plane on the control plane. Second, reliability and availability concerns of SDN deployments are exemplary analyzed by evaluating the detection performance of a common SDN controller. Thirdly, the performance of P4, a technology that enhances SDN, or better its impact of certain control operations on the processing performance is evaluated.
This work is subdivided into two main areas: resilient admission control and resilient routing. The work gives an overview of the state of the art of quality of service mechanisms in communication networks and proposes a categorization of admission control (AC) methods. These approaches are investigated regarding performance, more precisely, regarding the potential resource utilization by dimensioning the capacity for a network with a given topology, traffic matrix, and a required flow blocking probability. In case of a failure, the affected traffic is rerouted over backup paths which increases the traffic rate on the respective links. To guarantee the effectiveness of admission control also in failure scenarios, the increased traffic rate must be taken into account for capacity dimensioning and leads to resilient AC. Capacity dimensioning is not feasible for existing networks with already given link capacities. For the application of resilient NAC in this case, the size of distributed AC budgets must be adapted according to the traffic matrix in such a way that the maximum blocking probability for all flows is minimized and that the capacity of all links is not exceeded by the admissible traffic rate in any failure scenario. Several algorithms for the solution of that problem are presented and compared regarding their efficiency and fairness. A prototype for resilient AC was implemented in the laboratories of Siemens AG in Munich within the scope of the project KING. Resilience requires additional capacity on the backup paths for failure scenarios. The amount of this backup capacity depends on the routing and can be minimized by routing optimization. New protection switching mechanisms are presented that deviate the traffic quickly around outage locations. They are simple and can be implemented, e.g, by MPLS technology. The Self-Protecting Multi-Path (SPM) is a multi-path consisting of disjoint partial paths. The traffic is distributed over all faultless partial paths according to an optimized load balancing function both in the working case and in failure scenarios. Performance studies show that the network topology and the traffic matrix also influence the amount of required backup capacity significantly. The example of the COST-239 network illustrates that conventional shortest path routing may need 50% more capacity than the optimized SPM if all single link and node failures are protected.
Recent advances in Natural Language Preprocessing (NLP) allow for a fully automatic extraction of character networks for an incoming text. These networks serve as a compact and easy to grasp representation of literary fiction. They offer an aggregated view of the text, which can be used during distant reading approaches for the analysis of literary hypotheses. In their core, the networks consist of nodes, which represent literary characters, and edges, which represent relations between characters. For an automatic extraction of such a network, the first step is the detection of the references of all fictional entities that are of importance for a text. References to the fictional entities appear in the form of names, noun phrases and pronouns and prior to this work, no components capable of automatic detection of character references were available. Existing tools are only capable of detecting proper nouns, a subset of all character references. When evaluated on the task of detecting proper nouns in the domain of literary fiction, they still underperform at an F1-score of just about 50%. This thesis uses techniques from the field of semi-supervised learning, such as Distant supervision and Generalized Expectations, and improves the results of an existing tool to about 82%, when evaluated on all three categories in literary fiction, but without the need for annotated data in the target domain. However, since this quality is still not sufficient, the decision to annotate DROC, a corpus comprising 90 fragments of German novels was made. This resulted in a new general purpose annotation environment titled as ATHEN, as well as annotated data that spans about 500.000 tokens in total. Using this data, the combination of supervised algorithms and a tailored rule based algorithm, which in combination are able to exploit both - local consistencies as well as global consistencies - yield an algorithm with an F1-score of about 93%. This component is referred to as the Kallimachos tagger.
A character network can not directly display references however, instead they need to be clustered so that all references that belong to a real world or fictional entity are grouped together. This process widely known as coreference resolution is a hard problem in the focus of research for more than half a century. This work experimented with adaptations of classical feature based machine learning, with a dedicated rule based algorithm and with modern techniques of Deep Learning, but no approach can surpass 55% B-Cubed F1, when evaluated on DROC. Due to this barrier, many researchers do not use a fully-fledged coreference resolution when they extract character networks, but only focus on a more forgiving subset- the names. For novels such as Alice's Adventures in Wonderland by Lewis Caroll, this would however only result in a network in which many important characters are missing. In order to integrate important characters into the network that are not named by the author, this work makes use of automatic detection of speaker and addressees for direct speech utterances (all entities involved in a dialog are considered to be of importance). This problem is by itself not an easy task, however the most successful system analysed in this thesis is able to correctly determine the speaker to about 85% of the utterances as well as about 65% of the addressees. This speaker information can not only help to identify the most dominant characters, but also serves as a way to model the relations between entities.
During the span of this work, components have been developed to model relations between characters using speaker attribution, using co-occurrences as well as by the usage of true interactions, for which yet again a dataset was annotated using ATHEN. Furthermore, since relations between characters are usually typed, a component for the extraction of a typed relation was developed. Similar to the experiments for the character reference detection, a combination of a rule based and a Maximum Entropy classifier yielded the best overall results, with the extraction of family relations showing a score of about 80% and the quality of love relations with a score of about 50%. For family relations, a kernel for a Support Vector Machine was developed that even exceeded the scores of the combined approach but is behind on the other labels.
In addition, this work presents new ways to evaluate automatically extracted networks without the need of domain experts, instead it relies on the usage of expert summaries. It also refrains from the uses of social network analysis for the evaluation, but instead presents ranked evaluations using Precision@k and the Spearman Rank correlation coefficient for the evaluation of the nodes and edges of the network. An analysis using these metrics showed, that the central characters of a novel are contained with high probability but the quality drops rather fast if more than five entities are analyzed. The quality of the edges is mainly dominated by the quality of the coreference resolution and the correlation coefficient between gold edges and system edges therefore varies between 30 and 60%.
All developed components are aggregated alongside a large set of other preprocessing modules in the Kallimachos pipeline and can be reused without any restrictions.