Refine
Has Fulltext
- yes (16)
Is part of the Bibliography
- yes (16)
Year of publication
Document Type
- Doctoral Thesis (14)
- Journal article (2)
Language
- English (16) (remove)
Keywords
- Netzwerk (16) (remove)
It is widely believed that the modular organization of cellular function is reflected in a modular structure of molecular networks. A common view is that a ‘‘module’’ in a network is a cohesively linked group of nodes, densely connected internally and sparsely interacting with the rest of the network. Many algorithms try to identify functional modules in protein-interaction networks (PIN) by searching for such cohesive groups of proteins. Here, we present an alternative approach independent of any prior definition of what actually constitutes a ‘‘module’’. In a self-consistent manner, proteins are grouped into ‘‘functional roles’’ if they interact in similar ways with other proteins according to their functional roles. Such grouping may well result in cohesive modules again, but only if the network structure actually supports this. We applied our method to the PIN from the Human Protein Reference Database (HPRD) and found that a representation of the network in terms of cohesive modules, at least on a global scale, does not optimally represent the network’s structure because it focuses on finding independent groups of proteins. In contrast, a decomposition into functional roles is able to depict the structure much better as it also takes into account the interdependencies between roles and even allows groupings based on the absence of interactions between proteins in the same functional role. This, for example, is the case for transmembrane proteins, which could never be recognized as a cohesive group of nodes in a PIN. When mapping experimental methods onto the groups, we identified profound differences in the coverage suggesting that our method is able to capture experimental bias in the data, too. For example yeast-two-hybrid data were highly overrepresented in one particular group. Thus, there is more structure in protein-interaction networks than cohesive modules alone and we believe this finding can significantly improve automated function prediction algorithms.
Understanding a complex network’s structure holds the key to understanding its function. The physics community has contributed a multitude of methods and analyses to this cross-disciplinary endeavor. Structural features exist on both the microscopic level, resulting from differences between single node properties, and the mesoscopic level resulting from properties shared by groups of nodes. Disentangling the determinants of network structure on these different scales has remained a major, and so far unsolved, challenge. Here we show how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems. It sheds new light on the statistical significance of motif-distributions in neural networks and improves the link-prediction accuracy as exemplified for gene-disease associations in the highly consequential Online Mendelian Inheritance in Man database.
Nowadays, data centers are becoming increasingly dynamic due to the common adoption of virtualization technologies. Systems can scale their capacity on demand by growing and shrinking their resources dynamically based on the current load. However, the complexity and performance of modern data centers is influenced not only by the software architecture, middleware, and computing resources, but also by network virtualization, network protocols, network services, and configuration. The field of network virtualization is not as mature as server virtualization and there are multiple competing approaches and technologies. Performance modeling and prediction techniques provide a powerful tool to analyze the performance of modern data centers. However, given the wide variety of network virtualization approaches, no common approach exists for modeling and evaluating the performance of virtualized networks.
The performance community has proposed multiple formalisms and models for evaluating the performance of infrastructures based on different network virtualization technologies. The existing performance models can be divided into two main categories: coarse-grained analytical models and highly-detailed simulation models. Analytical performance models are normally defined at a high level of abstraction and thus they abstract many details of the real network and therefore have limited predictive power. On the other hand, simulation models are normally focused on a selected networking technology and take into account many specific performance influencing factors, resulting in detailed models that are tightly bound to a given technology, infrastructure setup, or to a given protocol stack.
Existing models are inflexible, that means, they provide a single solution method without providing means for the user to influence the solution accuracy and solution overhead. To allow for flexibility in the performance prediction, the user is required to build multiple different performance models obtaining multiple performance predictions. Each performance prediction may then have different focus, different performance metrics, prediction accuracy, and solving time.
The goal of this thesis is to develop a modeling approach that does not require the user to have experience in any of the applied performance modeling formalisms. The approach offers the flexibility in the modeling and analysis by balancing between: (a) generic character and low overhead of coarse-grained analytical models, and (b) the more detailed simulation models with higher prediction accuracy.
The contributions of this thesis intersect with technologies and research areas, such as: software engineering, model-driven software development, domain-specific modeling, performance modeling and prediction, networking and data center networks, network virtualization, Software-Defined Networking (SDN), Network Function Virtualization (NFV). The main contributions of this thesis compose the Descartes Network Infrastructure (DNI) approach and include:
• Novel modeling abstractions for virtualized network infrastructures. This includes two meta-models that define modeling languages for modeling data center network performance. The DNI and miniDNI meta-models provide means for representing network infrastructures at two different abstraction levels. Regardless of which variant of the DNI meta-model is used, the modeling language provides generic modeling elements allowing to describe the majority of existing and future network technologies, while at the same time abstracting factors that have low influence on the overall performance. I focus on SDN and NFV as examples of modern virtualization technologies.
• Network deployment meta-model—an interface between DNI and other meta- models that allows to define mapping between DNI and other descriptive models. The integration with other domain-specific models allows capturing behaviors that are not reflected in the DNI model, for example, software bottlenecks, server virtualization, and middleware overheads.
• Flexible model solving with model transformations. The transformations enable solving a DNI model by transforming it into a predictive model. The model transformations vary in size and complexity depending on the amount of data abstracted in the transformation process and provided to the solver. In this thesis, I contribute six transformations that transform DNI models into various predictive models based on the following modeling formalisms: (a) OMNeT++ simulation, (b) Queueing Petri Nets (QPNs), (c) Layered Queueing Networks (LQNs). For each of these formalisms, multiple predictive models are generated (e.g., models with different level of detail): (a) two for OMNeT++, (b) two for QPNs, (c) two for LQNs. Some predictive models can be solved using multiple alternative solvers resulting in up to ten different automated solving methods for a single DNI model.
• A model extraction method that supports the modeler in the modeling process by automatically prefilling the DNI model with the network traffic data. The contributed traffic profile abstraction and optimization method provides a trade-off by balancing between the size and the level of detail of the extracted profiles.
• A method for selecting feasible solving methods for a DNI model. The method proposes a set of solvers based on trade-off analysis characterizing each transformation with respect to various parameters such as its specific limitations, expected prediction accuracy, expected run-time, required resources in terms of CPU and memory consumption, and scalability.
• An evaluation of the approach in the context of two realistic systems. I evaluate the approach with focus on such factors like: prediction of network capacity and interface throughput, applicability, flexibility in trading-off between prediction accuracy and solving time. Despite not focusing on the maximization of the prediction accuracy, I demonstrate that in the majority of cases, the prediction error is low—up to 20% for uncalibrated models and up to 10% for calibrated models depending on the solving technique.
In summary, this thesis presents the first approach to flexible run-time performance prediction in data center networks, including network based on SDN. It provides ability to flexibly balance between performance prediction accuracy and solving overhead. The approach provides the following key benefits:
• It is possible to predict the impact of changes in the data center network on the performance. The changes include: changes in network topology, hardware configuration, traffic load, and applications deployment.
• DNI can successfully model and predict the performance of multiple different of network infrastructures including proactive SDN scenarios.
• The prediction process is flexible, that is, it provides balance between the granularity of the predictive models and the solving time. The decreased prediction accuracy is usually rewarded with savings of the solving time and consumption of resources required for solving.
• The users are enabled to conduct performance analysis using multiple different prediction methods without requiring the expertise and experience in each of the modeling formalisms.
The components of the DNI approach can be also applied to scenarios that are not considered in this thesis. The approach is generalizable and applicable for the following examples: (a) networks outside of data centers may be analyzed with DNI as long as the background traffic profile is known; (b) uncalibrated DNI models may serve as a basis for design-time performance analysis; (c) the method for extracting and compacting of traffic profiles may be used for other, non-network workloads as well.
The subject of this thesis is the controllability of interconnected linear systems, where the interconnection parameter are the control variables. The study of accessibility and controllability of bilinear systems is closely related to their system Lie algebra. In 1976, Brockett classified all possible system Lie algebras of linear single-input, single-output (SISO) systems under time-varying output feedback. Here, Brockett's results are generalized to networks of linear systems, where time-varying output feedback is applied according to the interconnection structure of the network. First, networks of linear SISO systems are studied and it is assumed that all interconnections are independently controllable. By calculating the system Lie algebra it is shown that accessibility of the controlled network is equivalent to the strong connectedness of the underlying interconnection graph in case the network has at least three subsystems. Networks with two subsystems are not captured by these proofs. Thus, we give results for this particular case under additional assumption either on the graph structure or on the dynamics of the node systems, which are both not necessary. Additionally, the system Lie algebra is studied in case the interconnection graph is not strongly connected. Then, we show how to adapt the ideas of proof to networks of multi-input, multi-output (MIMO) systems. We generalize results for the system Lie algebra on networks of MIMO systems both under output feedback and under restricted output feedback. Moreover, the case with generalized interconnections is studied, i.e. parallel edges and linear dependencies in the interconnection controls are allowed. The new setting demands to distinguish between homogeneous and heterogeneous networks. With this new setting only sufficient conditions can be found to guarantee accessibility of the controlled network. As an example, networks with Toeplitz interconnection structure are studied.
In the course of the growth of the Internet and due to increasing availability of data, over the last two decades, the field of network science has established itself as an own area of research. With quantitative scientists from computer science, mathematics, and physics working on datasets from biology, economics, sociology, political sciences, and many others, network science serves as a paradigm for interdisciplinary research.
One of the major goals in network science is to unravel the relationship between topological graph structure and a network’s function. As evidence suggests, systems from the same fields, i.e. with similar function, tend to exhibit similar structure. However, it is still vague whether a similar graph structure automatically implies likewise function. This dissertation aims at helping to bridge this gap, while particularly focusing on the role of triadic structures.
After a general introduction to the main concepts of network science, existing work devoted to the relevance of triadic substructures is reviewed. A major challenge in modeling triadic structure is the fact that not all three-node subgraphs can be specified independently
of each other, as pairs of nodes may participate in multiple of those triadic subgraphs.
In order to overcome this obstacle, we suggest a novel class of generative network models based on so called Steiner triple systems. The latter are partitions of a graph’s vertices into pair-disjoint triples (Steiner triples). Thus, the configurations on Steiner triples can be specified independently of each other without overdetermining the network’s link
structure.
Subsequently, we investigate the most basic realization of this new class of models. We call it the triadic random graph model (TRGM). The TRGM is parametrized by a probability distribution over all possible triadic subgraph patterns. In order to generate a network instantiation of the model, for all Steiner triples in the system, a pattern is drawn from the distribution and adjusted randomly on the Steiner triple. We calculate the degree distribution of the TRGM analytically and find it to be similar to a Poissonian distribution. Furthermore, it is shown that TRGMs possess non-trivial triadic structure. We discover inevitable correlations in the abundance of certain triadic subgraph
patterns which should be taken into account when attributing functional relevance to particular motifs – patterns which occur significantly more frequently than expected at random. Beyond, the strong impact of the probability distributions on the Steiner triples on the occurrence of triadic subgraphs over the whole network is demonstrated. This interdependence allows us to design ensembles of networks with predefined triadic substructure. Hence, TRGMs help to overcome the lack of generative models needed for assessing the relevance of triadic structure.
We further investigate whether motifs occur homogeneously or heterogeneously distributed over a graph. Therefore, we study triadic subgraph structures in each node’s neighborhood individually. In order to quantitatively measure structure from an individual node’s perspective, we introduce an algorithm for node-specific pattern mining for both directed unsigned, and undirected signed networks. Analyzing real-world datasets, we find that there are networks in which motifs are distributed highly heterogeneously, bound to the proximity of only very few nodes. Moreover, we observe indication for the potential sensitivity of biological systems to a targeted removal of these critical vertices. In addition, we study whole graphs with respect to the homogeneity and homophily of their node-specific triadic structure. The former describes the similarity of subgraph distributions in the neighborhoods of individual vertices. The latter quantifies whether connected vertices
are structurally more similar than non-connected ones. We discover these features to be characteristic for the networks’ origins. Moreover, clustering the vertices of graphs regarding their triadic structure, we investigate structural groups in the neural network of C. elegans, the international airport-connection network, and the global network of diplomatic sentiments between countries. For the latter we find evidence for the instability of triangles considered socially unbalanced according to sociological theories.
Finally, we utilize our TRGM to explore ensembles of networks with similar triadic substructure in terms of the evolution of dynamical processes acting on their nodes. Focusing on oscillators, coupled along the graphs’ edges, we observe that certain triad motifs impose a clear signature on the systems’ dynamics, even when embedded in a larger
network structure.
In this thesis we study various aspects of chaos synchronization of time-delayed coupled chaotic maps. A network of identical nonlinear units interacting by time-delayed couplings can synchronize to a common chaotic trajectory. Even for large delay times the system can completely synchronize without any time shift. In the first part we study chaotic systems with multiple time delays that range over several orders of magnitude. We show that these time scales emerge in the Lyapunov spectrum: Different parts of the spectrum scale with the different delays. We define various types of chaos depending on the scaling of the maximum exponent. The type of chaos determines the synchronization ability of coupled networks. This is, in particular, relevant for the synchronization properties of networks of networks where time delays within a subnetwork are shorter than the corresponding time delays between the different subnetworks. If the maximum Lyapunov exponent scales with the short intra-network delay, only the elements within a subnetwork can synchronize. If, however, the maximum Lyapunov exponent scales with the long inter-network connection, complete synchronization of all elements is possible. The results are illustrated analytically for Bernoulli maps and numerically for tent maps. In the second part the attractor dimension at the transition to complete chaos synchronization is investigated. In particular, we determine the Kaplan-Yorke dimension from the spectrum of Lyapunov exponents for iterated maps. We argue that the Kaplan-Yorke dimension must be discontinuous at the transition and compare it to the correlation dimension. For a system of Bernoulli maps we indeed find a jump in the correlation dimension. The magnitude of the discontinuity in the Kaplan-Yorke dimension is calculated for networks of Bernoulli units as a function of the network size. Furthermore the scaling of the Kaplan-Yorke dimension as well as of the Kolmogorov entropy with system size and time delay is investigated. Finally, we study the change in the attractor dimension for systems with parameter mismatch. In the third and last part the linear response of synchronized chaotic systems to small external perturbations is studied. The distribution of the distances from the synchronization manifold, i.e., the deviations between two synchronized chaotic units due to external perturbations on the transmitted signal, is used as a measure of the linear response. It is calculated numerically and, for some special cases, analytically. Depending on the model parameters this distribution has power law tails in the region of synchronization leading to diverging moments. The linear response is also quantified by means of the bit error rate of a transmitted binary message which perturbs the synchronized system. The bit error rate is given by an integral over the distribution of distances and is studied numerically for Bernoulli, tent and logistic maps. It displays a complex nonmonotonic behavior in the region of synchronization. For special cases the distribution of distances has a fractal structure leading to a devil's staircase for the bit error rate as a function of coupling strength. The response to small harmonic perturbations shows resonances related to coupling and feedback delay times. A bi-directionally coupled chain of three units can completely filter out the perturbation. Thus the second moment and the bit error rate become zero.