004 Datenverarbeitung; Informatik
Refine
Has Fulltext
- yes (183)
Year of publication
Document Type
- Journal article (74)
- Doctoral Thesis (63)
- Working Paper (37)
- Conference Proceeding (7)
- Report (2)
Language
- English (183) (remove)
Keywords
- Datennetz (14)
- Leistungsbewertung (13)
- virtual reality (12)
- Robotik (8)
- Mobiler Roboter (6)
- Autonomer Roboter (5)
- Komplexitätstheorie (5)
- Optimierung (5)
- P4 (5)
- Theoretische Informatik (5)
Institute
- Institut für Informatik (183) (remove)
Schriftenreihe
Sonstige beteiligte Institutionen
The steadily increasing usage of smart meters generates a valuable amount of high-resolution data about the individual energy consumption and production of local energy systems. Private households install more and more photovoltaic systems, battery storage and big consumers like heat pumps. Thus, our vision is to augment these collected smart meter time series of a complete system (e.g., a city, town or complex institutions like airports) with simulatively added previously named components. We, therefore, propose a novel digital twin of such an energy system based solely on a complete set of smart meter data including additional building data. Based on the additional geospatial data, the twin is intended to represent the addition of the abovementioned components as realistically as possible. Outputs of the twin can be used as a decision support for either system operators where to strengthen the system or for individual households where and how to install photovoltaic systems and batteries. Meanwhile, the first local energy system operators had such smart meter data of almost all residential consumers for several years. We acquire those of an exemplary operator and discuss a case study presenting some features of our digital twin and highlighting the value of the combination of smart meter and geospatial data.
Background
Colorectal cancer is a leading cause of cancer-related deaths worldwide. The best method to prevent CRC is a colonoscopy. However, not all colon polyps have the risk of becoming cancerous. Therefore, polyps are classified using different classification systems. After the classification, further treatment and procedures are based on the classification of the polyp. Nevertheless, classification is not easy. Therefore, we suggest two novel automated classifications system assisting gastroenterologists in classifying polyps based on the NICE and Paris classification.
Methods
We build two classification systems. One is classifying polyps based on their shape (Paris). The other classifies polyps based on their texture and surface patterns (NICE). A two-step process for the Paris classification is introduced: First, detecting and cropping the polyp on the image, and secondly, classifying the polyp based on the cropped area with a transformer network. For the NICE classification, we design a few-shot learning algorithm based on the Deep Metric Learning approach. The algorithm creates an embedding space for polyps, which allows classification from a few examples to account for the data scarcity of NICE annotated images in our database.
Results
For the Paris classification, we achieve an accuracy of 89.35 %, surpassing all papers in the literature and establishing a new state-of-the-art and baseline accuracy for other publications on a public data set. For the NICE classification, we achieve a competitive accuracy of 81.13 % and demonstrate thereby the viability of the few-shot learning paradigm in polyp classification in data-scarce environments. Additionally, we show different ablations of the algorithms. Finally, we further elaborate on the explainability of the system by showing heat maps of the neural network explaining neural activations.
Conclusion
Overall we introduce two polyp classification systems to assist gastroenterologists. We achieve state-of-the-art performance in the Paris classification and demonstrate the viability of the few-shot learning paradigm in the NICE classification, addressing the prevalent data scarcity issues faced in medical machine learning.
Scalability is often mentioned in literature, but a stringent definition is missing. In particular, there is no general scalability assessment which clearly indicates whether a system scales or not or whether a system scales better than another. The key contribution of this article is the definition of a scalability index (SI) which quantifies if a system scales in comparison to another system, a hypothetical system, e.g., linear system, or the theoretically optimal system. The suggested SI generalizes different metrics from literature, which are specialized cases of our SI. The primary target of our scalability framework is, however, benchmarking of two systems, which does not require any reference system. The SI is demonstrated and evaluated for different use cases, that are (1) the performance of an IoT load balancer depending on the system load, (2) the availability of a communication system depending on the size and structure of the network, (3) scalability comparison of different location selection mechanisms in fog computing with respect to delays and energy consumption; (4) comparison of time-sensitive networking (TSN) mechanisms in terms of efficiency and utilization. Finally, we discuss how to use and how not to use the SI and give recommendations and guidelines in practice. To the best of our knowledge, this is the first work which provides a general SI for the comparison and benchmarking of systems, which is the primary target of our scalability analysis.
In recent history, normalized digital surface models (nDSMs) have been constantly gaining importance as a means to solve large-scale geographic problems. High-resolution surface models are precious, as they can provide detailed information for a specific area. However, measurements with a high resolution are time consuming and costly. Only a few approaches exist to create high-resolution nDSMs for extensive areas. This article explores approaches to extract high-resolution nDSMs from low-resolution Sentinel-2 data, allowing us to derive large-scale models. We thereby utilize the advantages of Sentinel 2 being open access, having global coverage, and providing steady updates through a high repetition rate. Several deep learning models are trained to overcome the gap in producing high-resolution surface maps from low-resolution input data. With U-Net as a base architecture, we extend the capabilities of our model by integrating tailored multiscale encoders with differently sized kernels in the convolution as well as conformed self-attention inside the skip connection gates. Using pixelwise regression, our U-Net base models can achieve a mean height error of approximately 2 m. Moreover, through our enhancements to the model architecture, we reduce the model error by more than 7%.
Background: Due to the importance of radiologic examinations, such as X-rays or computed tomography scans, for many clinical diagnoses, the optimal use of the radiology department is 1 of the primary goals of many hospitals.
Objective: This study aims to calculate the key metrics of this use by creating a radiology data warehouse solution, where data from radiology information systems (RISs) can be imported and then queried using a query language as well as a graphical user interface (GUI).
Methods: Using a simple configuration file, the developed system allowed for the processing of radiology data exported from any kind of RIS into a Microsoft Excel, comma-separated value (CSV), or JavaScript Object Notation (JSON) file. These data were then imported into a clinical data warehouse. Additional values based on the radiology data were calculated during this import process by implementing 1 of several provided interfaces. Afterward, the query language and GUI of the data warehouse were used to configure and calculate reports on these data. For the most common types of requested reports, a web interface was created to view their numbers as graphics.
Results: The tool was successfully tested with the data of 4 different German hospitals from 2018 to 2021, with a total of 1,436,111 examinations. The user feedback was good, since all their queries could be answered if the available data were sufficient. The initial processing of the radiology data for using them with the clinical data warehouse took (depending on the amount of data provided by each hospital) between 7 minutes and 1 hour 11 minutes. Calculating 3 reports of different complexities on the data of each hospital was possible in 1-3 seconds for reports with up to 200 individual calculations and in up to 1.5 minutes for reports with up to 8200 individual calculations.
Conclusions: A system was developed with the main advantage of being generic concerning the export of different RISs as well as concerning the configuration of queries for various reports. The queries could be configured easily using the GUI of the data warehouse, and their results could be exported into the standard formats Excel and CSV for further processing.
Group-based communication is a highly popular communication paradigm, which is especially prominent in mobile instant messaging (MIM) applications, such as WhatsApp. Chat groups in MIM applications facilitate the sharing of various types of messages (e.g., text, voice, image, video) among a large number of participants. As each message has to be transmitted to every other member of the group, which multiplies the traffic, this has a massive impact on the underlying communication networks. However, most chat groups are private and network operators cannot obtain deep insights into MIM communication via network measurements due to end-to-end encryption. Thus, the generation of traffic is not well understood, given that it depends on sizes of communication groups, speed of communication, and exchanged message types. In this work, we provide a huge data set of 5,956 private WhatsApp chat histories, which contains over 76 million messages from more than 117,000 users. We describe and model the properties of chat groups and users, and the communication within these chat groups, which gives unprecedented insights into private MIM communication. In addition, we conduct exemplary measurements for the most popular message types, which empower the provided models to estimate the traffic over time in a chat group.
Cooperative, connected and automated mobility (CCAM) systems depend on a reliable communication to provide their service and more crucially to ensure the safety of users. One way to ensure the reliability of a data transmission is to use multiple transmission technologies in combination with redundant flows. In this paper, we describe a system requiring multipath communication in the context of CCAM. To this end, we introduce a data plane-based scheduler that uses replication and integration modules to provide redundant and transparent multipath communication. We provide an analytical model for the full replication module of the system and give an overview of how and where the data-plane scheduler components can be realized.
Knowledge about ransomware is important for protecting sensitive data and for participating in public debates about suitable regulation regarding its security. However, as of now, this topic has received little to no attention in most school curricula. As such, it is desirable to analyze what citizens can learn about this topic outside of formal education, e.g., from news articles. This analysis is both relevant to analyzing the public discourse about ransomware, as well as to identify what aspects of this topic should be included in the limited time available for this topic in formal education. Thus, this paper was motivated both by educational and media research. The central goal is to explore how the media reports on this topic and, additionally, to identify potential misconceptions that could stem from this reporting. To do so, we conducted an exploratory case study into the reporting of 109 media articles regarding a high-impact ransomware event: the shutdown of the Colonial Pipeline (located in the east of the USA). We analyzed how the articles introduced central terminology, what details were provided, what details were not, and what (mis-)conceptions readers might receive from them. Our results show that an introduction of the terminology and technical concepts of security is insufficient for a complete understanding of the incident. Most importantly, the articles may lead to four misconceptions about ransomware that are likely to lead to misleading conclusions about the responsibility for the incident and possible political and technical options to prevent such attacks in the future.
Social patterns and roles can develop when users talk to intelligent voice assistants (IVAs) daily. The current study investigates whether users assign different roles to devices and how this affects their usage behavior, user experience, and social perceptions. Since social roles take time to establish, we equipped 106 participants with Alexa or Google assistants and some smart home devices and observed their interactions for nine months. We analyzed diverse subjective (questionnaire) and objective data (interaction data). By combining social science and data science analyses, we identified two distinct clusters—users who assigned a friendship role to IVAs over time and users who did not. Interestingly, these clusters exhibited significant differences in their usage behavior, user experience, and social perceptions of the devices. For example, participants who assigned a role to IVAs attributed more friendship to them used them more frequently, reported more enjoyment during interactions, and perceived more empathy for IVAs. In addition, these users had distinct personal requirements, for example, they reported more loneliness. This study provides valuable insights into the role-specific effects and consequences of voice assistants. Recent developments in conversational language models such as ChatGPT suggest that the findings of this study could make an important contribution to the design of dialogic human–AI interactions.
Digitization and transcription of historic documents offer new research opportunities for humanists and are the topics of many edition projects. However, manual work is still required for the main phases of layout recognition and the subsequent optical character recognition (OCR) of early printed documents. This paper describes and evaluates how deep learning approaches recognize text lines and can be extended to layout recognition using background knowledge. The evaluation was performed on five corpora of early prints from the 15th and 16th Centuries, representing a variety of layout features. While the main text with standard layouts could be recognized in the correct reading order with a precision and recall of up to 99.9%, also complex layouts were recognized at a rate as high as 90% by using background knowledge, the full potential of which was revealed if many pages of the same source were transcribed.
The ongoing digitization of historical photographs in archives allows investigating the quality, quantity, and distribution of these images. However, the exact interior and exterior camera orientations of these photographs are usually lost during the digitization process. The proposed method uses content-based image retrieval (CBIR) to filter exterior images of single buildings in combination with metadata information. The retrieved photographs are automatically processed in an adapted structure-from-motion (SfM) pipeline to determine the camera parameters. In an interactive georeferencing process, the calculated camera positions are transferred into a global coordinate system. As all image and camera data are efficiently stored in the proposed 4D database, they can be conveniently accessed afterward to georeference newly digitized images by using photogrammetric triangulation and spatial resection. The results show that the CBIR and the subsequent SfM are robust methods for various kinds of buildings and different quantity of data. The absolute accuracy of the camera positions after georeferencing lies in the range of a few meters likely introduced by the inaccurate LOD2 models used for transformation. The proposed photogrammetric method, the database structure, and the 4D visualization interface enable adding historical urban photographs and 3D models from other locations.
An important but very time consuming part of the research process is literature review. An already large and nevertheless growing ground set of publications as well as a steadily increasing publication rate continue to worsen the situation. Consequently, automating this task as far as possible is desirable. Experimental results of systems are key-insights of high importance during literature review and usually represented in form of tables. Our pipeline KIETA exploits these tables to contribute to the endeavor of automation by extracting them and their contained knowledge from scientific publications. The pipeline is split into multiple steps to guarantee modularity as well as analyzability, and agnosticim regarding the specific scientific domain up until the knowledge extraction step, which is based upon an ontology. Additionally, a dataset of corresponding articles has been manually annotated with information regarding table and knowledge extraction. Experiments show promising results that signal the possibility of an automated system, while also indicating limits of extracting knowledge from tables without any context.
Learning is a central component of human life and essential for personal development. Therefore, utilizing new technologies in the learning context and exploring their combined potential are considered essential to support self-directed learning in a digital age. A learning environment can be expanded by various technical and content-related aspects. Gamification in the form of elements from video games offers a potential concept to support the learning process. This can be supplemented by technology-supported learning. While the use of tablets is already widespread in the learning context, the integration of a social robot can provide new perspectives on the learning process. However, simply adding new technologies such as social robots or gamification to existing systems may not automatically result in a better learning environment. In the present study, game elements as well as a social robot were integrated separately and conjointly into a learning environment for basic Spanish skills, with a follow-up on retained knowledge. This allowed us to investigate the respective and combined effects of both expansions on motivation, engagement and learning effect. This approach should provide insights into the integration of both additions in an adult learning context. We found that the additions of game elements and the robot did not significantly improve learning, engagement or motivation. Based on these results and a literature review, we outline relevant factors for meaningful integration of gamification and social robots in learning environments in adult learning.
Climate models are the tool of choice for scientists researching climate change. Like all models they suffer from errors, particularly systematic and location-specific representation errors. One way to reduce these errors is model output statistics (MOS) where the model output is fitted to observational data with machine learning. In this work, we assess the use of convolutional Deep Learning climate MOS approaches and present the ConvMOS architecture which is specifically designed based on the observation that there are systematic and location-specific errors in the precipitation estimates of climate models. We apply ConvMOS models to the simulated precipitation of the regional climate model REMO, showing that a combination of per-location model parameters for reducing location-specific errors and global model parameters for reducing systematic errors is indeed beneficial for MOS performance. We find that ConvMOS models can reduce errors considerably and perform significantly better than three commonly used MOS approaches and plain ResNet and U-Net models in most cases. Our results show that non-linear MOS models underestimate the number of extreme precipitation events, which we alleviate by training models specialized towards extreme precipitation events with the imbalanced regression method DenseLoss. While we consider climate MOS, we argue that aspects of ConvMOS may also be beneficial in other domains with geospatial data, such as air pollution modeling or weather forecasts.
There is great interest in affordable, precise and reliable metrology underwater:
Archaeologists want to document artifacts in situ with high detail.
In marine research, biologists require the tools to monitor coral growth and geologists need recordings to model sediment transport.
Furthermore, for offshore construction projects, maintenance and inspection millimeter-accurate measurements of defects and offshore structures are essential.
While the process of digitizing individual objects and complete sites on land is well understood and standard methods, such as Structure from Motion or terrestrial laser scanning, are regularly applied, precise underwater surveying with high resolution is still a complex and difficult task.
Applying optical scanning techniques in water is challenging due to reduced visibility caused by turbidity and light absorption.
However, optical underwater scanners provide significant advantages in terms of achievable resolution and accuracy compared to acoustic systems.
This thesis proposes an underwater laser scanning system and the algorithms for creating dense and accurate 3D scans in water.
It is based on laser triangulation and the main optical components are an underwater camera and a cross-line laser projector.
The prototype is configured with a motorized yaw axis for capturing scans from a tripod.
Alternatively, it is mounted to a moving platform for mobile mapping.
The main focus lies on the refractive calibration of the underwater camera and laser projector, the image processing and 3D reconstruction.
For highest accuracy, the refraction at the individual media interfaces must be taken into account.
This is addressed by an optimization-based calibration framework using a physical-geometric camera model derived from an analytical formulation of a ray-tracing projection model.
In addition to scanning underwater structures, this work presents the 3D acquisition of semi-submerged structures and the correction of refraction effects.
As in-situ calibration in water is complex and time-consuming, the challenge of transferring an in-air scanner calibration to water without re-calibration is investigated, as well as self-calibration techniques for structured light.
The system was successfully deployed in various configurations for both static scanning and mobile mapping.
An evaluation of the calibration and 3D reconstruction using reference objects and a comparison of free-form surfaces in clear water demonstrate the high accuracy potential in the range of one millimeter to less than one centimeter, depending on the measurement distance.
Mobile underwater mapping and motion compensation based on visual-inertial odometry is demonstrated using a new optical underwater scanner based on fringe projection.
Continuous registration of individual scans allows the acquisition of 3D models from an underwater vehicle.
RGB images captured in parallel are used to create 3D point clouds of underwater scenes in full color.
3D maps are useful to the operator during the remote control of underwater vehicles and provide the building blocks to enable offshore inspection and surveying tasks.
The advancing automation of the measurement technology will allow non-experts to use it, significantly reduce acquisition time and increase accuracy, making underwater metrology more cost-effective.
Service orchestration requires enormous attention and is a struggle nowadays. Of course, virtualization provides a base level of abstraction for services to be deployable on a lot of infrastructures. With container virtualization, the trend to migrate applications to a micro-services level in order to be executable in Fog and Edge Computing environments increases manageability and maintenance efforts rapidly. Similarly, network virtualization adds effort to calibrate IP flows for Software-Defined Networks and eventually route it by means of Network Function Virtualization. Nevertheless, there are concepts like MAPE-K to support micro-service distribution in next-generation cloud and network environments. We want to explore, how a service distribution can be improved by adopting machine learning concepts for infrastructure or service changes. Therefore, we show how federated machine learning is integrated into a cloud-to-fog-continuum without burdening single nodes.
In network research, reproducibility of experiments is not always easy to achieve. Infrastructures are cumbersome to set up or are not available due to vendor-specific devices. Emulators try to overcome those issues to a given extent and are available in different service models. Unfortunately, the usability of emulators requires time-consuming efforts and a deep understanding of their functionality. At first, we analyze to which extent currently available open-source emulators support network configurations and how user-friendly they are. With these insights, we describe, how an ease-to-use emulator is implemented and may run as a Network Emulator as a Service (NEaaS). Therefore, virtualization plays a major role in order to deploy a NEaaS based on Kathará.
This paper discusses the problem of finding multiple shortest disjoint paths in modern communication networks, which is essential for ultra-reliable and time-sensitive applications. Dijkstra’s algorithm has been a popular solution for the shortest path problem, but repetitive use of it to find multiple paths is not scalable. The Multiple Disjoint Path Algorithm (MDPAlg), published in 2021, proposes the use of a single full graph to construct multiple disjoint paths. This paper proposes modifications to the algorithm to include a delay constraint, which is important in time-sensitive applications. Different delay constraint least-cost routing algorithms are compared in a comprehensive manner to evaluate the benefits of the adapted MDPAlg algorithm. Fault tolerance, and thereby reliability, is ensured by generating multiple link-disjoint paths from source to destination.
In this work, we describe the network from data collection to data processing and storage as a system based on different layers. We outline the different layers and highlight major tasks and dependencies with regard to energy consumption and energy efficiency. With this view, we can outwork challenges and questions a future system architect must answer to provide a more sustainable, green, resource friendly, and energy efficient application or system. Therefore, all system layers must be considered individually but also altogether for future IoT solutions. This requires, in particular, novel sustainability metrics in addition to current Quality of Service and Quality of Experience metrics to provide a high power, user satisfying, and sustainable network.
How to Model and Predict the Scalability of a Hardware-In-The-Loop Test Bench for Data Re-Injection?
(2023)
This paper describes a novel application of an empirical network calculus model based on measurements of a hardware-in-the-loop (HIL) test system. The aim is to predict the performance of a HIL test bench for open-loop re-injection in the context of scalability. HIL test benches are distributed computer systems including software, hardware, and networking devices. They are used to validate complex technical systems, but have not yet been system under study themselves. Our approach is to use measurements from the HIL system to create an empirical model for arrival and service curves. We predict the performance and design the previously unknown parameters of the HIL simulator with network calculus (NC), namely the buffer sizes and the minimum needed pre-buffer time for the playback buffer. We furthermore show, that it is possible to estimate the CPU load from arrival and service-curves based on the utilization theorem, and hence estimate the scalability of the HIL system in the context of the number of sensor streams.
In recent years, satellite communication has been expanding its field of application in the world of computer networks. This paper aims to provide an overview of how a typical scenario involving 5G Non-Terrestrial Networks (NTNs) for vehicle to everything (V2X) applications is characterized. In particular, a first implementation of a system that integrates them together will be described. Such a framework will later be used to evaluate the performance of applications such as Vehicle Monitoring (VM), Remote Driving (RD), Voice Over IP (VoIP), and others. Different configuration scenarios such as Low Earth Orbit and Geostationary Orbit will be considered.
The introduction of new types of frequency spectrum in 6G technology facilitates the convergence of conventional mobile communications and radar functions. Thus, the mobile network itself becomes a versatile sensor system. This enables mobile network operators to offer a sensing service in addition to conventional data and telephony services. The potential benefits are expected to accrue to various stakeholders, including individuals, the environment, and society in general. The paper discusses technological development, possible integration, and use cases, as well as future development areas.
In this paper, we work to understand the global IPX network from the perspective of an MVNO. In order to do this, we provide a brief description of the global architecture of mobile carriers. We provide initial results with respect to mapping the vast and complex interconnection network enabling global roaming from the point of view of a single MVNO. Finally, we provide preliminary results regarding the quality of service observed under global roaming conditions.
This paper presents a novel concept to extend state-of-the-art buffer monitoring with additional measures to estimate service-curves. The online algorithm for service-curve estimation replaces the state-of-the-art timestamp logging, as we expect it to overcome the main disadvantages of generating a huge amount of data and using a lot of CPU resources to store the data to a file during operation. We prove the accuracy of the online-algorithm offline with timestamp data and compare the derived bounds to the measured delay and backlog. We also do a proof-of- concept of the online-algorithm, implement it in LabVIEW and compare its performance to the timestamp logging by CPU load and data-size of the log-file. However, the implementation is still work-in-progress.
This paper presents a prototypical implementation of the In-band Network Telemetry (INT) specification in P4 and demonstrates a use case, where a Tofino Switch is used to measure device and network performance in a lab setting. This work is based on research activities in the area of P4 data plane programming conducted at the network lab of HTW Berlin.
State Management at line rate is crucial for critical applications in next-generation networks. P4 is a language used in software-defined networking to program the data plane. The data plane can profit in many circumstances when it is allowed to manage its state without any detour over a controller. This work is based on a previous study by investigating the potential and performance of add-on-miss insertions of state by the data plane. The state keeping capabilities of P4 are limited regarding the amount of data and the update frequency. We follow the tentative specification of an upcoming portable-NIC-architecture and implement these changes into the software P4 target T4P4S. We show that insertions are possible with only a slight overhead compared to lookups and evaluate the influence of the rate of insertions on their latency.
Understanding the Performance of Different Packet Reception and Timestamping Methods in Linux
(2023)
This document briefly presents some renowned packet reception techniques for network packets in Linux systems. Further, it compares their performance when measuring packet timestamps with respect to throughput and accuracy. Both software and hardware timestamps are compared, and various parameters are examined, including frame size, link speed, network interface card, and CPU load. The results indicate that hardware timestamping offers significantly better accuracy with no downsides, and that packet reception techniques that avoid system calls offer superior measurement throughput.
Utilizing multiple access networks such as 5G, 4G, and Wi-Fi simultaneously can lead to increased robustness, resiliency, and capacity for mobile users. However, transparently implementing packet distribution over multiple paths within the core of the network faces multiple challenges including scalability to a large number of customers, low latency, and high-capacity packet processing requirements. In this paper, we offload congestion-aware multipath packet scheduling to a smartNIC. However, such hardware acceleration faces multiple challenges due to programming language and platform limitations. We implement different multipath schedulers in P4 with different complexity in order to cope with dynamically changing path capacities. Using testbed measurements, we show that our CMon scheduler, which monitors path congestion in the data plane and dynamically adjusts scheduling weights for the different paths based on path state information, can process more than 3.5 Mpps packets 25 μs latency.
Web caches often use a Time-to-live (TTL) limit to validate data consistency with web servers. We study the impact of TTL constraints on the hit ratio of basic strategies in caches of fixed size. We derive analytical results and confirm their accuracy in comparison to simulations. We propose a score-based caching method with awareness of the current TTL per data for improving the hit ratio close to the upper bound.
Cooperative, connected and automated mobility (CCAM) systems depend on a reliable communication to provide their service and more crucially to ensure the safety of users. One way to ensure the reliability of a data transmission is to use multiple transmission technologies in combination with redundant flows. In this paper, we describe a system requiring multipath communication in the context of CCAM. To this end, we introduce a data plane-based scheduler that uses replication and integration modules to provide redundant and transparent multipath communication. We provide an analytical model for the full replication module of the system and give an overview of how and where the data-plane scheduler components can be realized.
The emerging serverless computing may meet Edge Cloud in a beneficial manner as the two offer flexibility and dynamicity in optimizing finite hardware resources. However, the lack of proper study of a joint platform leaves a gap in literature about consumption and performance of such integration. To this end, this paper identifies the key questions and proposes a methodology to answer them.
The Fifth Generation (5G) communication technology, its infrastructure and architecture, though already deployed in campus and small scale networks, is still undergoing continuous changes and research. Especially, in the light of future large scale deployments and industrial use cases, a detailed analysis of the performance and utilization with regard to latency and service times constraints is crucial. To this end, a fine granular investigation of the Network Function (NF) based core system and the duration for all the tasks performed by these services is necessary. This work presents the first steps towards analyzing the signaling traffic in 5G core networks, and introduces a tool to automatically extract sequence diagrams and service times for NF tasks from traffic traces.
Packets sent over a network can either get lost or reach their destination. Protocols like TCP try to solve this problem by resending the lost packets. However, retransmissions consume a lot of time and are cumbersome for the transmission of critical data. Multipath solutions are quite common to address this reliability issue and are available on almost every layer of the ISO/OSI model. We propose a solution based on a P4 network to duplicate packets in order to send them to their destination via multiple routes. The last network hop ensures that only a single copy of the traffic is further forwarded to its destination by adopting a concept similar to Bloom filters. Besides, if fast delivery is requested we provide a P4 prototype, which randomly forwards the packets over different transmission paths. For reproducibility, we implement our approach in a container-based network emulation system called Kathará.
Crowdsourced network measurements (CNMs) are becoming increasingly popular as they assess the performance of a mobile network from the end user's perspective on a large scale. Here, network measurements are performed directly on the end-users' devices, thus taking advantage of the real-world conditions end-users encounter. However, this type of uncontrolled measurement raises questions about its validity and reliability. The problem lies in the nature of this type of data collection. In CNMs, mobile network subscribers are involved to a large extent in the measurement process, and collect data themselves for the operator. The collection of data on user devices in arbitrary locations and at uncontrolled times requires means to ensure validity and reliability. To address this issue, our paper defines concepts and guidelines for analyzing the precision of CNMs; specifically, the number of measurements required to make valid statements. In addition to the formal definition of the aspect, we illustrate the problem and use an extensive sample data set to show possible assessment approaches. This data set consists of more than 20.4 million crowdsourced mobile measurements from across France, measured by a commercial data provider.
Deep learning enables enormous progress in many computer vision-related tasks. Artificial Intel- ligence (AI) steadily yields new state-of-the-art results in the field of detection and classification. Thereby AI performance equals or exceeds human performance. Those achievements impacted many domains, including medical applications.
One particular field of medical applications is gastroenterology. In gastroenterology, machine learning algorithms are used to assist examiners during interventions. One of the most critical concerns for gastroenterologists is the development of Colorectal Cancer (CRC), which is one of the leading causes of cancer-related deaths worldwide. Detecting polyps in screening colonoscopies is the essential procedure to prevent CRC. Thereby, the gastroenterologist uses an endoscope to screen the whole colon to find polyps during a colonoscopy. Polyps are mucosal growths that can vary in severity.
This thesis supports gastroenterologists in their examinations with automated detection and clas- sification systems for polyps. The main contribution is a real-time polyp detection system. This system is ready to be installed in any gastroenterology practice worldwide using open-source soft- ware. The system achieves state-of-the-art detection results and is currently evaluated in a clinical trial in four different centers in Germany.
The thesis presents two additional key contributions: One is a polyp detection system with ex- tended vision tested in an animal trial. Polyps often hide behind folds or in uninvestigated areas. Therefore, the polyp detection system with extended vision uses an endoscope assisted by two additional cameras to see behind those folds. If a polyp is detected, the endoscopist receives a vi- sual signal. While the detection system handles the additional two camera inputs, the endoscopist focuses on the main camera as usual.
The second one are two polyp classification models, one for the classification based on shape (Paris) and the other on surface and texture (NBI International Colorectal Endoscopic (NICE) classification). Both classifications help the endoscopist with the treatment of and the decisions about the detected polyp.
The key algorithms of the thesis achieve state-of-the-art performance. Outstandingly, the polyp detection system tested on a highly demanding video data set shows an F1 score of 90.25 % while working in real-time. The results exceed all real-time systems in the literature. Furthermore, the first preliminary results of the clinical trial of the polyp detection system suggest a high Adenoma Detection Rate (ADR). In the preliminary study, all polyps were detected by the polyp detection system, and the system achieved a high usability score of 96.3 (max 100). The Paris classification model achieved an F1 score of 89.35 % which is state-of-the-art. The NICE classification model achieved an F1 score of 81.13 %.
Furthermore, a large data set for polyp detection and classification was created during this thesis. Therefore a fast and robust annotation system called Fast Colonoscopy Annotation Tool (FastCAT) was developed. The system simplifies the annotation process for gastroenterologists. Thereby the
i
gastroenterologists only annotate key parts of the endoscopic video. Afterward, those video parts are pre-labeled by a polyp detection AI to speed up the process. After the AI has pre-labeled the frames, non-experts correct and finish the annotation. This annotation process is fast and ensures high quality. FastCAT reduces the overall workload of the gastroenterologist on average by a factor of 20 compared to an open-source state-of-art annotation tool.
Visual stimuli are frequently used to improve memory, language learning or perception, and understanding of metacognitive processes. However, in virtual reality (VR), there are few systematically and empirically derived databases. This paper proposes the first collection of virtual objects based on empirical evaluation for inter-and transcultural encounters between English- and German-speaking learners. We used explicit and implicit measurement methods to identify cultural associations and the degree of stereotypical perception for each virtual stimuli (n = 293) through two online studies, including native German and English-speaking participants. The analysis resulted in a final well-describable database of 128 objects (called InteractionSuitcase). In future applications, the objects can be used as a great interaction or conversation asset and behavioral measurement tool in social VR applications, especially in the field of foreign language education. For example, encounters can use the objects to describe their culture, or teachers can intuitively assess stereotyped attitudes of the encounters.
Heat and excessive solar radiation can produce abiotic stresses during apple maturation, resulting fruit quality. Therefore, the monitoring of temperature on fruit surface (FST) over the growing period can allow to identify thresholds, above of which several physiological disorders such as sunburn may occur in apple.
The current approaches neglect spatial variation of FST and have reduced repeatability, resulting in unreliable predictions. In this study, LiDAR laser scanning and thermal imaging were employed to detect the temperature on fruit surface by means of 3D point cloud. A process for calibrating the two sensors based on an active board target and producing a 3D thermal point cloud was suggested. After calibration, the sensor system was utilised to scan the fruit trees, while temperature values assigned in the corresponding 3D point cloud were based on the extrinsic calibration. Whereas a fruit detection algorithm was performed to segment the FST from each apple.
• The approach allows the calibration of LiDAR laser scanner with thermal camera in order to produce a 3D thermal point cloud.
• The method can be applied in apple trees for segmenting FST in 3D. Whereas the approach can be utilised to predict several physiological disorders including sunburn on fruit surface.
The strict restrictions introduced by the COVID-19 lockdowns, which started from March 2020, changed people’s daily lives and habits on many different levels. In this work, we investigate the impact of the lockdown on the communication behavior in the mobile instant messaging application WhatsApp. Our evaluations are based on a large dataset of 2577 private chat histories with 25,378,093 messages from 51,973 users. The analysis of the one-to-one and group conversations confirms that the lockdown severely altered the communication in WhatsApp chats compared to pre-pandemic time ranges. In particular, we observe short-term effects, which caused an increased message frequency in the first lockdown months and a shifted communication activity during the day in March and April 2020. Moreover, we also see long-term effects of the ongoing pandemic situation until February 2021, which indicate a change of communication behavior towards more regular messaging, as well as a persisting change in activity during the day. The results of our work show that even anonymized chat histories can tell us a lot about people’s behavior and especially behavioral changes during the COVID-19 pandemic and thus are of great relevance for behavioral researchers. Furthermore, looking at the pandemic from an Internet provider perspective, these insights can be used during the next pandemic, or if the current COVID-19 situation worsens, to adapt communication networks to the changed usage behavior early on and thus avoid network congestion.
This paper examines the relationship between time and motion perception in virtual environments. Previous work has shown that the perception of motion can affect the perception of time. We developed a virtual environment that simulates motion in a tunnel and measured its effects on the estimation of the duration of time, the speed at which perceived time passes, and the illusion of self-motion, also known as vection. When large areas of the visual field move in the same direction, vection can occur; observers often perceive this as self-motion rather than motion of the environment. To generate different levels of vection and investigate its effects on time perception, we developed an abstract procedural tunnel generator. The generator can simulate different speeds and densities of tunnel sections (visibly distinguishable sections that form the virtual tunnel), as well as the degree of embodiment of the user avatar (with or without virtual hands). We exposed participants to various tunnel simulations with different durations, speeds, and densities in a remote desktop and a virtual reality (VR) laboratory study. Time passed subjectively faster under high-speed and high-density conditions in both studies. The experience of self-motion was also stronger under high-speed and high-density conditions. Both studies revealed a significant correlation between the perceived passage of time and perceived self-motion. Subjects in the virtual reality study reported a stronger self-motion experience, a faster perceived passage of time, and shorter time estimates than subjects in the desktop study. Our results suggest that a virtual tunnel simulation can manipulate time perception in virtual reality. We will explore these results for the development of virtual reality applications for therapeutic approaches in our future work. This could be particularly useful in treating disorders like depression, autism, and schizophrenia, which are known to be associated with distortions in time perception. For example, the tunnel could be therapeutically applied by resetting patients’ time perceptions by exposing them to the tunnel under different conditions, such as increasing or decreasing perceived time.
Despite the fact that mixed-cultural backgrounds become of increasing importance in our daily life, the representation of multiple cultural backgrounds in one entity is still rare in socially interactive agents (SIAs). This paper’s contribution is twofold. First, it provides a survey of research on mixed-cultured SIAs. Second, it presents a study investigating how mixed-cultural speech (in this case, non-native accent) influences how a virtual robot is perceived in terms of personality, warmth, competence and credibility. Participants with English or German respectively as their first language watched a video of a virtual robot speaking in either standard English or German-accented English. It was expected that the German-accented speech would be rated more positively by native German participants as well as elicit the German stereotypes credibility and conscientiousness for both German and English participants. Contrary to the expectations, German participants rated the virtual robot lower in terms of competence and credibility when it spoke with a German accent, whereas English participants perceived the virtual robot with a German accent as more credible compared to the version without an accent. Both the native English and native German listeners classified the virtual robot with a German accent as significantly more neurotic than the virtual robot speaking standard English. This work shows that by solely implementing a non-native accent in a virtual robot, stereotypes are partly transferred. It also shows that the implementation of a non-native accent leads to differences in the perception of the virtual robot.
CLIP knows image aesthetics
(2022)
Most Image Aesthetic Assessment (IAA) methods use a pretrained ImageNet classification model as a base to fine-tune. We hypothesize that content classification is not an optimal pretraining task for IAA, since the task discourages the extraction of features that are useful for IAA, e.g., composition, lighting, or style. On the other hand, we argue that the Contrastive Language-Image Pretraining (CLIP) model is a better base for IAA models, since it has been trained using natural language supervision. Due to the rich nature of language, CLIP needs to learn a broad range of image features that correlate with sentences describing the image content, composition, environments, and even subjective feelings about the image. While it has been shown that CLIP extracts features useful for content classification tasks, its suitability for tasks that require the extraction of style-based features like IAA has not yet been shown. We test our hypothesis by conducting a three-step study, investigating the usefulness of features extracted by CLIP compared to features obtained from the last layer of a comparable ImageNet classification model. In each step, we get more computationally expensive. First, we engineer natural language prompts that let CLIP assess an image's aesthetic without adjusting any weights in the model. To overcome the challenge that CLIP's prompting only is applicable to classification tasks, we propose a simple but effective strategy to convert multiple prompts to a continuous scalar as required when predicting an image's mean aesthetic score. Second, we train a linear regression on the AVA dataset using image features obtained by CLIP's image encoder. The resulting model outperforms a linear regression trained on features from an ImageNet classification model. It also shows competitive performance with fully fine-tuned networks based on ImageNet, while only training a single layer. Finally, by fine-tuning CLIP's image encoder on the AVA dataset, we show that CLIP only needs a fraction of training epochs to converge, while also performing better than a fine-tuned ImageNet model. Overall, our experiments suggest that CLIP is better suited as a base model for IAA methods than ImageNet pretrained networks.
This article presents a novel method for controlling a virtual audience system (VAS) in Virtual Reality (VR) application, called STAGE, which has been originally designed for supervised public speaking training in university seminars dedicated to the preparation and delivery of scientific talks. We are interested in creating pedagogical narratives: narratives encompass affective phenomenon and rather than organizing events changing the course of a training scenario, pedagogical plans using our system focus on organizing the affects it arouses for the trainees. Efficiently controlling a virtual audience towards a specific training objective while evaluating the speaker’s performance presents a challenge for a seminar instructor: the high level of cognitive and physical demands required to be able to control the virtual audience, whilst evaluating speaker’s performance, adjusting and allowing it to quickly react to the user’s behaviors and interactions. It is indeed a critical limitation of a number of existing systems that they rely on a Wizard of Oz approach, where the tutor drives the audience in reaction to the user’s performance. We address this problem by integrating with a VAS a high-level control component for tutors, which allows using predefined audience behavior rules, defining custom ones, as well as intervening during run-time for finer control of the unfolding of the pedagogical plan. At its core, this component offers a tool to program, select, modify and monitor interactive training narratives using a high-level representation. The STAGE offers the following features: i) a high-level API to program pedagogical narratives focusing on a specific public speaking situation and training objectives, ii) an interactive visualization interface iii) computation and visualization of user metrics, iv) a semi-autonomous virtual audience composed of virtual spectators with automatic reactions to the speaker and surrounding spectators while following the pedagogical plan V) and the possibility for the instructor to embody a virtual spectator to ask questions or guide the speaker from within the Virtual Environment. We present here the design, and implementation of the tutoring system and its integration in STAGE, and discuss its reception by end-users.
Virtual environments (VEs) can evoke and support emotions, as experienced when playing emotionally arousing games. We theoretically approach the design of fear and joy evoking VEs based on a literature review of empirical studies on virtual and real environments as well as video games’ reviews and content analyses. We define the design space and identify central design elements that evoke specific positive and negative emotions. Based on that, we derive and present guidelines for emotion-inducing VE design with respect to design themes, colors and textures, and lighting configurations. To validate our guidelines in two user studies, we 1) expose participants to 360° videos of VEs designed following the individual guidelines and 2) immerse them in a neutral, positive and negative emotion-inducing VEs combining all respective guidelines in Virtual Reality. The results support our theoretically derived guidelines by revealing significant differences in terms of fear and joy induction.
The rapid development of green and sustainable materials opens up new possibilities in the field of applied research. Such materials include nanocellulose composites that can integrate many components into composites and provide a good chassis for smart devices. In our study, we evaluate four approaches for turning a nanocellulose composite into an information storage or processing device: 1) nanocellulose can be a suitable carrier material and protect information stored in DNA. 2) Nucleotide-processing enzymes (polymerase and exonuclease) can be controlled by light after fusing them with light-gating domains; nucleotide substrate specificity can be changed by mutation or pH change (read-in and read-out of the information). 3) Semiconductors and electronic capabilities can be achieved: we show that nanocellulose is rendered electronic by iodine treatment replacing silicon including microstructures. Nanocellulose semiconductor properties are measured, and the resulting potential including single-electron transistors (SET) and their properties are modeled. Electric current can also be transported by DNA through G-quadruplex DNA molecules; these as well as classical silicon semiconductors can easily be integrated into the nanocellulose composite. 4) To elaborate upon miniaturization and integration for a smart nanocellulose chip device, we demonstrate pH-sensitive dyes in nanocellulose, nanopore creation, and kinase micropatterning on bacterial membranes as well as digital PCR micro-wells. Future application potential includes nano-3D printing and fast molecular processors (e.g., SETs) integrated with DNA storage and conventional electronics. This would also lead to environment-friendly nanocellulose chips for information processing as well as smart nanocellulose composites for biomedical applications and nano-factories.
A key feature for Internet of Things (IoT) is to control what content is available to each user. To handle this access management, encryption schemes can be used. Due to the diverse usage of encryption schemes, there are various realizations of 1-to-1, 1-to-n, and n-to-n schemes in the literature. This multitude of encryption methods with a wide variety of properties presents developers with the challenge of selecting the optimal method for a particular use case, which is further complicated by the fact that there is no overview of existing encryption schemes. To fill this gap, we envision a cryptography encyclopedia providing such an overview of existing encryption schemes. In this survey paper, we take a first step towards such an encyclopedia by creating a sub-encyclopedia for secure group communication (SGC) schemes, which belong to the n-to-n category. We extensively surveyed the state-of-the-art and classified 47 different schemes. More precisely, we provide (i) a comprehensive overview of the relevant security features, (ii) a set of relevant performance metrics, (iii) a classification for secure group communication schemes, and (iv) workflow descriptions of the 47 schemes. Moreover, we perform a detailed performance and security evaluation of the 47 secure group communication schemes. Based on this evaluation, we create a guideline for the selection of secure group communication schemes.
Around 4.9 billion Internet users worldwide watch billions of hours of online video every day. As a result, streaming is by far the predominant type of traffic in communication networks. According to Google statistics, three out of five video views come from mobile devices. Thus, in view of the continuous technological advances in end devices and increasing mobile use, datasets for mobile streaming are indispensable in research but only sparsely dealt with in literature so far. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at network, transport, and application layer with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3 G/4 G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.
Background
Machine learning, especially deep learning, is becoming more and more relevant in research and development in the medical domain. For all the supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Especially gastroenterological data, which often involves endoscopic videos, are cumbersome to annotate. Domain experts are needed to interpret and annotate the videos. To support those domain experts, we generated a framework. With this framework, instead of annotating every frame in the video sequence, experts are just performing key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Subsequently, non-expert annotators supported by machine learning add the missing annotations for the frames in-between.
Methods
In our framework, an expert reviews the video and annotates a few video frames to verify the object’s annotations for the non-expert. In a second step, a non-expert has visual confirmation of the given object and can annotate all following and preceding frames with AI assistance. After the expert has finished, relevant frames will be selected and passed on to an AI model. This information allows the AI model to detect and mark the desired object on all following and preceding frames with an annotation. Therefore, the non-expert can adjust and modify the AI predictions and export the results, which can then be used to train the AI model.
Results
Using this framework, we were able to reduce workload of domain experts on average by a factor of 20 on our data. This is primarily due to the structure of the framework, which is designed to minimize the workload of the domain expert. Pairing this framework with a state-of-the-art semi-automated AI model enhances the annotation speed further. Through a prospective study with 10 participants, we show that semi-automated annotation using our tool doubles the annotation speed of non-expert annotators compared to a well-known state-of-the-art annotation tool.
Conclusion
In summary, we introduce a framework for fast expert annotation for gastroenterologists, which reduces the workload of the domain expert considerably while maintaining a very high annotation quality. The framework incorporates a semi-automated annotation system utilizing trained object detection models. The software and framework are open-source.
Towards LoRaWAN without data loss: studying the performance of different channel access approaches
(2022)
The Long Range Wide Area Network (LoRaWAN) is one of the fastest growing Internet of Things (IoT) access protocols. It operates in the license free 868 MHz band and gives everyone the possibility to create their own small sensor networks. The drawback of this technology is often unscheduled or random channel access, which leads to message collisions and potential data loss. For that reason, recent literature studies alternative approaches for LoRaWAN channel access. In this work, state-of-the-art random channel access is compared with alternative approaches from the literature by means of collision probability. Furthermore, a time scheduled channel access methodology is presented to completely avoid collisions in LoRaWAN. For this approach, an exhaustive simulation study was conducted and the performance was evaluated with random access cross-traffic. In a general theoretical analysis the limits of the time scheduled approach are discussed to comply with duty cycle regulations in LoRaWAN.
The landscape of today’s programming languages is manifold. With the diversity of applications, the difficulty of adequately addressing and specifying the used programs increases. This often leads to newly designed and implemented domain-specific languages. They enable domain experts to express knowledge in their preferred format, resulting in more readable and concise programs. Due to its flexible and declarative syntax without reserved keywords, the logic programming language Prolog is particularly suitable for defining and embedding domain-specific languages.
This thesis addresses the questions and challenges that arise when integrating domain-specific languages into Prolog. We compare the two approaches to define them either externally or internally, and provide assisting tools for each. The grammar of a formal language is usually defined in the extended Backus–Naur form. In this work, we handle this formalism as a domain-specific language in Prolog, and define term expansions that allow to translate it into equivalent definite clause grammars. We present the package library(dcg4pt) for SWI-Prolog, which enriches them by an additional argument to automatically process the term’s corresponding parse tree. To simplify the work with definite clause grammars, we visualise their application by a web-based tracer.
The external integration of domain-specific languages requires the programmer to keep the grammar, parser, and interpreter in sync. In many cases, domain-specific languages can instead be directly embedded into Prolog by providing appropriate operator definitions. In addition, we propose syntactic extensions for Prolog to expand its expressiveness, for instance to state logic formulas with their connectives verbatim. This allows to use all tools that were originally written for Prolog, for instance code linters and editors with syntax highlighting. We present the package library(plammar), a standard-compliant parser for Prolog source code, written in Prolog. It is able to automatically infer from example sentences the required operator definitions with their classes and precedences as well as the required Prolog language extensions. As a result, we can automatically answer the question: Is it possible to model these example sentences as valid Prolog clauses, and how?
We discuss and apply the two approaches to internal and external integrations for several domain-specific languages, namely the extended Backus–Naur form, GraphQL, XPath, and a controlled natural language to represent expert rules in if-then form. The created toolchain with library(dcg4pt) and library(plammar) yields new application opportunities for static Prolog source code analysis, which we also present.
Presence is often considered the most important quale describing the subjective feeling of being in a computer-generated and/or computer-mediated virtual environment. The identification and separation of orthogonal presence components, i.e., the place illusion and the plausibility illusion, has been an accepted theoretical model describing Virtual Reality (VR) experiences for some time. This perspective article challenges this presence-oriented VR theory. First, we argue that a place illusion cannot be the major construct to describe the much wider scope of virtual, augmented, and mixed reality (VR, AR, MR: or XR for short). Second, we argue that there is no plausibility illusion but merely plausibility, and we derive the place illusion caused by the congruent and plausible generation of spatial cues and similarly for all the current model’s so-defined illusions. Finally, we propose congruence and plausibility to become the central essential conditions in a novel theoretical model describing XR experiences and effects.