004 Datenverarbeitung; Informatik
Refine
Has Fulltext
- yes (183)
Year of publication
Document Type
- Journal article (74)
- Doctoral Thesis (63)
- Working Paper (37)
- Conference Proceeding (7)
- Report (2)
Language
- English (183) (remove)
Keywords
- Datennetz (14)
- Leistungsbewertung (13)
- virtual reality (12)
- Robotik (8)
- Mobiler Roboter (6)
- Autonomer Roboter (5)
- Komplexitätstheorie (5)
- Optimierung (5)
- P4 (5)
- Theoretische Informatik (5)
Institute
- Institut für Informatik (183) (remove)
Schriftenreihe
Sonstige beteiligte Institutionen
The DAEDALUS mission concept aims at exploring and characterising the entrance and initial part of Lunar lava tubes within a compact, tightly integrated spherical robotic device, with a complementary payload set and autonomous capabilities.
The mission concept addresses specifically the identification and characterisation of potential resources for future ESA exploration, the local environment of the subsurface and its geologic and compositional structure.
A sphere is ideally suited to protect sensors and scientific equipment in rough, uneven environments.
It will house laser scanners, cameras and ancillary payloads.
The sphere will be lowered into the skylight and will explore the entrance shaft, associated caverns and conduits. Lidar (light detection and ranging) systems produce 3D models with high spatial accuracy independent of lighting conditions and visible features.
Hence this will be the primary exploration toolset within the sphere.
The additional payload that can be accommodated in the robotic sphere consists of camera systems with panoramic lenses and scanners such as multi-wavelength or single-photon scanners.
A moving mass will trigger movements.
The tether for lowering the sphere will be used for data communication and powering the equipment during the descending phase.
Furthermore, the connector tether-sphere will host a WIFI access point, such that data of the conduit can be transferred to the surface relay station. During the exploration phase, the robot will be disconnected from the cable, and will use wireless communication.
Emergency autonomy software will ensure that in case of loss of communication, the robot will continue the nominal mission.
Mapping and localization of mobile robots in an unknown environment are essential for most high-level operations like autonomous navigation or exploration. This paper presents a novel approach for combining estimated trajectories, namely curvefusion. The robot used in the experiments is equipped with a horizontally mounted 2D profiler, a constantly spinning 3D laser scanner and a GPS module. The proposed algorithm first combines trajectories from different sensors to optimize poses of the planar three degrees of freedom (DoF) trajectory, which is then fed into continuous-time simultaneous localization and mapping (SLAM) to further improve the trajectory. While state-of-the-art multi-sensor fusion methods mainly focus on probabilistic methods, our approach instead adopts a deformation-based method to optimize poses. To this end, a similarity metric for curved shapes is introduced into the robotics community to fuse the estimated trajectories. Additionally, a shape-based point correspondence estimation method is applied to the multi-sensor time calibration. Experiments show that the proposed fusion method can achieve relatively better accuracy, even if the error of the trajectory before fusion is large, which demonstrates that our method can still maintain a certain degree of accuracy in an environment where typical pose estimation methods have poor performance. In addition, the proposed time-calibration method also achieves high accuracy in estimating point correspondences.
In recent years several community testbeds as well as participatory sensing platforms have successfully established themselves to provide open data to everyone interested. Each of them with a specific goal in mind, ranging from collecting radio coverage data up to environmental and radiation data. Such data can be used by the community in their decision making, whether to subscribe to a specific mobile phone service that provides good coverage in an area or in finding a sunny and warm region for the summer holidays.
However, the existing platforms are usually limiting themselves to directly measurable network QoS. If such a crowdsourced data set provides more in-depth derived measures, this would enable an even better decision making. A community-driven crowdsensing platform that derives spatial application-layer user experience from resource-friendly bandwidth estimates would be such a case, video streaming services come to mind as a prime example. In this paper we present a concept for such a system based on an initial prototype that eases the collection of data necessary to determine mobile-specific QoE at large scale. In addition we reason why the simple quality metric proposed here can hold its own.
In many cases, problems, data, or information can be modeled as graphs. Graphs can be used as a tool for modeling in any case where connections between distinguishable objects occur. Any graph consists of a set of objects, called vertices, and a set of connections, called edges, such that any edge connects a pair of vertices. For example, a social network can be modeled by a graph by
transforming the users of the network into vertices and friendship relations between users into edges. Also physical networks like computer networks or transportation networks, for example, the metro network of a city, can be seen as graphs.
For making graphs and, thereby, the data that is modeled, well-understandable for users, we need a visualization. Graph drawing deals with algorithms for visualizing graphs. In this thesis, especially the use of crossings and curves is investigated for graph drawing problems under additional constraints. The constraints that occur in the problems investigated in this thesis especially restrict the positions of (a part of) the vertices; this is done either as a hard constraint or as an optimization criterion.
Climate models are the tool of choice for scientists researching climate change. Like all models they suffer from errors, particularly systematic and location-specific representation errors. One way to reduce these errors is model output statistics (MOS) where the model output is fitted to observational data with machine learning. In this work, we assess the use of convolutional Deep Learning climate MOS approaches and present the ConvMOS architecture which is specifically designed based on the observation that there are systematic and location-specific errors in the precipitation estimates of climate models. We apply ConvMOS models to the simulated precipitation of the regional climate model REMO, showing that a combination of per-location model parameters for reducing location-specific errors and global model parameters for reducing systematic errors is indeed beneficial for MOS performance. We find that ConvMOS models can reduce errors considerably and perform significantly better than three commonly used MOS approaches and plain ResNet and U-Net models in most cases. Our results show that non-linear MOS models underestimate the number of extreme precipitation events, which we alleviate by training models specialized towards extreme precipitation events with the imbalanced regression method DenseLoss. While we consider climate MOS, we argue that aspects of ConvMOS may also be beneficial in other domains with geospatial data, such as air pollution modeling or weather forecasts.
This article presents a novel method for controlling a virtual audience system (VAS) in Virtual Reality (VR) application, called STAGE, which has been originally designed for supervised public speaking training in university seminars dedicated to the preparation and delivery of scientific talks. We are interested in creating pedagogical narratives: narratives encompass affective phenomenon and rather than organizing events changing the course of a training scenario, pedagogical plans using our system focus on organizing the affects it arouses for the trainees. Efficiently controlling a virtual audience towards a specific training objective while evaluating the speaker’s performance presents a challenge for a seminar instructor: the high level of cognitive and physical demands required to be able to control the virtual audience, whilst evaluating speaker’s performance, adjusting and allowing it to quickly react to the user’s behaviors and interactions. It is indeed a critical limitation of a number of existing systems that they rely on a Wizard of Oz approach, where the tutor drives the audience in reaction to the user’s performance. We address this problem by integrating with a VAS a high-level control component for tutors, which allows using predefined audience behavior rules, defining custom ones, as well as intervening during run-time for finer control of the unfolding of the pedagogical plan. At its core, this component offers a tool to program, select, modify and monitor interactive training narratives using a high-level representation. The STAGE offers the following features: i) a high-level API to program pedagogical narratives focusing on a specific public speaking situation and training objectives, ii) an interactive visualization interface iii) computation and visualization of user metrics, iv) a semi-autonomous virtual audience composed of virtual spectators with automatic reactions to the speaker and surrounding spectators while following the pedagogical plan V) and the possibility for the instructor to embody a virtual spectator to ask questions or guide the speaker from within the Virtual Environment. We present here the design, and implementation of the tutoring system and its integration in STAGE, and discuss its reception by end-users.
Constraining graph layouts - that is, restricting the placement of vertices and the routing of edges to obey certain constraints - is common practice in graph drawing.
In this book, we discuss algorithmic results on two different restriction types:
placing vertices on the outer face and on the integer grid.
For the first type, we look into the outer k-planar and outer k-quasi-planar graphs, as well as giving a linear-time algorithm to recognize full and closed outer k-planar graphs Monadic Second-order Logic.
For the second type, we consider the problem of transferring a given planar drawing onto the integer grid while perserving the original drawings topology;
we also generalize a variant of Cauchy's rigidity theorem for orthogonal polyhedra of genus 0 to those of arbitrary genus.
Presence is often considered the most important quale describing the subjective feeling of being in a computer-generated and/or computer-mediated virtual environment. The identification and separation of orthogonal presence components, i.e., the place illusion and the plausibility illusion, has been an accepted theoretical model describing Virtual Reality (VR) experiences for some time. This perspective article challenges this presence-oriented VR theory. First, we argue that a place illusion cannot be the major construct to describe the much wider scope of virtual, augmented, and mixed reality (VR, AR, MR: or XR for short). Second, we argue that there is no plausibility illusion but merely plausibility, and we derive the place illusion caused by the congruent and plausible generation of spatial cues and similarly for all the current model’s so-defined illusions. Finally, we propose congruence and plausibility to become the central essential conditions in a novel theoretical model describing XR experiences and effects.
Complexity and Partitions
(2001)
Computational complexity theory usually investigates the complexity of sets, i.e., the complexity of partitions into two parts. But often it is more appropriate to represent natural problems by partitions into more than two parts. A particularly interesting class of such problems consists of classification problems for relations. For instance, a binary relation R typically defines a partitioning of the set of all pairs (x,y) into four parts, classifiable according to the cases where R(x,y) and R(y,x) hold, only R(x,y) or only R(y,x) holds or even neither R(x,y) nor R(y,x) is true. By means of concrete classification problems such as Graph Embedding or Entailment (for propositional logic), this thesis systematically develops tools, in shape of the boolean hierarchy of NP-partitions and its refinements, for the qualitative analysis of the complexity of partitions generated by NP-relations. The Boolean hierarchy of NP-partitions is introduced as a generalization of the well-known and well-studied Boolean hierarchy (of sets) over NP. Whereas the latter hierarchy has a very simple structure, the situation is much more complicated for the case of partitions into at least three parts. To get an idea of this hierarchy, alternative descriptions of the partition classes are given in terms of finite, labeled lattices. Based on these characterizations the Embedding Conjecture is established providing the complete information on the structure of the hierarchy. This conjecture is supported by several results. A natural extension of the Boolean hierarchy of NP-partitions emerges from the lattice-characterization of its classes by considering partition classes generated by finite, labeled posets. It turns out that all significant ideas translate from the case of lattices. The induced refined Boolean hierarchy of NP-partitions enables us more accuratly capturing the complexity of certain relations (such as Graph Embedding) and a description of projectively closed partition classes.
We consider competitive location problems where two competing providers place their facilities sequentially and users can decide between the competitors. We assume that both competitors act non-cooperatively and aim at maximizing their own benefits. We investigate the complexity and approximability of such problems on graphs, in particular on simple graph classes such as trees and paths. We also develop fast algorithms for single competitive location problems where each provider places a single facilty. Voting location, in contrast, aims at identifying locations that meet social criteria. The provider wants to satisfy the users (customers) of the facility to be opened. In general, there is no location that is favored by all users. Therefore, a satisfactory compromise has to be found. To this end, criteria arising from voting theory are considered. The solution of the location problem is understood as the winner of a virtual election among the users of the facilities, in which the potential locations play the role of the candidates and the users represent the voters. Competitive and voting location problems turn out to be closely related.
Scalability is often mentioned in literature, but a stringent definition is missing. In particular, there is no general scalability assessment which clearly indicates whether a system scales or not or whether a system scales better than another. The key contribution of this article is the definition of a scalability index (SI) which quantifies if a system scales in comparison to another system, a hypothetical system, e.g., linear system, or the theoretically optimal system. The suggested SI generalizes different metrics from literature, which are specialized cases of our SI. The primary target of our scalability framework is, however, benchmarking of two systems, which does not require any reference system. The SI is demonstrated and evaluated for different use cases, that are (1) the performance of an IoT load balancer depending on the system load, (2) the availability of a communication system depending on the size and structure of the network, (3) scalability comparison of different location selection mechanisms in fog computing with respect to delays and energy consumption; (4) comparison of time-sensitive networking (TSN) mechanisms in terms of efficiency and utilization. Finally, we discuss how to use and how not to use the SI and give recommendations and guidelines in practice. To the best of our knowledge, this is the first work which provides a general SI for the comparison and benchmarking of systems, which is the primary target of our scalability analysis.
Today’s advanced Internet-of-Things applications raise technical challenges on cloud, edge, and fog computing. The design of an efficient, virtualized, context-aware, self-configuring orchestration system of a fog computing system constitutes a major development effort within this very innovative area of research. In this paper we describe the architecture and relevant implementation aspects of a cloudless resource monitoring system interworking with an SDN/NFV infrastructure. It realizes the basic monitoring component of the fundamental MAPE-K principles employed in autonomic computing. Here we present the hierarchical layering and functionality within the underlying fog nodes to generate a working prototype of an intelligent, self-managed orchestrator for advanced IoT applications and services. The latter system has the capability to monitor automatically various performance aspects of the resource allocation among multiple hosts of a fog computing system interconnected by SDN.
CLIP knows image aesthetics
(2022)
Most Image Aesthetic Assessment (IAA) methods use a pretrained ImageNet classification model as a base to fine-tune. We hypothesize that content classification is not an optimal pretraining task for IAA, since the task discourages the extraction of features that are useful for IAA, e.g., composition, lighting, or style. On the other hand, we argue that the Contrastive Language-Image Pretraining (CLIP) model is a better base for IAA models, since it has been trained using natural language supervision. Due to the rich nature of language, CLIP needs to learn a broad range of image features that correlate with sentences describing the image content, composition, environments, and even subjective feelings about the image. While it has been shown that CLIP extracts features useful for content classification tasks, its suitability for tasks that require the extraction of style-based features like IAA has not yet been shown. We test our hypothesis by conducting a three-step study, investigating the usefulness of features extracted by CLIP compared to features obtained from the last layer of a comparable ImageNet classification model. In each step, we get more computationally expensive. First, we engineer natural language prompts that let CLIP assess an image's aesthetic without adjusting any weights in the model. To overcome the challenge that CLIP's prompting only is applicable to classification tasks, we propose a simple but effective strategy to convert multiple prompts to a continuous scalar as required when predicting an image's mean aesthetic score. Second, we train a linear regression on the AVA dataset using image features obtained by CLIP's image encoder. The resulting model outperforms a linear regression trained on features from an ImageNet classification model. It also shows competitive performance with fully fine-tuned networks based on ImageNet, while only training a single layer. Finally, by fine-tuning CLIP's image encoder on the AVA dataset, we show that CLIP only needs a fraction of training epochs to converge, while also performing better than a fine-tuned ImageNet model. Overall, our experiments suggest that CLIP is better suited as a base model for IAA methods than ImageNet pretrained networks.
The emerging serverless computing may meet Edge Cloud in a beneficial manner as the two offer flexibility and dynamicity in optimizing finite hardware resources. However, the lack of proper study of a joint platform leaves a gap in literature about consumption and performance of such integration. To this end, this paper identifies the key questions and proposes a methodology to answer them.
Mindfulness is considered an important factor of an individual's subjective well-being. Consequently, Human-Computer Interaction (HCI) has investigated approaches that strengthen mindfulness, i.e., by inventing multimedia technologies to support mindfulness meditation. These approaches often use smartphones, tablets, or consumer-grade desktop systems to allow everyday usage in users' private lives or in the scope of organized therapies. Virtual, Augmented, and Mixed Reality (VR, AR, MR; in short: XR) significantly extend the design space for such approaches. XR covers a wide range of potential sensory stimulation, perceptive and cognitive manipulations, content presentation, interaction, and agency. These facilities are linked to typical XR-specific perceptions that are conceptually closely related to mindfulness research, such as (virtual) presence and (virtual) embodiment. However, a successful exploitation of XR that strengthens mindfulness requires a systematic analysis of the potential interrelation and influencing mechanisms between XR technology, its properties, factors, and phenomena and existing models and theories of the construct of mindfulness. This article reports such a systematic analysis of XR-related research from HCI and life sciences to determine the extent to which existing research frameworks on HCI and mindfulness can be applied to XR technologies, the potential of XR technologies to support mindfulness, and open research gaps. Fifty papers of ACM Digital Library and National Institutes of Health's National Library of Medicine (PubMed) with and without empirical efficacy evaluation were included in our analysis. The results reveal that at the current time, empirical research on XR-based mindfulness support mainly focuses on therapy and therapeutic outcomes. Furthermore, most of the currently investigated XR-supported mindfulness interactions are limited to vocally guided meditations within nature-inspired virtual environments. While an analysis of empirical research on those systems did not reveal differences in mindfulness compared to non-mediated mindfulness practices, various design proposals illustrate that XR has the potential to provide interactive and body-based innovations for mindfulness practice. We propose a structured approach for future work to specify and further explore the potential of XR as mindfulness-support. The resulting framework provides design guidelines for XR-based mindfulness support based on the elements and psychological mechanisms of XR interactions.
This article presents an immersive virtual reality (VR) system for training classroom management skills, with a specific focus on learning to manage disruptive student behavior in face-to-face, one-to-many teaching scenarios. The core of the system is a real-time 3D virtual simulation of a classroom populated by twenty-four semi-autonomous virtual students. The system has been designed as a companion tool for classroom management seminars in a syllabus for primary and secondary school teachers. This will allow lecturers to link theory with practice using the medium of VR. The system is therefore designed for two users: a trainee teacher and an instructor supervising the training session. The teacher is immersed in a real-time 3D simulation of a classroom by means of a head-mounted display and headphone. The instructor operates a graphical desktop console, which renders a view of the class and the teacher whose avatar movements are captured by a marker less tracking system. This console includes a 2D graphics menu with convenient behavior and feedback control mechanisms to provide human-guided training sessions. The system is built using low-cost consumer hardware and software. Its architecture and technical design are described in detail. A first evaluation confirms its conformance to critical usability requirements (i.e., safety and comfort, believability, simplicity, acceptability, extensibility, affordability, and mobility). Our initial results are promising and constitute the necessary first step toward a possible investigation of the efficiency and effectiveness of such a system in terms of learning outcomes and experience.
This paper describes the estimation of the body weight of a person in front of an RGB-D camera. A survey of different methods for body weight estimation based on depth sensors is given. First, an estimation of people standing in front of a camera is presented. Second, an approach based on a stream of depth images is used to obtain the body weight of a person walking towards a sensor. The algorithm first extracts features from a point cloud and forwards them to an artificial neural network (ANN) to obtain an estimation of body weight. Besides the algorithm for the estimation, this paper further presents an open-access dataset based on measurements from a trauma room in a hospital as well as data from visitors of a public event. In total, the dataset contains 439 measurements. The article illustrates the efficiency of the approach with experiments with persons lying down in a hospital, standing persons, and walking persons. Applicable scenarios for the presented algorithm are body weight-related dosing of emergency patients.
The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.
Background
Colorectal cancer is a leading cause of cancer-related deaths worldwide. The best method to prevent CRC is a colonoscopy. However, not all colon polyps have the risk of becoming cancerous. Therefore, polyps are classified using different classification systems. After the classification, further treatment and procedures are based on the classification of the polyp. Nevertheless, classification is not easy. Therefore, we suggest two novel automated classifications system assisting gastroenterologists in classifying polyps based on the NICE and Paris classification.
Methods
We build two classification systems. One is classifying polyps based on their shape (Paris). The other classifies polyps based on their texture and surface patterns (NICE). A two-step process for the Paris classification is introduced: First, detecting and cropping the polyp on the image, and secondly, classifying the polyp based on the cropped area with a transformer network. For the NICE classification, we design a few-shot learning algorithm based on the Deep Metric Learning approach. The algorithm creates an embedding space for polyps, which allows classification from a few examples to account for the data scarcity of NICE annotated images in our database.
Results
For the Paris classification, we achieve an accuracy of 89.35 %, surpassing all papers in the literature and establishing a new state-of-the-art and baseline accuracy for other publications on a public data set. For the NICE classification, we achieve a competitive accuracy of 81.13 % and demonstrate thereby the viability of the few-shot learning paradigm in polyp classification in data-scarce environments. Additionally, we show different ablations of the algorithms. Finally, we further elaborate on the explainability of the system by showing heat maps of the neural network explaining neural activations.
Conclusion
Overall we introduce two polyp classification systems to assist gastroenterologists. We achieve state-of-the-art performance in the Paris classification and demonstrate the viability of the few-shot learning paradigm in the NICE classification, addressing the prevalent data scarcity issues faced in medical machine learning.