Institut für Informatik
Refine
Has Fulltext
- yes (142)
Is part of the Bibliography
- yes (142)
Year of publication
Document Type
- Journal article (142) (remove)
Language
- English (142) (remove)
Keywords
- virtual reality (16)
- machine learning (9)
- deep learning (6)
- immersion (4)
- Autonomous UAV (3)
- Quadrocopter (3)
- Quadrotor (3)
- UAV (3)
- artificial intelligence (3)
- augmented reality (3)
Institute
- Institut für Informatik (142)
- Institut Mensch - Computer - Medien (10)
- Medizinische Klinik und Poliklinik I (6)
- Medizinische Klinik und Poliklinik II (6)
- Institut für Sportwissenschaft (5)
- Deutsches Zentrum für Herzinsuffizienz (DZHI) (3)
- Institut für Klinische Epidemiologie und Biometrie (3)
- Theodor-Boveri-Institut für Biowissenschaften (3)
- Institut für Geographie und Geologie (2)
- Institut für Psychologie (2)
Sonstige beteiligte Institutionen
Semantic Fusion for Natural Multimodal Interfaces using Concurrent Augmented Transition Networks
(2018)
Semantic fusion is a central requirement of many multimodal interfaces. Procedural methods like finite-state transducers and augmented transition networks have proven to be beneficial to implement semantic fusion. They are compliant with rapid development cycles that are common for the development of user interfaces, in contrast to machine-learning approaches that require time-costly training and optimization. We identify seven fundamental requirements for the implementation of semantic fusion: Action derivation, continuous feedback, context-sensitivity, temporal relation support, access to the interaction context, as well as the support of chronologically unsorted and probabilistic input. A subsequent analysis reveals, however, that there is currently no solution for fulfilling the latter two requirements. As the main contribution of this article, we thus present the Concurrent Cursor concept to compensate these shortcomings. In addition, we showcase a reference implementation, the Concurrent Augmented Transition Network (cATN), that validates the concept’s feasibility in a series of proof of concept demonstrations as well as through a comparative benchmark. The cATN fulfills all identified requirements and fills the lack amongst previous solutions. It supports the rapid prototyping of multimodal interfaces by means of five concrete traits: Its declarative nature, the recursiveness of the underlying transition network, the network abstraction constructs of its description language, the utilized semantic queries, and an abstraction layer for lexical information. Our reference implementation was and is used in various student projects, theses, as well as master-level courses. It is openly available and showcases that non-experts can effectively implement multimodal interfaces, even for non-trivial applications in mixed and virtual reality.
Descriptors play an important role in point cloud registration. The current state-of-the-art resorts to the high regression capability of deep learning. However, recent deep learning-based descriptors require different levels of annotation and selection of patches, which make the model hard to migrate to new scenarios. In this work, we learn local registration descriptors for point clouds
in a self-supervised manner. In each iteration of the training, the input of the network is merely one unlabeled point cloud. Thus, the whole training requires no manual annotation and manual selection of patches. In addition, we propose to involve keypoint sampling into the pipeline, which further improves the performance of our model. Our experiments demonstrate the capability of our self-supervised local descriptor to achieve even better performance than the supervised model, while being easier to train and requiring no data labeling.
In this paper we study connectivity augmentation problems. Given a connected graph G with some desirable property, we want to make G 2-vertex connected (or 2-edge connected) by adding edges such that the resulting graph keeps the property. The aim is to add as few edges as possible. The property that we consider is planarity, both in an abstract graph-theoretic and in a geometric setting, where vertices correspond to points in the plane and edges to straight-line segments.
We show that it is NP-hard to nd a minimum-cardinality augmentation that makes a planar graph 2-edge connected. For making a planar graph 2-vertex connected this was known. We further show that both problems are hard in the geometric setting, even when restricted to trees. The problems remain hard for higher degrees of connectivity. On the other hand we give polynomial-time algorithms for the special case of convex geometric graphs.
We also study the following related problem. Given a planar (plane geometric) graph G, two vertices s and t of G, and an integer c, how many edges have to be added to G such that G is still planar (plane geometric) and contains c edge- (or vertex-) disjoint s{t paths? For the planar case we give a linear-time algorithm for c = 2. For the plane geometric case we give optimal worst-case bounds for c = 2; for c = 3 we characterize the cases that have a solution.
Virtual reality applications employing avatar embodiment typically use virtual mirrors to allow users to perceive their digital selves not only from a first-person but also from a holistic third-person perspective. However, due to distance-related biases such as the distance compression effect or a reduced relative rendering resolution, the self-observation distance (SOD) between the user and the virtual mirror might influence how users perceive their embodied avatar. Our article systematically investigates the effects of a short (1 m), middle (2.5 m), and far (4 m) SOD between users and mirror on the perception of their personalized and self-embodied avatars. The avatars were photorealistic reconstructed using state-of-the-art photogrammetric methods. Thirty participants repeatedly faced their real-time animated self-embodied avatars in each of the three SOD conditions, where they were repeatedly altered in their body weight, and participants rated the 1) sense of embodiment, 2) body weight perception, and 3) affective appraisal towards their avatar. We found that the different SODs are unlikely to influence any of our measures except for the perceived body weight estimation difficulty. Here, the participants perceived the difficulty significantly higher for the farthest SOD. We further found that the participants’ self-esteem significantly impacted their ability to modify their avatar’s body weight to their current body weight and that it positively correlated with the perceived attractiveness of the avatar. Additionally, the participants’ concerns about their body shape affected how eerie they perceived their avatars. The participants’ self-esteem and concerns about their body shape influenced the perceived body weight estimation difficulty. We conclude that the virtual mirror in embodiment scenarios can be freely placed and varied at a distance of one to four meters from the user without expecting major effects on the perception of the avatar.
Background: The rehabilitation of gait disorders in patients with multiple sclerosis (MS) and stroke is often based on conventional treadmill training. Virtual reality (VR)-based treadmill training can increase motivation and improve therapy outcomes. The present study evaluated an immersive virtual reality application (using a head-mounted display, HMD) for gait rehabilitation with patients to (1) demonstrate its feasibility and acceptance and to (2) compare its short-term effects to a semi-immersive presentation (using a monitor) and a conventional treadmill training without VR to assess the usability of both systems and estimate the effects on walking speed and motivation. Methods: In a within-subjects study design, 36 healthy participants and 14 persons with MS or stroke participated in each of the three experimental conditions (VR via HMD, VR via monitor, treadmill training without VR). Results: For both groups, the walking speed in the HMD condition was higher than in treadmill training without VR and in the monitor condition. Healthy participants reported a higher motivation after the HMD condition as compared with the other conditions. Importantly, no side effects in the sense of simulator sickness occurred and usability ratings were high. No increases in heart rate were observed following the VR conditions. Presence ratings were higher for the HMD condition compared with the monitor condition for both user groups. Most of the healthy study participants (89%) and patients (71%) preferred the HMD-based training among the three conditions and most patients could imagine using it more frequently. Conclusions For the first time, the present study evaluated the usability of an immersive VR system for gait rehabilitation in a direct comparison with a semi-immersive system and a conventional training without VR with healthy participants and patients. The study demonstrated the feasibility of combining a treadmill training with immersive VR. Due to its high usability and low side effects, it might be particularly suited for patients to improve training motivation and training outcome e. g. the walking speed compared with treadmill training using no or only semi-immersive VR. Immersive VR systems still require specific technical setup procedures. This should be taken into account for specific clinical use-cases during a cost-benefit assessment.
Artificial Intelligence (AI) covers a broad spectrum of computational problems and use cases. Many of those implicate profound and sometimes intricate questions of how humans interact or should interact with AIs. Moreover, many users or future users do have abstract ideas of what AI is, significantly depending on the specific embodiment of AI applications. Human-centered-design approaches would suggest evaluating the impact of different embodiments on human perception of and interaction with AI. An approach that is difficult to realize due to the sheer complexity of application fields and embodiments in reality. However, here XR opens new possibilities to research human-AI interactions. The article’s contribution is twofold: First, it provides a theoretical treatment and model of human-AI interaction based on an XR-AI continuum as a framework for and a perspective of different approaches of XR-AI combinations. It motivates XR-AI combinations as a method to learn about the effects of prospective human-AI interfaces and shows why the combination of XR and AI fruitfully contributes to a valid and systematic investigation of human-AI interactions and interfaces. Second, the article provides two exemplary experiments investigating the aforementioned approach for two distinct AI-systems. The first experiment reveals an interesting gender effect in human-robot interaction, while the second experiment reveals an Eliza effect of a recommender system. Here the article introduces two paradigmatic implementations of the proposed XR testbed for human-AI interactions and interfaces and shows how a valid and systematic investigation can be conducted. In sum, the article opens new perspectives on how XR benefits human-centered AI design and development.
Plenty of theories, models, measures, and investigations target the understanding of virtual presence, i.e., the sense of presence in immersive Virtual Reality (VR). Other varieties of the so-called eXtended Realities (XR), e.g., Augmented and Mixed Reality (AR and MR) incorporate immersive features to a lesser degree and continuously combine spatial cues from the real physical space and the simulated virtual space. This blurred separation questions the applicability of the accumulated knowledge about the similarities of virtual presence and presence occurring in other varieties of XR, and corresponding outcomes. The present work bridges this gap by analyzing the construct of presence in mixed realities (MR). To achieve this, the following presents (1) a short review of definitions, dimensions, and measurements of presence in VR, and (2) the state of the art views on MR. Additionally, we (3) derived a working definition of MR, extending the Milgram continuum. This definition is based on entities reaching from real to virtual manifestations at one time point. Entities possess different degrees of referential power, determining the selection of the frame of reference. Furthermore, we (4) identified three research desiderata, including research questions about the frame of reference, the corresponding dimension of transportation, and the dimension of realism in MR. Mainly the relationship between the main aspects of virtual presence of immersive VR, i.e., the place-illusion, and the plausibility-illusion, and of the referential power of MR entities are discussed regarding the concept, measures, and design of presence in MR. Finally, (5) we suggested an experimental setup to reveal the research heuristic behind experiments investigating presence in MR. The present work contributes to the theories and the meaning of and approaches to simulate and measure presence in MR. We hypothesize that research about essential underlying factors determining user experience (UX) in MR simulations and experiences is still in its infancy and hopes this article provides an encouraging starting point to tackle related questions.
With the increasing adaptability and complexity of advisory artificial intelligence (AI)-based agents, the topics of explainable AI and human-centered AI are moving close together. Variations in the explanation itself have been widely studied, with some contradictory results. These could be due to users’ individual differences, which have rarely been systematically studied regarding their inhibiting or enabling effect on the fulfillment of explanation objectives (such as trust, understanding, or workload). This paper aims to shed light on the significance of human dimensions (gender, age, trust disposition, need for cognition, affinity for technology, self-efficacy, attitudes, and mind attribution) as well as their interplay with different explanation modes (no, simple, or complex explanation). Participants played the game Deal or No Deal while interacting with an AI-based agent. The agent gave advice to the participants on whether they should accept or reject the deals offered to them. As expected, giving an explanation had a positive influence on the explanation objectives. However, the users’ individual characteristics particularly reinforced the fulfillment of the objectives. The strongest predictor of objective fulfillment was the degree of attribution of human characteristics. The more human characteristics were attributed, the more trust was placed in the agent, advice was more likely to be accepted and understood, and important needs were satisfied during the interaction. Thus, the current work contributes to a better understanding of the design of explanations of an AI-based agent system that takes into account individual characteristics and meets the demand for both explainable and human-centered agent systems.
Social patterns and roles can develop when users talk to intelligent voice assistants (IVAs) daily. The current study investigates whether users assign different roles to devices and how this affects their usage behavior, user experience, and social perceptions. Since social roles take time to establish, we equipped 106 participants with Alexa or Google assistants and some smart home devices and observed their interactions for nine months. We analyzed diverse subjective (questionnaire) and objective data (interaction data). By combining social science and data science analyses, we identified two distinct clusters—users who assigned a friendship role to IVAs over time and users who did not. Interestingly, these clusters exhibited significant differences in their usage behavior, user experience, and social perceptions of the devices. For example, participants who assigned a role to IVAs attributed more friendship to them used them more frequently, reported more enjoyment during interactions, and perceived more empathy for IVAs. In addition, these users had distinct personal requirements, for example, they reported more loneliness. This study provides valuable insights into the role-specific effects and consequences of voice assistants. Recent developments in conversational language models such as ChatGPT suggest that the findings of this study could make an important contribution to the design of dialogic human–AI interactions.
Even today, the automatic digitisation of scanned documents in general, but especially the automatic optical music recognition (OMR) of historical manuscripts, still remains an enormous challenge, since both handwritten musical symbols and text have to be identified. This paper focuses on the Medieval so-called square notation developed in the 11th–12th century, which is already composed of staff lines, staves, clefs, accidentals, and neumes that are roughly spoken connected single notes. The aim is to develop an algorithm that captures both the neumes, and in particular its melody, which can be used to reconstruct the original writing. Our pipeline is similar to the standard OMR approach and comprises a novel staff line and symbol detection algorithm based on deep Fully Convolutional Networks (FCN), which perform pixel-based predictions for either staff lines or symbols and their respective types. Then, the staff line detection combines the extracted lines to staves and yields an F\(_1\) -score of over 99% for both detecting lines and complete staves. For the music symbol detection, we choose a novel approach that skips the step to identify neumes and instead directly predicts note components (NCs) and their respective affiliation to a neume. Furthermore, the algorithm detects clefs and accidentals. Our algorithm predicts the symbol sequence of a staff with a diplomatic symbol accuracy rate (dSAR) of about 87%, which includes symbol type and location. If only the NCs without their respective connection to a neume, all clefs and accidentals are of interest, the algorithm reaches an harmonic symbol accuracy rate (hSAR) of approximately 90%. In general, the algorithm recognises a symbol in the manuscript with an F\(_1\) -score of over 96%.