004 Data processing; Computer science
CLIP knows image aesthetics
(2022)
Most Image Aesthetic Assessment (IAA) methods fine-tune a pretrained ImageNet classification model. We hypothesize that content classification is not an optimal pretraining task for IAA, since the task discourages the extraction of features that are useful for IAA, e.g., composition, lighting, or style. We argue instead that the Contrastive Language-Image Pretraining (CLIP) model is a better base for IAA models, since it has been trained using natural language supervision. Due to the rich nature of language, CLIP needs to learn a broad range of image features that correlate with sentences describing the image content, composition, environments, and even subjective feelings about the image. While it has been shown that CLIP extracts features useful for content classification tasks, its suitability for tasks that require the extraction of style-based features, such as IAA, has not yet been shown. We test our hypothesis in a three-step study that compares the usefulness of features extracted by CLIP with features obtained from the last layer of a comparable ImageNet classification model. Each step is more computationally expensive than the previous one. First, we engineer natural language prompts that let CLIP assess an image's aesthetics without adjusting any weights in the model. To overcome the challenge that CLIP's prompting is only applicable to classification tasks, we propose a simple but effective strategy to convert multiple prompts to a continuous scalar, as required when predicting an image's mean aesthetic score. Second, we train a linear regression on the AVA dataset using image features obtained by CLIP's image encoder. The resulting model outperforms a linear regression trained on features from an ImageNet classification model. It also shows competitive performance with fully fine-tuned networks based on ImageNet, while only training a single layer. 
Finally, by fine-tuning CLIP's image encoder on the AVA dataset, we show that CLIP only needs a fraction of training epochs to converge, while also performing better than a fine-tuned ImageNet model. Overall, our experiments suggest that CLIP is better suited as a base model for IAA methods than ImageNet pretrained networks.
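The abstract does not spell out how several prompts are converted to a continuous scalar. One plausible reading, sketched here purely as an assumption, is to take a softmax over the image-prompt similarities and use the resulting probabilities to weight a score assigned to each prompt. The function name, the similarity values, the per-prompt scores, and the temperature are all hypothetical; in practice the similarities would come from CLIP's image and text encoders.

```python
import math

def prompts_to_score(similarities, prompt_scores, temperature=0.01):
    """Collapse per-prompt similarities into one expected score.

    similarities:  image-prompt similarity per prompt (hypothetical values)
    prompt_scores: the score each prompt stands for (an assumption)
    """
    logits = [s / temperature for s in similarities]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    # Probability-weighted average of the per-prompt scores
    return sum(w / total * v for w, v in zip(weights, prompt_scores))

# Made-up similarities between one image and five prompts ranging from
# "a terrible photo" (score 1) to "an outstanding photo" (score 9)
example_similarities = [0.21, 0.22, 0.25, 0.24, 0.20]
example_scores = [1, 3, 5, 7, 9]
score = prompts_to_score(example_similarities, example_scores)
```

With identical similarities the result is simply the mean of the prompt scores; a clearly preferred prompt pulls the output towards its own score.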
This article presents a novel method for controlling a virtual audience system (VAS) in a Virtual Reality (VR) application called STAGE, which was originally designed for supervised public speaking training in university seminars dedicated to the preparation and delivery of scientific talks. We are interested in creating pedagogical narratives. Narratives encompass affective phenomena, and rather than organizing events that change the course of a training scenario, pedagogical plans built with our system focus on organizing the affects the scenario arouses in the trainees. Efficiently controlling a virtual audience towards a specific training objective while evaluating the speaker’s performance presents a challenge for a seminar instructor: controlling the virtual audience, adjusting it, and allowing it to react quickly to the user’s behaviors and interactions, all while evaluating the speaker’s performance, places high cognitive and physical demands on the instructor. A critical limitation of a number of existing systems is that they rely on a Wizard of Oz approach, where the tutor drives the audience in reaction to the user’s performance. We address this problem by integrating a high-level control component for tutors into a VAS. The component allows tutors to use predefined audience behavior rules, define custom ones, and intervene at run-time for finer control over how the pedagogical plan unfolds. At its core, this component offers a tool to program, select, modify, and monitor interactive training narratives using a high-level representation. 
STAGE offers the following features: i) a high-level API to program pedagogical narratives focusing on a specific public speaking situation and training objectives, ii) an interactive visualization interface, iii) computation and visualization of user metrics, iv) a semi-autonomous virtual audience composed of virtual spectators that react automatically to the speaker and to surrounding spectators while following the pedagogical plan, and v) the possibility for the instructor to embody a virtual spectator to ask questions or guide the speaker from within the virtual environment. We present the design and implementation of the tutoring system and its integration into STAGE, and discuss its reception by end-users.
Virtual environments (VEs) can evoke and support emotions, as experienced when playing emotionally arousing games. We theoretically approach the design of fear- and joy-evoking VEs based on a literature review of empirical studies on virtual and real environments as well as video game reviews and content analyses. We define the design space and identify central design elements that evoke specific positive and negative emotions. Based on that, we derive and present guidelines for emotion-inducing VE design with respect to design themes, colors and textures, and lighting configurations. To validate our guidelines in two user studies, we 1) expose participants to 360° videos of VEs designed following the individual guidelines and 2) immerse them in neutral, positive, and negative emotion-inducing VEs that combine all respective guidelines in Virtual Reality. The results support our theoretically derived guidelines by revealing significant differences in terms of fear and joy induction.
The rapid development of green and sustainable materials opens up new possibilities in the field of applied research. Such materials include nanocellulose composites that can integrate many components into composites and provide a good chassis for smart devices. In our study, we evaluate four approaches for turning a nanocellulose composite into an information storage or processing device: 1) nanocellulose can be a suitable carrier material and protect information stored in DNA. 2) Nucleotide-processing enzymes (polymerase and exonuclease) can be controlled by light after fusing them with light-gating domains; nucleotide substrate specificity can be changed by mutation or pH change (read-in and read-out of the information). 3) Semiconductors and electronic capabilities can be achieved: we show that nanocellulose is rendered electronic by iodine treatment replacing silicon including microstructures. Nanocellulose semiconductor properties are measured, and the resulting potential including single-electron transistors (SET) and their properties are modeled. Electric current can also be transported by DNA through G-quadruplex DNA molecules; these as well as classical silicon semiconductors can easily be integrated into the nanocellulose composite. 4) To elaborate upon miniaturization and integration for a smart nanocellulose chip device, we demonstrate pH-sensitive dyes in nanocellulose, nanopore creation, and kinase micropatterning on bacterial membranes as well as digital PCR micro-wells. Future application potential includes nano-3D printing and fast molecular processors (e.g., SETs) integrated with DNA storage and conventional electronics. This would also lead to environment-friendly nanocellulose chips for information processing as well as smart nanocellulose composites for biomedical applications and nano-factories.
A key feature of the Internet of Things (IoT) is the ability to control what content is available to each user. Encryption schemes can be used to handle this access management. Due to the diverse usage of encryption schemes, there are various realizations of 1-to-1, 1-to-n, and n-to-n schemes in the literature. This multitude of encryption methods with a wide variety of properties presents developers with the challenge of selecting the optimal method for a particular use case, which is further complicated by the fact that there is no overview of existing encryption schemes. To fill this gap, we envision a cryptography encyclopedia providing such an overview of existing encryption schemes. In this survey paper, we take a first step towards such an encyclopedia by creating a sub-encyclopedia for secure group communication (SGC) schemes, which belong to the n-to-n category. We extensively surveyed the state of the art and classified 47 different schemes. More precisely, we provide (i) a comprehensive overview of the relevant security features, (ii) a set of relevant performance metrics, (iii) a classification for secure group communication schemes, and (iv) workflow descriptions of the 47 schemes. Moreover, we perform a detailed performance and security evaluation of the 47 secure group communication schemes. Based on this evaluation, we create a guideline for the selection of secure group communication schemes.
Around 4.9 billion Internet users worldwide watch billions of hours of online video every day. As a result, streaming is by far the predominant type of traffic in communication networks. According to Google statistics, three out of five video views come from mobile devices. Thus, in view of the continuous technological advances in end devices and increasing mobile use, datasets for mobile streaming are indispensable in research but have so far been only sparsely covered in the literature. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at the network, transport, and application layers with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB of video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.
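Two of the quality indicators named above, initial playback delay and rebuffering events, can be derived from a timeline of player states. The sketch below assumes a made-up log format of (time in seconds, state) pairs; it is not the dataset's actual schema, and the function and state names are hypothetical.

```python
def streaming_indicators(events):
    """Derive simple quality indicators from a player state log.

    events: time-ordered list of (time_s, state) pairs, where state is
            "request", "playing", or "stalled" (hypothetical format).
    Returns (initial_delay_s, stall_count, total_stall_time_s).
    """
    request_time = events[0][0]
    first_play = next(t for t, state in events if state == "playing")
    initial_delay = first_play - request_time

    stall_durations = []
    stall_start = None
    for t, state in events:
        if state == "stalled" and stall_start is None:
            stall_start = t  # playback stopped to rebuffer
        elif state == "playing" and stall_start is not None:
            stall_durations.append(t - stall_start)  # playback resumed
            stall_start = None
    return initial_delay, len(stall_durations), sum(stall_durations)

# Made-up example run
log = [(0.0, "request"), (1.5, "playing"), (20.0, "stalled"),
       (23.0, "playing"), (40.0, "stalled"), (41.0, "playing")]
delay, stall_count, stall_time = streaming_indicators(log)
```

A stall that is still open at the end of the log is not counted here; whether to count it is a design choice left out of this sketch.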
Background
Machine learning, especially deep learning, is becoming more and more relevant in research and development in the medical domain. For all supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Gastroenterological data in particular, which often involve endoscopic videos, are cumbersome to annotate. Domain experts are needed to interpret and annotate the videos. To support those domain experts, we developed a framework. With this framework, instead of annotating every frame in the video sequence, experts perform only key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Subsequently, non-expert annotators supported by machine learning add the missing annotations for the frames in between.
Methods
In our framework, an expert reviews the video and annotates a few video frames to verify the object’s annotations for the non-expert. In a second step, a non-expert has visual confirmation of the given object and can annotate all following and preceding frames with AI assistance. After the expert has finished, relevant frames are selected and passed on to an AI model. This information allows the AI model to detect and mark the desired object on all following and preceding frames with an annotation. The non-expert can then adjust and modify the AI predictions and export the results, which can in turn be used to train the AI model.
Results
Using this framework, we were able to reduce the workload of domain experts by a factor of 20 on average on our data. This is primarily due to the structure of the framework, which is designed to minimize the workload of the domain expert. Pairing this framework with a state-of-the-art semi-automated AI model enhances the annotation speed further. Through a prospective study with 10 participants, we show that semi-automated annotation using our tool doubles the annotation speed of non-expert annotators compared to a well-known state-of-the-art annotation tool.
Conclusion
In summary, we introduce a framework for fast expert annotation for gastroenterologists, which reduces the workload of the domain expert considerably while maintaining a very high annotation quality. The framework incorporates a semi-automated annotation system utilizing trained object detection models. The software and framework are open-source.
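The division of labor described above, in which experts annotate key frames and the frames in between are filled in automatically for a non-expert to correct, can be illustrated with a deliberately simple stand-in. The sketch below uses linear interpolation of bounding boxes between two key frames in place of the trained object detection model; it is not the framework's method, and the function name and box format (x, y, width, height) are assumptions.

```python
def interpolate_boxes(start_frame, start_box, end_frame, end_box):
    """Propose boxes for the frames strictly between two key frames.

    Boxes are (x, y, width, height) tuples. Linear interpolation is a
    simple stand-in for the AI model, used only for illustration.
    """
    span = end_frame - start_frame
    proposals = {}
    for frame in range(start_frame + 1, end_frame):
        t = (frame - start_frame) / span  # 0 at start key frame, 1 at end
        proposals[frame] = tuple(
            a + t * (b - a) for a, b in zip(start_box, end_box)
        )
    return proposals

# Expert key annotations on frames 10 and 14 (made-up values);
# frames 11 to 13 receive proposals for a non-expert to adjust.
proposals = interpolate_boxes(10, (100, 100, 50, 50), 14, (140, 120, 50, 50))
```

In this toy case the expert touches 2 of 5 frames; the reported factor-of-20 reduction refers to the authors' data, not to this example.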
Towards LoRaWAN without data loss: studying the performance of different channel access approaches
(2022)
The Long Range Wide Area Network (LoRaWAN) is one of the fastest growing Internet of Things (IoT) access protocols. It operates in the license-free 868 MHz band and gives everyone the possibility to create their own small sensor networks. The drawback of this technology is its often unscheduled or random channel access, which leads to message collisions and potential data loss. For that reason, recent literature studies alternative approaches for LoRaWAN channel access. In this work, state-of-the-art random channel access is compared with alternative approaches from the literature by means of collision probability. Furthermore, a time-scheduled channel access methodology is presented to completely avoid collisions in LoRaWAN. For this approach, an exhaustive simulation study was conducted and the performance was evaluated with random access cross-traffic. In a general theoretical analysis, the limits of the time-scheduled approach are discussed with respect to compliance with duty cycle regulations in LoRaWAN.
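For intuition about collision probability and duty cycle limits, the sketch below uses the textbook pure-ALOHA model rather than the paper's own analysis: with Poisson frame arrivals from other devices, a frame survives only if no other transmission starts within two airtimes around it. The 1% duty cycle default, the airtime values, and the function names are assumptions made for illustration.

```python
import math

def aloha_collision_probability(frames_per_second, airtime_s):
    """Collision probability for one frame under pure ALOHA.

    Assumes Poisson arrivals of competing frames on the same channel and
    spreading factor; the vulnerable window is twice the airtime.
    """
    return 1.0 - math.exp(-2.0 * frames_per_second * airtime_s)

def max_frames_per_hour(airtime_s, duty_cycle=0.01):
    """Upper bound on frames per hour for one device under a duty cycle."""
    return math.floor(3600.0 * duty_cycle / airtime_s)

# Made-up load: 2 competing frames per second, 0.5 s airtime each
p_collision = aloha_collision_probability(2.0, 0.5)
hourly_budget = max_frames_per_hour(0.5)
```

A time-scheduled scheme avoids the first effect by assigning non-overlapping slots, but the second bound still applies to every device, which is why scheduling has to be checked against duty cycle regulations.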
Presence is often considered the most important quale describing the subjective feeling of being in a computer-generated and/or computer-mediated virtual environment. The identification and separation of orthogonal presence components, i.e., the place illusion and the plausibility illusion, has been an accepted theoretical model describing Virtual Reality (VR) experiences for some time. This perspective article challenges this presence-oriented VR theory. First, we argue that a place illusion cannot be the major construct to describe the much wider scope of virtual, augmented, and mixed reality (VR, AR, MR: or XR for short). Second, we argue that there is no plausibility illusion but merely plausibility, and we derive the place illusion as being caused by the congruent and plausible generation of spatial cues, and similarly for all of the current model’s so-defined illusions. Finally, we propose that congruence and plausibility become the central essential conditions in a novel theoretical model describing XR experiences and effects.
Crowdsensing offers a cost-effective way to collect large amounts of environmental sensor data; however, the spatial distribution of crowdsensing sensors can hardly be influenced, as the participants carry the sensors, and the quality of the crowdsensed data can vary significantly. Hybrid systems that use mobile users in conjunction with fixed sensors might help to overcome these limitations, as such systems allow assessing the quality of the submitted crowdsensed data and provide sensor values where no crowdsensing data are typically available. In this work, we first used a simulation study to analyze a simple crowdsensing system concerning the detection performance of spatial events to highlight the potential and limitations of a pure crowdsourcing system. The results indicate that even if only a small share of inhabitants participate in crowdsensing, events whose locations correlate with the population density can be detected easily and quickly using such a system. In contrast, events with uniformly randomly distributed locations are much harder to detect using a simple crowdsensing-based approach. A second evaluation shows that hybrid systems improve the detection probability and time. Finally, we illustrate how to compute the minimum number of fixed sensors for the given detection time thresholds in our exemplary scenario.
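The relationship between a detection time threshold and the number of fixed sensors can be illustrated with a toy geometric model. This is not the paper's scenario or computation. It assumes a square area, sensors placed at the centers of a square grid, and an event that spreads radially at constant speed until it reaches the nearest sensor; all names and numbers are hypothetical.

```python
import math

def min_fixed_sensors(side_m, spread_speed_mps, max_detection_time_s):
    """Smallest square grid of sensors that meets a detection deadline.

    With sensors at the centers of square cells of edge s, the farthest
    point from any sensor lies s / sqrt(2) away (a cell corner). The
    event must cover that distance within the deadline, which gives
    s <= sqrt(2) * speed * deadline.
    """
    max_spacing = math.sqrt(2.0) * spread_speed_mps * max_detection_time_s
    cells_per_side = math.ceil(side_m / max_spacing)
    return cells_per_side ** 2

# Made-up scenario: 1 km x 1 km area, event spreads at 1 m/s,
# detection required within 100 s
sensor_count = min_fixed_sensors(1000.0, 1.0, 100.0)
```

Relaxing the deadline enlarges the admissible cell size and lowers the count; in a hybrid system, mobile participants would reduce it further in densely populated regions.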