004 Datenverarbeitung; Informatik
Refine
Has Fulltext
- yes (76)
Is part of the Bibliography
- yes (76)
Year of publication
Document Type
- Journal article (76) (remove)
Keywords
- virtual reality (11)
- machine learning (4)
- augmented reality (3)
- immersion (3)
- Deep learning (2)
- Quadrocopter (2)
- Quadrotor (2)
- XR (2)
- artificial intelligence (2)
- education (2)
- exposure (2)
- fully convolutional neural networks (2)
- historical document analysis (2)
- human-computer interaction (2)
- navigation (2)
- neural networks (2)
- ontology (2)
- prediction (2)
- self-aware computing (2)
- virtual environments (2)
- 3D-reconstruction methods (1)
- 3DTK toolkit (1)
- 4D-GIS (1)
- AVA (1)
- Aufwandsanalyse (1)
- Automatisierte Prüfungskorrektur (1)
- Autonomous UAV (1)
- Brüder Grimm Privatbibliothek (1)
- CLIP (1)
- Convolutional Neural Network (1)
- Cost Analysis (1)
- DNA storage (1)
- EPM (1)
- Educational Measurement (I2.399) (1)
- Entscheidungsfindung (1)
- Erkennung handschriftlicher Artefakte (1)
- Ethik (1)
- Forces (1)
- GNSS/INS integrated navigation (1)
- Grimm brothers personal library (1)
- HMD (Head-Mounted Display) (1)
- HTTP adaptive video streaming (1)
- INS/LIDAR integrated navigation (1)
- IT security (1)
- Image Aesthetic Assessment (1)
- Informatik (1)
- Intelligent Virtual Agents (1)
- InteractionSuitcase (1)
- Internet of Things (1)
- IoT (1)
- Kerneldensity estimation (1)
- Klima (1)
- Künstliche Intelligenz (1)
- LoRaWAN (1)
- Modell (1)
- Multiple-Choice Examination (1)
- Multiple-Choice Prüfungen (1)
- NP-hardness (1)
- Neuronales Netz (1)
- Optical Flow (1)
- Poisson surface reconstruction (1)
- RGB-D (1)
- Robotics (1)
- Self-Evaluation Programs (I2.399.780) (1)
- Structure-from-Motion (1)
- Terramechanics (1)
- Torque (1)
- WhatsApp (1)
- Wheel (1)
- XR-artificial intelligence combination (1)
- XR-artificial intelligence continuum (1)
- YouTube (1)
- acrophobia (1)
- adaptation models (1)
- adult learning (1)
- aerodynamics (1)
- agents (1)
- annotation (1)
- anomaly detection (1)
- anomaly prediction (1)
- ant-colony optimization (1)
- anthropomorphism (1)
- anxiety (1)
- application design (1)
- approximation algorithm (1)
- arithmetic calculations (1)
- automation (1)
- autonomous (1)
- autonomous UAV (1)
- availability (1)
- avatar embodiment (1)
- avatars (1)
- background knowledge (1)
- baseline detection (1)
- behavior perception (1)
- binary tanglegram (1)
- biosignals (1)
- camera orientation (1)
- carbon (1)
- certifying algorithm (1)
- chain cover (1)
- channel management (1)
- climate (1)
- co-authorships (1)
- co-inventorships (1)
- coherence (1)
- collaboration (1)
- collision (1)
- communication models (1)
- communication networks (1)
- congruence (1)
- content-based image retrieval (1)
- continuous-time SLAM (1)
- convex bipartite graph (1)
- convolutional neural network (1)
- cost-sensitive learning (1)
- crossing minimization (1)
- crowdsensing (1)
- crowdsourced measurements (1)
- cultural and media studies (1)
- culturally aware (1)
- data warehouse (1)
- decision-making (1)
- deep learning (1)
- deformation-based method (1)
- design (1)
- design cycle (1)
- detection time simulation (1)
- dimensions of proximity (1)
- distributed control (1)
- dynamic programming (1)
- eHealth (1)
- educational tool (1)
- electronic health records (1)
- elevated plus-maze (1)
- embedding techniques (1)
- emotions (1)
- encryption (1)
- endoscopy (1)
- endurance (1)
- event detection (1)
- exercise intensity (1)
- experience (1)
- experimental evaluation (1)
- extended reality (XR) (1)
- failure prediction (1)
- fault detection (1)
- feature matching (1)
- fixed-parameter tractability (1)
- food quality (1)
- foreign language learning and teaching (1)
- formation flight (1)
- fruit temperature (1)
- games (1)
- gamification (1)
- gastroenterology (1)
- genetic algorithm (1)
- graph algorithm (1)
- group-based communication (1)
- handwriting (1)
- handwritten artefact recognition (1)
- hierarchy (1)
- historical images (1)
- hospital data (1)
- human body weight (1)
- human computer interaction (HCI) (1)
- human-artificial intelligence interaction (1)
- human-artificial intelligence interface (1)
- human-centered design (1)
- human-centered, human-robot (1)
- human–computer interaction (1)
- illusion of self-motion (1)
- image processing (1)
- imbalanced regression (1)
- immersive classroom (1)
- immersive classroom management (1)
- immersive technologies (1)
- implicit association test (1)
- induced matching (1)
- informal education (1)
- information extraction (1)
- information systems and information technology (1)
- intelligent transportation systems (1)
- intelligent vehicles (1)
- intelligent virtual agents (1)
- intelligent voice assistant (1)
- intercultural learning and teaching (1)
- interdisciplinary education (1)
- internet traffic (1)
- invasive vascular interventions (1)
- iowa gambling task (1)
- key-insight extraction (1)
- kinect (1)
- language-image pre-training (1)
- layout recognition (1)
- learning environments (1)
- light-gated proteins (1)
- logistics (1)
- long-term analysis (1)
- map projections (1)
- mapping (1)
- mathematical model (1)
- measurements (1)
- media analysis (1)
- medical records (1)
- medieval manuscripts (1)
- meditation (1)
- mindfulness (1)
- misconceptions (1)
- mixed reality (1)
- mixed-cultural (1)
- mixed-cultural settings (1)
- mobile instant messaging (1)
- mobile messaging application (1)
- mobile networks (1)
- mobile streaming (1)
- model following (1)
- model output statistics (1)
- model-based diagnosis (1)
- multimodal fusion (1)
- multimodal interface (1)
- multiple myeloma (1)
- multirotors (1)
- multiscale encoder (1)
- nano-satellite (1)
- nanocellulose (1)
- natural interfaces (1)
- natural language processing (1)
- neume notation (1)
- neural architecture (1)
- non-native accent (1)
- object detection (1)
- octree (1)
- optical music recognition (1)
- passage of time (1)
- passive haptic feedback (1)
- perception (1)
- performance (1)
- performance analysis (1)
- performance prediction (1)
- place-illusion (1)
- plausibility (1)
- plausibility-illusion (1)
- point cloud (1)
- point cloud compression (1)
- point-to-plane measure (1)
- point-to-point measure (1)
- pollution (1)
- positioning (1)
- precision horticulture (1)
- precision training (1)
- presence (1)
- private chat groups (1)
- procedural fusion methods (1)
- prompt engineering (1)
- protein chip (1)
- psychophyisology (1)
- public speaking (1)
- quadcopter (1)
- quadcopters (1)
- quality assurance (1)
- quality evaluation (1)
- quality of experience (1)
- quality of experience prediction (1)
- radiology (1)
- ransomware (1)
- real world evidence (1)
- real-world application (1)
- realism (1)
- recommender system (1)
- regelbasierte Nachbearbeitung (1)
- research methods (1)
- rich vehicle routing problem (1)
- robustness (1)
- rotors (1)
- rule based post processing (1)
- sample weighting (1)
- satisfiability problems (1)
- scalability (1)
- scalable quadcopter (1)
- scheduling (1)
- science, technology and society (1)
- secure group communication (1)
- segmentation (1)
- self-adaptive (1)
- self-adaptive systems (1)
- self-aware computing systems (1)
- self-managing systems (1)
- semantic fusion (1)
- sensor fusion (1)
- sentinel (1)
- serious games (1)
- sesnsors (1)
- simulation system (1)
- single-electron transistors (1)
- sketching (1)
- smart speaker (1)
- social VR (1)
- social interaction (1)
- social relationship (1)
- social robot (1)
- social robotics (1)
- social role (1)
- socially interactive agents (1)
- spatial presence (1)
- statistical validity (1)
- statistics and numerical data (1)
- stereotypes (1)
- stroke (1)
- student simulation (1)
- stylus (1)
- sunburn (1)
- supervised learning (1)
- surface model (1)
- survey (1)
- switching navigation (1)
- systematic literature review (1)
- systematic review (1)
- table extraction (1)
- table understanding (1)
- taxonomy (1)
- teacher education (1)
- technology-supported learning (1)
- text line detection (1)
- text supervision (1)
- theory (1)
- therapeutic application (1)
- thermal camera (1)
- thermal point cloud (1)
- time calibration (1)
- time perception (1)
- tools (1)
- trait anxiety (1)
- translational neuroscience (1)
- transport microenvironments (1)
- transportation (1)
- unmanned aerial vehicle (1)
- unmanned aerial vehicles (1)
- usability evaluation (1)
- use cases (1)
- user experience (1)
- user interaction (1)
- user study (1)
- vection (1)
- vehicle dynamics (1)
- vehicular navigation (1)
- verbal behaviour (1)
- virtual agent (1)
- virtual agent interaction (1)
- virtual humans (1)
- virtual reality training (1)
- virtual stimuli (1)
- virtual tunnel (1)
- virtual-reality-continuum (1)
- waypoint parameter (1)
- wearable (1)
Institute
- Institut für Informatik (76) (remove)
Scalability is often mentioned in literature, but a stringent definition is missing. In particular, there is no general scalability assessment which clearly indicates whether a system scales or not or whether a system scales better than another. The key contribution of this article is the definition of a scalability index (SI) which quantifies if a system scales in comparison to another system, a hypothetical system, e.g., linear system, or the theoretically optimal system. The suggested SI generalizes different metrics from literature, which are specialized cases of our SI. The primary target of our scalability framework is, however, benchmarking of two systems, which does not require any reference system. The SI is demonstrated and evaluated for different use cases, that are (1) the performance of an IoT load balancer depending on the system load, (2) the availability of a communication system depending on the size and structure of the network, (3) scalability comparison of different location selection mechanisms in fog computing with respect to delays and energy consumption; (4) comparison of time-sensitive networking (TSN) mechanisms in terms of efficiency and utilization. Finally, we discuss how to use and how not to use the SI and give recommendations and guidelines in practice. To the best of our knowledge, this is the first work which provides a general SI for the comparison and benchmarking of systems, which is the primary target of our scalability analysis.
In recent history, normalized digital surface models (nDSMs) have been constantly gaining importance as a means to solve large-scale geographic problems. High-resolution surface models are precious, as they can provide detailed information for a specific area. However, measurements with a high resolution are time consuming and costly. Only a few approaches exist to create high-resolution nDSMs for extensive areas. This article explores approaches to extract high-resolution nDSMs from low-resolution Sentinel-2 data, allowing us to derive large-scale models. We thereby utilize the advantages of Sentinel 2 being open access, having global coverage, and providing steady updates through a high repetition rate. Several deep learning models are trained to overcome the gap in producing high-resolution surface maps from low-resolution input data. With U-Net as a base architecture, we extend the capabilities of our model by integrating tailored multiscale encoders with differently sized kernels in the convolution as well as conformed self-attention inside the skip connection gates. Using pixelwise regression, our U-Net base models can achieve a mean height error of approximately 2 m. Moreover, through our enhancements to the model architecture, we reduce the model error by more than 7%.
Background: Due to the importance of radiologic examinations, such as X-rays or computed tomography scans, for many clinical diagnoses, the optimal use of the radiology department is 1 of the primary goals of many hospitals.
Objective: This study aims to calculate the key metrics of this use by creating a radiology data warehouse solution, where data from radiology information systems (RISs) can be imported and then queried using a query language as well as a graphical user interface (GUI).
Methods: Using a simple configuration file, the developed system allowed for the processing of radiology data exported from any kind of RIS into a Microsoft Excel, comma-separated value (CSV), or JavaScript Object Notation (JSON) file. These data were then imported into a clinical data warehouse. Additional values based on the radiology data were calculated during this import process by implementing 1 of several provided interfaces. Afterward, the query language and GUI of the data warehouse were used to configure and calculate reports on these data. For the most common types of requested reports, a web interface was created to view their numbers as graphics.
Results: The tool was successfully tested with the data of 4 different German hospitals from 2018 to 2021, with a total of 1,436,111 examinations. The user feedback was good, since all their queries could be answered if the available data were sufficient. The initial processing of the radiology data for using them with the clinical data warehouse took (depending on the amount of data provided by each hospital) between 7 minutes and 1 hour 11 minutes. Calculating 3 reports of different complexities on the data of each hospital was possible in 1-3 seconds for reports with up to 200 individual calculations and in up to 1.5 minutes for reports with up to 8200 individual calculations.
Conclusions: A system was developed with the main advantage of being generic concerning the export of different RISs as well as concerning the configuration of queries for various reports. The queries could be configured easily using the GUI of the data warehouse, and their results could be exported into the standard formats Excel and CSV for further processing.
Group-based communication is a highly popular communication paradigm, which is especially prominent in mobile instant messaging (MIM) applications, such as WhatsApp. Chat groups in MIM applications facilitate the sharing of various types of messages (e.g., text, voice, image, video) among a large number of participants. As each message has to be transmitted to every other member of the group, which multiplies the traffic, this has a massive impact on the underlying communication networks. However, most chat groups are private and network operators cannot obtain deep insights into MIM communication via network measurements due to end-to-end encryption. Thus, the generation of traffic is not well understood, given that it depends on sizes of communication groups, speed of communication, and exchanged message types. In this work, we provide a huge data set of 5,956 private WhatsApp chat histories, which contains over 76 million messages from more than 117,000 users. We describe and model the properties of chat groups and users, and the communication within these chat groups, which gives unprecedented insights into private MIM communication. In addition, we conduct exemplary measurements for the most popular message types, which empower the provided models to estimate the traffic over time in a chat group.
Knowledge about ransomware is important for protecting sensitive data and for participating in public debates about suitable regulation regarding its security. However, as of now, this topic has received little to no attention in most school curricula. As such, it is desirable to analyze what citizens can learn about this topic outside of formal education, e.g., from news articles. This analysis is both relevant to analyzing the public discourse about ransomware, as well as to identify what aspects of this topic should be included in the limited time available for this topic in formal education. Thus, this paper was motivated both by educational and media research. The central goal is to explore how the media reports on this topic and, additionally, to identify potential misconceptions that could stem from this reporting. To do so, we conducted an exploratory case study into the reporting of 109 media articles regarding a high-impact ransomware event: the shutdown of the Colonial Pipeline (located in the east of the USA). We analyzed how the articles introduced central terminology, what details were provided, what details were not, and what (mis-)conceptions readers might receive from them. Our results show that an introduction of the terminology and technical concepts of security is insufficient for a complete understanding of the incident. Most importantly, the articles may lead to four misconceptions about ransomware that are likely to lead to misleading conclusions about the responsibility for the incident and possible political and technical options to prevent such attacks in the future.
Social patterns and roles can develop when users talk to intelligent voice assistants (IVAs) daily. The current study investigates whether users assign different roles to devices and how this affects their usage behavior, user experience, and social perceptions. Since social roles take time to establish, we equipped 106 participants with Alexa or Google assistants and some smart home devices and observed their interactions for nine months. We analyzed diverse subjective (questionnaire) and objective data (interaction data). By combining social science and data science analyses, we identified two distinct clusters—users who assigned a friendship role to IVAs over time and users who did not. Interestingly, these clusters exhibited significant differences in their usage behavior, user experience, and social perceptions of the devices. For example, participants who assigned a role to IVAs attributed more friendship to them used them more frequently, reported more enjoyment during interactions, and perceived more empathy for IVAs. In addition, these users had distinct personal requirements, for example, they reported more loneliness. This study provides valuable insights into the role-specific effects and consequences of voice assistants. Recent developments in conversational language models such as ChatGPT suggest that the findings of this study could make an important contribution to the design of dialogic human–AI interactions.
Digitization and transcription of historic documents offer new research opportunities for humanists and are the topics of many edition projects. However, manual work is still required for the main phases of layout recognition and the subsequent optical character recognition (OCR) of early printed documents. This paper describes and evaluates how deep learning approaches recognize text lines and can be extended to layout recognition using background knowledge. The evaluation was performed on five corpora of early prints from the 15th and 16th Centuries, representing a variety of layout features. While the main text with standard layouts could be recognized in the correct reading order with a precision and recall of up to 99.9%, also complex layouts were recognized at a rate as high as 90% by using background knowledge, the full potential of which was revealed if many pages of the same source were transcribed.
The ongoing digitization of historical photographs in archives allows investigating the quality, quantity, and distribution of these images. However, the exact interior and exterior camera orientations of these photographs are usually lost during the digitization process. The proposed method uses content-based image retrieval (CBIR) to filter exterior images of single buildings in combination with metadata information. The retrieved photographs are automatically processed in an adapted structure-from-motion (SfM) pipeline to determine the camera parameters. In an interactive georeferencing process, the calculated camera positions are transferred into a global coordinate system. As all image and camera data are efficiently stored in the proposed 4D database, they can be conveniently accessed afterward to georeference newly digitized images by using photogrammetric triangulation and spatial resection. The results show that the CBIR and the subsequent SfM are robust methods for various kinds of buildings and different quantity of data. The absolute accuracy of the camera positions after georeferencing lies in the range of a few meters likely introduced by the inaccurate LOD2 models used for transformation. The proposed photogrammetric method, the database structure, and the 4D visualization interface enable adding historical urban photographs and 3D models from other locations.
An important but very time consuming part of the research process is literature review. An already large and nevertheless growing ground set of publications as well as a steadily increasing publication rate continue to worsen the situation. Consequently, automating this task as far as possible is desirable. Experimental results of systems are key-insights of high importance during literature review and usually represented in form of tables. Our pipeline KIETA exploits these tables to contribute to the endeavor of automation by extracting them and their contained knowledge from scientific publications. The pipeline is split into multiple steps to guarantee modularity as well as analyzability, and agnosticim regarding the specific scientific domain up until the knowledge extraction step, which is based upon an ontology. Additionally, a dataset of corresponding articles has been manually annotated with information regarding table and knowledge extraction. Experiments show promising results that signal the possibility of an automated system, while also indicating limits of extracting knowledge from tables without any context.
Learning is a central component of human life and essential for personal development. Therefore, utilizing new technologies in the learning context and exploring their combined potential are considered essential to support self-directed learning in a digital age. A learning environment can be expanded by various technical and content-related aspects. Gamification in the form of elements from video games offers a potential concept to support the learning process. This can be supplemented by technology-supported learning. While the use of tablets is already widespread in the learning context, the integration of a social robot can provide new perspectives on the learning process. However, simply adding new technologies such as social robots or gamification to existing systems may not automatically result in a better learning environment. In the present study, game elements as well as a social robot were integrated separately and conjointly into a learning environment for basic Spanish skills, with a follow-up on retained knowledge. This allowed us to investigate the respective and combined effects of both expansions on motivation, engagement and learning effect. This approach should provide insights into the integration of both additions in an adult learning context. We found that the additions of game elements and the robot did not significantly improve learning, engagement or motivation. Based on these results and a literature review, we outline relevant factors for meaningful integration of gamification and social robots in learning environments in adult learning.
Climate models are the tool of choice for scientists researching climate change. Like all models they suffer from errors, particularly systematic and location-specific representation errors. One way to reduce these errors is model output statistics (MOS) where the model output is fitted to observational data with machine learning. In this work, we assess the use of convolutional Deep Learning climate MOS approaches and present the ConvMOS architecture which is specifically designed based on the observation that there are systematic and location-specific errors in the precipitation estimates of climate models. We apply ConvMOS models to the simulated precipitation of the regional climate model REMO, showing that a combination of per-location model parameters for reducing location-specific errors and global model parameters for reducing systematic errors is indeed beneficial for MOS performance. We find that ConvMOS models can reduce errors considerably and perform significantly better than three commonly used MOS approaches and plain ResNet and U-Net models in most cases. Our results show that non-linear MOS models underestimate the number of extreme precipitation events, which we alleviate by training models specialized towards extreme precipitation events with the imbalanced regression method DenseLoss. While we consider climate MOS, we argue that aspects of ConvMOS may also be beneficial in other domains with geospatial data, such as air pollution modeling or weather forecasts.
Die künstliche Intelligenz (KI) entwickelt sich rasant und hat bereits eindrucksvolle Erfolge zu verzeichnen, darunter übermenschliche Kompetenz in den meisten Spielen und vielen Quizshows, intelligente Suchmaschinen, individualisierte Werbung, Spracherkennung, -ausgabe und -übersetzung auf sehr hohem Niveau und hervorragende Leistungen bei der Bildverarbeitung, u. a. in der Medizin, der optischen Zeichenerkennung, beim autonomen Fahren, aber auch beim Erkennen von Menschen auf Bildern und Videos oder bei Deep Fakes für Fotos und Videos. Es ist zu erwarten, dass die KI auch in der Entscheidungsfindung Menschen übertreffen wird; ein alter Traum der Expertensysteme, der durch Lernverfahren, Big Data und Zugang zu dem gesammelten Wissen im Web in greifbare Nähe rückt. Gegenstand dieses Beitrags sind aber weniger die technischen Entwicklungen, sondern mögliche gesellschaftliche Auswirkungen einer spezialisierten, kompetenten KI für verschiedene Bereiche der autonomen, d. h. nicht nur unterstützenden Entscheidungsfindung: als Fußballschiedsrichter, in der Medizin, für richterliche Entscheidungen und sehr spekulativ auch im politischen Bereich. Dabei werden Vor- und Nachteile dieser Szenarien aus gesellschaftlicher Sicht diskutiert.
Crowdsourced network measurements (CNMs) are becoming increasingly popular as they assess the performance of a mobile network from the end user's perspective on a large scale. Here, network measurements are performed directly on the end-users' devices, thus taking advantage of the real-world conditions end-users encounter. However, this type of uncontrolled measurement raises questions about its validity and reliability. The problem lies in the nature of this type of data collection. In CNMs, mobile network subscribers are involved to a large extent in the measurement process, and collect data themselves for the operator. The collection of data on user devices in arbitrary locations and at uncontrolled times requires means to ensure validity and reliability. To address this issue, our paper defines concepts and guidelines for analyzing the precision of CNMs; specifically, the number of measurements required to make valid statements. In addition to the formal definition of the aspect, we illustrate the problem and use an extensive sample data set to show possible assessment approaches. This data set consists of more than 20.4 million crowdsourced mobile measurements from across France, measured by a commercial data provider.
Visual stimuli are frequently used to improve memory, language learning or perception, and understanding of metacognitive processes. However, in virtual reality (VR), there are few systematically and empirically derived databases. This paper proposes the first collection of virtual objects based on empirical evaluation for inter-and transcultural encounters between English- and German-speaking learners. We used explicit and implicit measurement methods to identify cultural associations and the degree of stereotypical perception for each virtual stimuli (n = 293) through two online studies, including native German and English-speaking participants. The analysis resulted in a final well-describable database of 128 objects (called InteractionSuitcase). In future applications, the objects can be used as a great interaction or conversation asset and behavioral measurement tool in social VR applications, especially in the field of foreign language education. For example, encounters can use the objects to describe their culture, or teachers can intuitively assess stereotyped attitudes of the encounters.
Heat and excessive solar radiation can produce abiotic stresses during apple maturation, resulting fruit quality. Therefore, the monitoring of temperature on fruit surface (FST) over the growing period can allow to identify thresholds, above of which several physiological disorders such as sunburn may occur in apple.
The current approaches neglect spatial variation of FST and have reduced repeatability, resulting in unreliable predictions. In this study, LiDAR laser scanning and thermal imaging were employed to detect the temperature on fruit surface by means of 3D point cloud. A process for calibrating the two sensors based on an active board target and producing a 3D thermal point cloud was suggested. After calibration, the sensor system was utilised to scan the fruit trees, while temperature values assigned in the corresponding 3D point cloud were based on the extrinsic calibration. Whereas a fruit detection algorithm was performed to segment the FST from each apple.
• The approach allows the calibration of LiDAR laser scanner with thermal camera in order to produce a 3D thermal point cloud.
• The method can be applied in apple trees for segmenting FST in 3D. Whereas the approach can be utilised to predict several physiological disorders including sunburn on fruit surface.
The strict restrictions introduced by the COVID-19 lockdowns, which started from March 2020, changed people’s daily lives and habits on many different levels. In this work, we investigate the impact of the lockdown on the communication behavior in the mobile instant messaging application WhatsApp. Our evaluations are based on a large dataset of 2577 private chat histories with 25,378,093 messages from 51,973 users. The analysis of the one-to-one and group conversations confirms that the lockdown severely altered the communication in WhatsApp chats compared to pre-pandemic time ranges. In particular, we observe short-term effects, which caused an increased message frequency in the first lockdown months and a shifted communication activity during the day in March and April 2020. Moreover, we also see long-term effects of the ongoing pandemic situation until February 2021, which indicate a change of communication behavior towards more regular messaging, as well as a persisting change in activity during the day. The results of our work show that even anonymized chat histories can tell us a lot about people’s behavior and especially behavioral changes during the COVID-19 pandemic and thus are of great relevance for behavioral researchers. Furthermore, looking at the pandemic from an Internet provider perspective, these insights can be used during the next pandemic, or if the current COVID-19 situation worsens, to adapt communication networks to the changed usage behavior early on and thus avoid network congestion.
This paper examines the relationship between time and motion perception in virtual environments. Previous work has shown that the perception of motion can affect the perception of time. We developed a virtual environment that simulates motion in a tunnel and measured its effects on the estimation of the duration of time, the speed at which perceived time passes, and the illusion of self-motion, also known as vection. When large areas of the visual field move in the same direction, vection can occur; observers often perceive this as self-motion rather than motion of the environment. To generate different levels of vection and investigate its effects on time perception, we developed an abstract procedural tunnel generator. The generator can simulate different speeds and densities of tunnel sections (visibly distinguishable sections that form the virtual tunnel), as well as the degree of embodiment of the user avatar (with or without virtual hands). We exposed participants to various tunnel simulations with different durations, speeds, and densities in a remote desktop and a virtual reality (VR) laboratory study. Time passed subjectively faster under high-speed and high-density conditions in both studies. The experience of self-motion was also stronger under high-speed and high-density conditions. Both studies revealed a significant correlation between the perceived passage of time and perceived self-motion. Subjects in the virtual reality study reported a stronger self-motion experience, a faster perceived passage of time, and shorter time estimates than subjects in the desktop study. Our results suggest that a virtual tunnel simulation can manipulate time perception in virtual reality. We will explore these results for the development of virtual reality applications for therapeutic approaches in our future work. This could be particularly useful in treating disorders like depression, autism, and schizophrenia, which are known to be associated with distortions in time perception. For example, the tunnel could be therapeutically applied by resetting patients’ time perceptions by exposing them to the tunnel under different conditions, such as increasing or decreasing perceived time.
Despite the fact that mixed-cultural backgrounds become of increasing importance in our daily life, the representation of multiple cultural backgrounds in one entity is still rare in socially interactive agents (SIAs). This paper’s contribution is twofold. First, it provides a survey of research on mixed-cultured SIAs. Second, it presents a study investigating how mixed-cultural speech (in this case, non-native accent) influences how a virtual robot is perceived in terms of personality, warmth, competence and credibility. Participants with English or German respectively as their first language watched a video of a virtual robot speaking in either standard English or German-accented English. It was expected that the German-accented speech would be rated more positively by native German participants as well as elicit the German stereotypes credibility and conscientiousness for both German and English participants. Contrary to the expectations, German participants rated the virtual robot lower in terms of competence and credibility when it spoke with a German accent, whereas English participants perceived the virtual robot with a German accent as more credible compared to the version without an accent. Both the native English and native German listeners classified the virtual robot with a German accent as significantly more neurotic than the virtual robot speaking standard English. This work shows that by solely implementing a non-native accent in a virtual robot, stereotypes are partly transferred. It also shows that the implementation of a non-native accent leads to differences in the perception of the virtual robot.
CLIP knows image aesthetics
(2022)
Most Image Aesthetic Assessment (IAA) methods use a pretrained ImageNet classification model as a base to fine-tune. We hypothesize that content classification is not an optimal pretraining task for IAA, since the task discourages the extraction of features that are useful for IAA, e.g., composition, lighting, or style. On the other hand, we argue that the Contrastive Language-Image Pretraining (CLIP) model is a better base for IAA models, since it has been trained using natural language supervision. Due to the rich nature of language, CLIP needs to learn a broad range of image features that correlate with sentences describing the image content, composition, environments, and even subjective feelings about the image. While it has been shown that CLIP extracts features useful for content classification tasks, its suitability for tasks that require the extraction of style-based features like IAA has not yet been shown. We test our hypothesis by conducting a three-step study, investigating the usefulness of features extracted by CLIP compared to features obtained from the last layer of a comparable ImageNet classification model. In each step, we get more computationally expensive. First, we engineer natural language prompts that let CLIP assess an image's aesthetic without adjusting any weights in the model. To overcome the challenge that CLIP's prompting only is applicable to classification tasks, we propose a simple but effective strategy to convert multiple prompts to a continuous scalar as required when predicting an image's mean aesthetic score. Second, we train a linear regression on the AVA dataset using image features obtained by CLIP's image encoder. The resulting model outperforms a linear regression trained on features from an ImageNet classification model. It also shows competitive performance with fully fine-tuned networks based on ImageNet, while only training a single layer. Finally, by fine-tuning CLIP's image encoder on the AVA dataset, we show that CLIP only needs a fraction of training epochs to converge, while also performing better than a fine-tuned ImageNet model. Overall, our experiments suggest that CLIP is better suited as a base model for IAA methods than ImageNet pretrained networks.
This article presents a novel method for controlling a virtual audience system (VAS) in Virtual Reality (VR) application, called STAGE, which has been originally designed for supervised public speaking training in university seminars dedicated to the preparation and delivery of scientific talks. We are interested in creating pedagogical narratives: narratives encompass affective phenomenon and rather than organizing events changing the course of a training scenario, pedagogical plans using our system focus on organizing the affects it arouses for the trainees. Efficiently controlling a virtual audience towards a specific training objective while evaluating the speaker’s performance presents a challenge for a seminar instructor: the high level of cognitive and physical demands required to be able to control the virtual audience, whilst evaluating speaker’s performance, adjusting and allowing it to quickly react to the user’s behaviors and interactions. It is indeed a critical limitation of a number of existing systems that they rely on a Wizard of Oz approach, where the tutor drives the audience in reaction to the user’s performance. We address this problem by integrating with a VAS a high-level control component for tutors, which allows using predefined audience behavior rules, defining custom ones, as well as intervening during run-time for finer control of the unfolding of the pedagogical plan. At its core, this component offers a tool to program, select, modify and monitor interactive training narratives using a high-level representation. The STAGE offers the following features: i) a high-level API to program pedagogical narratives focusing on a specific public speaking situation and training objectives, ii) an interactive visualization interface iii) computation and visualization of user metrics, iv) a semi-autonomous virtual audience composed of virtual spectators with automatic reactions to the speaker and surrounding spectators while following the pedagogical plan V) and the possibility for the instructor to embody a virtual spectator to ask questions or guide the speaker from within the Virtual Environment. We present here the design, and implementation of the tutoring system and its integration in STAGE, and discuss its reception by end-users.