004 Datenverarbeitung; Informatik
Refine
Has Fulltext
- yes (121)
Is part of the Bibliography
- yes (121)
Year of publication
Document Type
- Journal article (121) (remove)
Language
- English (121) (remove)
Keywords
- virtual reality (15)
- augmented reality (4)
- human-computer interaction (4)
- machine learning (4)
- crowdsensing (3)
- database (3)
- immersion (3)
- mHealth (3)
- neural networks (3)
- resistance (3)
Institute
- Institut für Informatik (72)
- Theodor-Boveri-Institut für Biowissenschaften (27)
- Institut Mensch - Computer - Medien (15)
- Institut für Klinische Epidemiologie und Biometrie (7)
- Center for Computational and Theoretical Biology (4)
- Institut für Funktionsmaterialien und Biofabrikation (2)
- Institut für Pharmazie und Lebensmittelchemie (2)
- Institut für Psychologie (2)
- Medizinische Klinik und Poliklinik II (2)
- Deutsches Zentrum für Herzinsuffizienz (DZHI) (1)
Sonstige beteiligte Institutionen
Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single genes classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single genes classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single genes classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single genes sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single genes classifiers for predicting outcome in breast cancer.
Background: Current imaging methods such as Magnetic Resonance Imaging (MRI), Confocal microscopy, Electron Microscopy (EM) or Selective Plane Illumination Microscopy (SPIM) yield three-dimensional (3D) data sets in need of appropriate computational methods for their analysis. The reconstruction, segmentation and registration are best approached from the 3D representation of the data set. Results: Here we present a platform-independent framework based on Java and Java 3D for accelerated rendering of biological images. Our framework is seamlessly integrated into ImageJ, a free image processing package with a vast collection of community-developed biological image analysis tools. Our framework enriches the ImageJ software libraries with methods that greatly reduce the complexity of developing image analysis tools in an interactive 3D visualization environment. In particular, we provide high-level access to volume rendering, volume editing, surface extraction, and image annotation. The ability to rely on a library that removes the low-level details enables concentrating software development efforts on the algorithm implementation parts. Conclusions: Our framework enables biomedical image software development to be built with 3D visualization capabilities with very little effort. We offer the source code and convenient binary packages along with extensive documentation at http://3dviewer.neurofly.de.
A key feature for Internet of Things (IoT) is to control what content is available to each user. To handle this access management, encryption schemes can be used. Due to the diverse usage of encryption schemes, there are various realizations of 1-to-1, 1-to-n, and n-to-n schemes in the literature. This multitude of encryption methods with a wide variety of properties presents developers with the challenge of selecting the optimal method for a particular use case, which is further complicated by the fact that there is no overview of existing encryption schemes. To fill this gap, we envision a cryptography encyclopedia providing such an overview of existing encryption schemes. In this survey paper, we take a first step towards such an encyclopedia by creating a sub-encyclopedia for secure group communication (SGC) schemes, which belong to the n-to-n category. We extensively surveyed the state-of-the-art and classified 47 different schemes. More precisely, we provide (i) a comprehensive overview of the relevant security features, (ii) a set of relevant performance metrics, (iii) a classification for secure group communication schemes, and (iv) workflow descriptions of the 47 schemes. Moreover, we perform a detailed performance and security evaluation of the 47 secure group communication schemes. Based on this evaluation, we create a guideline for the selection of secure group communication schemes.
This study provides a systematic literature review of research (2001–2020) in the field of teaching and learning a foreign language and intercultural learning using immersive technologies. Based on 2507 sources, 54 articles were selected according to a predefined selection criteria. The review is aimed at providing information about which immersive interventions are being used for foreign language learning and teaching and where potential research gaps exist. The papers were analyzed and coded according to the following categories: (1) investigation form and education level, (2) degree of immersion, and technology used, (3) predictors, and (4) criterions. The review identified key research findings relating the use of immersive technologies for learning and teaching a foreign language and intercultural learning at cognitive, affective, and conative levels. The findings revealed research gaps in the area of teachers as a target group, and virtual reality (VR) as a fully immersive intervention form. Furthermore, the studies reviewed rarely examined behavior, and implicit measurements related to inter- and trans-cultural learning and teaching. Inter- and transcultural learning and teaching especially is an underrepresented investigation subject. Finally, concrete suggestions for future research are given. The systematic review contributes to the challenge of interdisciplinary cooperation between pedagogy, foreign language didactics, and Human-Computer Interaction to achieve innovative teaching-learning formats and a successful digital transformation.
Measurements of physiological parameters provide an objective, often non-intrusive, and (at least semi-)automatic evaluation and utilization of user behavior. In addition, specific hardware devices of Virtual Reality (VR) often ship with built-in sensors, i.e. eye-tracking and movements sensors. Hence, the combination of physiological measurements and VR applications seems promising. Several approaches have investigated the applicability and benefits of this combination for various fields of applications. However, the range of possible application fields, coupled with potentially useful and beneficial physiological parameters, types of sensor, target variables and factors, and analysis approaches and techniques is manifold. This article provides a systematic overview and an extensive state-of-the-art review of the usage of physiological measurements in VR. We identified 1,119 works that make use of physiological measurements in VR. Within these, we identified 32 approaches that focus on the classification of characteristics of experience, common in VR applications. The first part of this review categorizes the 1,119 works by field of application, i.e. therapy, training, entertainment, and communication and interaction, as well as by the specific target factors and variables measured by the physiological parameters. An additional category summarizes general VR approaches applicable to all specific fields of application since they target typical VR qualities. In the second part of this review, we analyze the target factors and variables regarding the respective methods used for an automatic analysis and, potentially, classification. For example, we highlight which measurement setups have been proven to be sensitive enough to distinguish different levels of arousal, valence, anxiety, stress, or cognitive workload in the virtual realm. This work may prove useful for all researchers wanting to use physiological data in VR and who want to have a good overview of prior approaches taken, their benefits and potential drawbacks.
Failure prediction is an important aspect of self-aware computing systems. Therefore, a multitude of different approaches has been proposed in the literature over the past few years. In this work, we propose a taxonomy for organizing works focusing on the prediction of Service Level Objective (SLO) failures. Our taxonomy classifies related work along the dimensions of the prediction target (e.g., anomaly detection, performance prediction, or failure prediction), the time horizon (e.g., detection or prediction, online or offline application), and the applied modeling type (e.g., time series forecasting, machine learning, or queueing theory). The classification is derived based on a systematic mapping of relevant papers in the area. Additionally, we give an overview of different techniques in each sub-group and address remaining challenges in order to guide future research.
Realistic and lifelike 3D-reconstruction of virtual humans has various exciting and important use cases. Our and others’ appearances have notable effects on ourselves and our interaction partners in virtual environments, e.g., on acceptance, preference, trust, believability, behavior (the Proteus effect), and more. Today, multiple approaches for the 3D-reconstruction of virtual humans exist. They significantly vary in terms of the degree of achievable realism, the technical complexities, and finally, the overall reconstruction costs involved. This article compares two 3D-reconstruction approaches with very different hardware requirements. The high-cost solution uses a typical complex and elaborated camera rig consisting of 94 digital single-lens reflex (DSLR) cameras. The recently developed low-cost solution uses a smartphone camera to create videos that capture multiple views of a person. Both methods use photogrammetric reconstruction and template fitting with the same template model and differ in their adaptation to the method-specific input material. Each method generates high-quality virtual humans ready to be processed, animated, and rendered by standard XR simulation and game engines such as Unreal or Unity. We compare the results of the two 3D-reconstruction methods in an immersive virtual environment against each other in a user study. Our results indicate that the virtual humans from the low-cost approach are perceived similarly to those from the high-cost approach regarding the perceived similarity to the original, human-likeness, beauty, and uncanniness, despite significant differences in the objectively measured quality. The perceived feeling of change of the own body was higher for the low-cost virtual humans. Quality differences were perceived more strongly for one’s own body than for other virtual humans.
Ambalytics: a scalable and distributed system architecture concept for bibliometric network analyses
(2021)
A deep understanding about a field of research is valuable for academic researchers. In addition to technical knowledge, this includes knowledge about subareas, open research questions, and social communities (networks) of individuals and organizations within a given field. With bibliometric analyses, researchers can acquire quantitatively valuable knowledge about a research area by using bibliographic information on academic publications provided by bibliographic data providers. Bibliometric analyses include the calculation of bibliometric networks to describe affiliations or similarities of bibliometric entities (e.g., authors) and group them into clusters representing subareas or communities. Calculating and visualizing bibliometric networks is a nontrivial and time-consuming data science task that requires highly skilled individuals. In addition to domain knowledge, researchers must often provide statistical knowledge and programming skills or use software tools having limited functionality and usability. In this paper, we present the ambalytics bibliometric platform, which reduces the complexity of bibliometric network analysis and the visualization of results. It accompanies users through the process of bibliometric analysis and eliminates the need for individuals to have programming skills and statistical knowledge, while preserving advanced functionality, such as algorithm parameterization, for experts. As a proof-of-concept, and as an example of bibliometric analyses outcomes, the calculation of research fronts networks based on a hybrid similarity approach is shown. Being designed to scale, ambalytics makes use of distributed systems concepts and technologies. It is based on the microservice architecture concept and uses the Kubernetes framework for orchestration. This paper presents the initial building block of a comprehensive bibliometric analysis platform called ambalytics, which aims at a high usability for users as well as scalability.
Heat and excessive solar radiation can produce abiotic stresses during apple maturation, resulting fruit quality. Therefore, the monitoring of temperature on fruit surface (FST) over the growing period can allow to identify thresholds, above of which several physiological disorders such as sunburn may occur in apple.
The current approaches neglect spatial variation of FST and have reduced repeatability, resulting in unreliable predictions. In this study, LiDAR laser scanning and thermal imaging were employed to detect the temperature on fruit surface by means of 3D point cloud. A process for calibrating the two sensors based on an active board target and producing a 3D thermal point cloud was suggested. After calibration, the sensor system was utilised to scan the fruit trees, while temperature values assigned in the corresponding 3D point cloud were based on the extrinsic calibration. Whereas a fruit detection algorithm was performed to segment the FST from each apple.
• The approach allows the calibration of LiDAR laser scanner with thermal camera in order to produce a 3D thermal point cloud.
• The method can be applied in apple trees for segmenting FST in 3D. Whereas the approach can be utilised to predict several physiological disorders including sunburn on fruit surface.
A procedure to control all six DOF (degrees of freedom) of a UAV (unmanned aerial vehicle) without an external reference system and to enable fully autonomous flight is presented here. For 2D positioning the principle of optical flow is used. Together with the output of height estimation, fusing ultrasonic, infrared and inertial and pressure sensor data, the 3D position of the UAV can be computed, controlled and steered. All data processing is done on the UAV. An external computer with a pathway planning interface is for commanding purposes only. The presented system is part of the AQopterI8 project, which aims to develop an autonomous flying quadrocopter for indoor application. The focus of this paper is 2D positioning using an optical flow sensor. As a result of the performed evaluation, it can be concluded that for position hold, the standard deviation of the position error is 10cm and after landing the position error is about 30cm.
Knowledge about ransomware is important for protecting sensitive data and for participating in public debates about suitable regulation regarding its security. However, as of now, this topic has received little to no attention in most school curricula. As such, it is desirable to analyze what citizens can learn about this topic outside of formal education, e.g., from news articles. This analysis is both relevant to analyzing the public discourse about ransomware, as well as to identify what aspects of this topic should be included in the limited time available for this topic in formal education. Thus, this paper was motivated both by educational and media research. The central goal is to explore how the media reports on this topic and, additionally, to identify potential misconceptions that could stem from this reporting. To do so, we conducted an exploratory case study into the reporting of 109 media articles regarding a high-impact ransomware event: the shutdown of the Colonial Pipeline (located in the east of the USA). We analyzed how the articles introduced central terminology, what details were provided, what details were not, and what (mis-)conceptions readers might receive from them. Our results show that an introduction of the terminology and technical concepts of security is insufficient for a complete understanding of the incident. Most importantly, the articles may lead to four misconceptions about ransomware that are likely to lead to misleading conclusions about the responsibility for the incident and possible political and technical options to prevent such attacks in the future.
BACKGROUND: In the face of growing resistance in malaria parasites to drugs, pharmacological combination therapies are important. There is accumulating evidence that methylene blue (MB) is an effective drug against malaria. Here we explore the biological effects of both MB alone and in combination therapy using modeling and experimental data.
RESULTS: We built a model of the central metabolic pathways in P. falciparum. Metabolic flux modes and their changes under MB were calculated by integrating experimental data (RT-PCR data on mRNAs for redox enzymes) as constraints and results from the YANA software package for metabolic pathway calculations. Several different lines of MB attack on Plasmodium redox defense were identified by analysis of the network effects. Next, chloroquine resistance based on pfmdr/and pfcrt transporters, as well as pyrimethamine/sulfadoxine resistance (by mutations in DHF/DHPS), were modeled in silico. Further modeling shows that MB has a favorable synergism on antimalarial network effects with these commonly used antimalarial drugs.
CONCLUSIONS: Theoretical and experimental results support that methylene blue should, because of its resistance-breaking potential, be further tested as a key component in drug combination therapy efforts in holoendemic areas.
In this paper we study connectivity augmentation problems. Given a connected graph G with some desirable property, we want to make G 2-vertex connected (or 2-edge connected) by adding edges such that the resulting graph keeps the property. The aim is to add as few edges as possible. The property that we consider is planarity, both in an abstract graph-theoretic and in a geometric setting, where vertices correspond to points in the plane and edges to straight-line segments.
We show that it is NP-hard to nd a minimum-cardinality augmentation that makes a planar graph 2-edge connected. For making a planar graph 2-vertex connected this was known. We further show that both problems are hard in the geometric setting, even when restricted to trees. The problems remain hard for higher degrees of connectivity. On the other hand we give polynomial-time algorithms for the special case of convex geometric graphs.
We also study the following related problem. Given a planar (plane geometric) graph G, two vertices s and t of G, and an integer c, how many edges have to be added to G such that G is still planar (plane geometric) and contains c edge- (or vertex-) disjoint s{t paths? For the planar case we give a linear-time algorithm for c = 2. For the plane geometric case we give optimal worst-case bounds for c = 2; for c = 3 we characterize the cases that have a solution.
The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.
The design and evaluation of assisting technologies to support behavior change processes have become an essential topic within the field of human-computer interaction research in general and the field of immersive intervention technologies in particular. The mechanisms and success of behavior change techniques and interventions are broadly investigated in the field of psychology. However, it is not always easy to adapt these psychological findings to the context of immersive technologies. The lack of theoretical foundation also leads to a lack of explanation as to why and how immersive interventions support behavior change processes. The Behavioral Framework for immersive Technologies (BehaveFIT) addresses this lack by 1) presenting an intelligible categorization and condensation of psychological barriers and immersive features, by 2) suggesting a mapping that shows why and how immersive technologies can help to overcome barriers and finally by 3) proposing a generic prediction path that enables a structured, theory-based approach to the development and evaluation of immersive interventions. These three steps explain how BehaveFIT can be used, and include guiding questions for each step. Further, two use cases illustrate the usage of BehaveFIT. Thus, the present paper contributes to guidance for immersive intervention design and evaluation, showing that immersive interventions support behavior change processes and explain and predict 'why' and 'how' immersive interventions can bridge the intention-behavior-gap.
This paper describes the estimation of the body weight of a person in front of an RGB-D camera. A survey of different methods for body weight estimation based on depth sensors is given. First, an estimation of people standing in front of a camera is presented. Second, an approach based on a stream of depth images is used to obtain the body weight of a person walking towards a sensor. The algorithm first extracts features from a point cloud and forwards them to an artificial neural network (ANN) to obtain an estimation of body weight. Besides the algorithm for the estimation, this paper further presents an open-access dataset based on measurements from a trauma room in a hospital as well as data from visitors of a public event. In total, the dataset contains 439 measurements. The article illustrates the efficiency of the approach with experiments with persons lying down in a hospital, standing persons, and walking persons. Applicable scenarios for the presented algorithm are body weight-related dosing of emergency patients.
The Internet of Things (IoT) enables a variety of smart applications, including smart home, smart manufacturing, and smart city. By enhancing Business Process Management Systems with IoT capabilities, the execution and monitoring of business processes can be significantly improved. Providing a holistic support for modeling, executing and monitoring IoT-driven processes, however, constitutes a challenge. Existing process modeling and process execution languages, such as BPMN 2.0, are unable to fully meet the IoT characteristics (e.g., asynchronicity and parallelism) of IoT-driven processes. In this article, we present BPMNE4IoT—A holistic framework for modeling, executing and monitoring IoT-driven processes. We introduce various artifacts and events based on the BPMN 2.0 metamodel that allow realizing the desired IoT awareness of business processes. The framework is evaluated along two real-world scenarios from two different domains. Moreover, we present a user study for comparing BPMNE4IoT and BPMN 2.0. In particular, this study has confirmed that the BPMNE4IoT framework facilitates the support of IoT-driven processes.
This article presents an immersive virtual reality (VR) system for training classroom management skills, with a specific focus on learning to manage disruptive student behavior in face-to-face, one-to-many teaching scenarios. The core of the system is a real-time 3D virtual simulation of a classroom populated by twenty-four semi-autonomous virtual students. The system has been designed as a companion tool for classroom management seminars in a syllabus for primary and secondary school teachers. This will allow lecturers to link theory with practice using the medium of VR. The system is therefore designed for two users: a trainee teacher and an instructor supervising the training session. The teacher is immersed in a real-time 3D simulation of a classroom by means of a head-mounted display and headphone. The instructor operates a graphical desktop console, which renders a view of the class and the teacher whose avatar movements are captured by a marker less tracking system. This console includes a 2D graphics menu with convenient behavior and feedback control mechanisms to provide human-guided training sessions. The system is built using low-cost consumer hardware and software. Its architecture and technical design are described in detail. A first evaluation confirms its conformance to critical usability requirements (i.e., safety and comfort, believability, simplicity, acceptability, extensibility, affordability, and mobility). Our initial results are promising and constitute the necessary first step toward a possible investigation of the efficiency and effectiveness of such a system in terms of learning outcomes and experience.
Mindfulness is considered an important factor of an individual's subjective well-being. Consequently, Human-Computer Interaction (HCI) has investigated approaches that strengthen mindfulness, i.e., by inventing multimedia technologies to support mindfulness meditation. These approaches often use smartphones, tablets, or consumer-grade desktop systems to allow everyday usage in users' private lives or in the scope of organized therapies. Virtual, Augmented, and Mixed Reality (VR, AR, MR; in short: XR) significantly extend the design space for such approaches. XR covers a wide range of potential sensory stimulation, perceptive and cognitive manipulations, content presentation, interaction, and agency. These facilities are linked to typical XR-specific perceptions that are conceptually closely related to mindfulness research, such as (virtual) presence and (virtual) embodiment. However, a successful exploitation of XR that strengthens mindfulness requires a systematic analysis of the potential interrelation and influencing mechanisms between XR technology, its properties, factors, and phenomena and existing models and theories of the construct of mindfulness. This article reports such a systematic analysis of XR-related research from HCI and life sciences to determine the extent to which existing research frameworks on HCI and mindfulness can be applied to XR technologies, the potential of XR technologies to support mindfulness, and open research gaps. Fifty papers of ACM Digital Library and National Institutes of Health's National Library of Medicine (PubMed) with and without empirical efficacy evaluation were included in our analysis. The results reveal that at the current time, empirical research on XR-based mindfulness support mainly focuses on therapy and therapeutic outcomes. Furthermore, most of the currently investigated XR-supported mindfulness interactions are limited to vocally guided meditations within nature-inspired virtual environments. While an analysis of empirical research on those systems did not reveal differences in mindfulness compared to non-mediated mindfulness practices, various design proposals illustrate that XR has the potential to provide interactive and body-based innovations for mindfulness practice. We propose a structured approach for future work to specify and further explore the potential of XR as mindfulness-support. The resulting framework provides design guidelines for XR-based mindfulness support based on the elements and psychological mechanisms of XR interactions.
The charged aerosol detector (CAD) is the latest representative of aerosol-based detectors that generate a response independent of the analytes' chemical structure. This study was aimed at accurately predicting the CAD response of homologous fatty acids under varying experimental conditions. Fatty acids from C12 to C18 were used as model substances due to semivolatile characterics that caused non-uniform CAD behaviour. Considering both experimental conditions and molecular descriptors, a mixed quantitative structure-property relationship (QSPR) modeling was performed using Gradient Boosted Trees (GBT). The ensemble of 10 decisions trees (learning rate set at 0.55, the maximal depth set at 5, and the sample rate set at 1.0) was able to explain approximately 99% (Q\(^2\): 0.987, RMSE: 0.051) of the observed variance in CAD responses. Validation using an external test compound confirmed the high predictive ability of the model established (R-2: 0.990, RMSEP: 0.050). With respect to the intrinsic attribute selection strategy, GBT used almost all independent variables during model building. Finally, it attributed the highest importance to the power function value, the flow rate of the mobile phase, evaporation temperature, the content of the organic solvent in the mobile phase and the molecular descriptors such as molecular weight (MW), Radial Distribution Function-080/weighted by mass (RDF080m) and average coefficient of the last eigenvector from distance/detour matrix (Ve2_D/Dt). The identification of the factors most relevant to the CAD responsiveness has contributed to a better understanding of the underlying mechanisms of signal generation. An increased CAD response that was obtained for acetone as organic modifier demonstrated its potential to replace the more expensive and environmentally harmful acetonitrile.
CLIP knows image aesthetics
(2022)
Most Image Aesthetic Assessment (IAA) methods use a pretrained ImageNet classification model as a base to fine-tune. We hypothesize that content classification is not an optimal pretraining task for IAA, since the task discourages the extraction of features that are useful for IAA, e.g., composition, lighting, or style. On the other hand, we argue that the Contrastive Language-Image Pretraining (CLIP) model is a better base for IAA models, since it has been trained using natural language supervision. Due to the rich nature of language, CLIP needs to learn a broad range of image features that correlate with sentences describing the image content, composition, environments, and even subjective feelings about the image. While it has been shown that CLIP extracts features useful for content classification tasks, its suitability for tasks that require the extraction of style-based features like IAA has not yet been shown. We test our hypothesis by conducting a three-step study, investigating the usefulness of features extracted by CLIP compared to features obtained from the last layer of a comparable ImageNet classification model. In each step, we get more computationally expensive. First, we engineer natural language prompts that let CLIP assess an image's aesthetic without adjusting any weights in the model. To overcome the challenge that CLIP's prompting only is applicable to classification tasks, we propose a simple but effective strategy to convert multiple prompts to a continuous scalar as required when predicting an image's mean aesthetic score. Second, we train a linear regression on the AVA dataset using image features obtained by CLIP's image encoder. The resulting model outperforms a linear regression trained on features from an ImageNet classification model. It also shows competitive performance with fully fine-tuned networks based on ImageNet, while only training a single layer. Finally, by fine-tuning CLIP's image encoder on the AVA dataset, we show that CLIP only needs a fraction of training epochs to converge, while also performing better than a fine-tuned ImageNet model. Overall, our experiments suggest that CLIP is better suited as a base model for IAA methods than ImageNet pretrained networks.
Tardigrades have fascinated researchers for more than 300 years because of their extraordinary capability to undergo cryptobiosis and survive extreme environmental conditions. However, the survival mechanisms of tardigrades are still poorly understood mainly due to the absence of detailed knowledge about the proteome and genome of these organisms. Our study was intended to provide a basis for the functional characterization of expressed proteins in different states of tardigrades. High-throughput, high-accuracy proteomics in combination with a newly developed tardigrade specific protein database resulted in the identification of more than 3000 proteins in three different states: early embryonic state and adult animals in active and anhydrobiotic state. This comprehensive proteome resource includes protein families such as chaperones, antioxidants, ribosomal proteins, cytoskeletal proteins, transporters, protein channels, nutrient reservoirs, and developmental proteins. A comparative analysis of protein families in the different states was performed by calculating the exponentially modified protein abundance index which classifies proteins in major and minor components. This is the first step to analyzing the proteins involved in early embryonic development, and furthermore proteins which might play an important role in the transition into the anhydrobiotic state.
Scalability is often mentioned in literature, but a stringent definition is missing. In particular, there is no general scalability assessment which clearly indicates whether a system scales or not or whether a system scales better than another. The key contribution of this article is the definition of a scalability index (SI) which quantifies if a system scales in comparison to another system, a hypothetical system, e.g., linear system, or the theoretically optimal system. The suggested SI generalizes different metrics from literature, which are specialized cases of our SI. The primary target of our scalability framework is, however, benchmarking of two systems, which does not require any reference system. The SI is demonstrated and evaluated for different use cases, that are (1) the performance of an IoT load balancer depending on the system load, (2) the availability of a communication system depending on the size and structure of the network, (3) scalability comparison of different location selection mechanisms in fog computing with respect to delays and energy consumption; (4) comparison of time-sensitive networking (TSN) mechanisms in terms of efficiency and utilization. Finally, we discuss how to use and how not to use the SI and give recommendations and guidelines in practice. To the best of our knowledge, this is the first work which provides a general SI for the comparison and benchmarking of systems, which is the primary target of our scalability analysis.
To enable a sustainable supply of chemicals, novel biotechnological solutions are required that replace the reliance on fossil resources. One potential solution is to utilize tailored biosynthetic modules for the metabolic conversion of CO2 or organic waste to chemicals and fuel by microorganisms. Currently, it is challenging to commercialize biotechnological processes for renewable chemical biomanufacturing because of a lack of highly active and specific biocatalysts. As experimental methods to engineer biocatalysts are time- and cost-intensive, it is important to establish efficient and reliable computational tools that can speed up the identification or optimization of selective, highly active, and stable enzyme variants for utilization in the biotechnological industry. Here, we review and suggest combinations of effective state-of-the-art software and online tools available for computational enzyme engineering pipelines to optimize metabolic pathways for the biosynthesis of renewable chemicals. Using examples relevant for biotechnology, we explain the underlying principles of enzyme engineering and design and illuminate future directions for automated optimization of biocatalysts for the assembly of synthetic metabolic pathways.
Presence is often considered the most important quale describing the subjective feeling of being in a computer-generated and/or computer-mediated virtual environment. The identification and separation of orthogonal presence components, i.e., the place illusion and the plausibility illusion, has been an accepted theoretical model describing Virtual Reality (VR) experiences for some time. This perspective article challenges this presence-oriented VR theory. First, we argue that a place illusion cannot be the major construct to describe the much wider scope of virtual, augmented, and mixed reality (VR, AR, MR: or XR for short). Second, we argue that there is no plausibility illusion but merely plausibility, and we derive the place illusion caused by the congruent and plausible generation of spatial cues and similarly for all the current model’s so-defined illusions. Finally, we propose congruence and plausibility to become the central essential conditions in a novel theoretical model describing XR experiences and effects.
This article presents a novel method for controlling a virtual audience system (VAS) in Virtual Reality (VR) application, called STAGE, which has been originally designed for supervised public speaking training in university seminars dedicated to the preparation and delivery of scientific talks. We are interested in creating pedagogical narratives: narratives encompass affective phenomenon and rather than organizing events changing the course of a training scenario, pedagogical plans using our system focus on organizing the affects it arouses for the trainees. Efficiently controlling a virtual audience towards a specific training objective while evaluating the speaker’s performance presents a challenge for a seminar instructor: the high level of cognitive and physical demands required to be able to control the virtual audience, whilst evaluating speaker’s performance, adjusting and allowing it to quickly react to the user’s behaviors and interactions. It is indeed a critical limitation of a number of existing systems that they rely on a Wizard of Oz approach, where the tutor drives the audience in reaction to the user’s performance. We address this problem by integrating with a VAS a high-level control component for tutors, which allows using predefined audience behavior rules, defining custom ones, as well as intervening during run-time for finer control of the unfolding of the pedagogical plan. At its core, this component offers a tool to program, select, modify and monitor interactive training narratives using a high-level representation. The STAGE offers the following features: i) a high-level API to program pedagogical narratives focusing on a specific public speaking situation and training objectives, ii) an interactive visualization interface iii) computation and visualization of user metrics, iv) a semi-autonomous virtual audience composed of virtual spectators with automatic reactions to the speaker and surrounding spectators while following the pedagogical plan V) and the possibility for the instructor to embody a virtual spectator to ask questions or guide the speaker from within the Virtual Environment. We present here the design, and implementation of the tutoring system and its integration in STAGE, and discuss its reception by end-users.
Climate models are the tool of choice for scientists researching climate change. Like all models they suffer from errors, particularly systematic and location-specific representation errors. One way to reduce these errors is model output statistics (MOS) where the model output is fitted to observational data with machine learning. In this work, we assess the use of convolutional Deep Learning climate MOS approaches and present the ConvMOS architecture which is specifically designed based on the observation that there are systematic and location-specific errors in the precipitation estimates of climate models. We apply ConvMOS models to the simulated precipitation of the regional climate model REMO, showing that a combination of per-location model parameters for reducing location-specific errors and global model parameters for reducing systematic errors is indeed beneficial for MOS performance. We find that ConvMOS models can reduce errors considerably and perform significantly better than three commonly used MOS approaches and plain ResNet and U-Net models in most cases. Our results show that non-linear MOS models underestimate the number of extreme precipitation events, which we alleviate by training models specialized towards extreme precipitation events with the imbalanced regression method DenseLoss. While we consider climate MOS, we argue that aspects of ConvMOS may also be beneficial in other domains with geospatial data, such as air pollution modeling or weather forecasts.
Mapping and localization of mobile robots in an unknown environment are essential for most high-level operations like autonomous navigation or exploration. This paper presents a novel approach for combining estimated trajectories, namely curvefusion. The robot used in the experiments is equipped with a horizontally mounted 2D profiler, a constantly spinning 3D laser scanner and a GPS module. The proposed algorithm first combines trajectories from different sensors to optimize poses of the planar three degrees of freedom (DoF) trajectory, which is then fed into continuous-time simultaneous localization and mapping (SLAM) to further improve the trajectory. While state-of-the-art multi-sensor fusion methods mainly focus on probabilistic methods, our approach instead adopts a deformation-based method to optimize poses. To this end, a similarity metric for curved shapes is introduced into the robotics community to fuse the estimated trajectories. Additionally, a shape-based point correspondence estimation method is applied to the multi-sensor time calibration. Experiments show that the proposed fusion method can achieve relatively better accuracy, even if the error of the trajectory before fusion is large, which demonstrates that our method can still maintain a certain degree of accuracy in an environment where typical pose estimation methods have poor performance. In addition, the proposed time-calibration method also achieves high accuracy in estimating point correspondences.
Lifetime techniques are applied to diverse fields of study including materials sciences, semiconductor physics, biology, molecular biophysics and photochemistry.
Here we present DDRS4PALS, a software for the acquisition and simulation of lifetime spectra using the DRS4 evaluation board (Paul Scherrer Institute, Switzerland) for time resolved measurements and digitization of detector output pulses. Artifact afflicted pulses can be corrected or rejected prior to the lifetime calculation to provide the generation of high-quality lifetime spectra, which are crucial for a profound analysis, i.e. the decomposition of the true information. Moreover, the pulses can be streamed on an (external) hard drive during the measurement and subsequently downloaded in the offline mode without being connected to the hardware. This allows the generation of various lifetime spectra at different configurations from one single measurement and, hence, a meaningful comparison in terms of analyzability and quality. Parallel processing and an integrated JavaScript based language provide convenient options to accelerate and automate time consuming processes such as lifetime spectra simulations.
An innovative framework has been developed for teamwork of two quadcopter formations, each having its specified formation geometry, assigned task, and matching control scheme. Position control for quadcopters in one of the formations has been implemented through a Linear Quadratic Regulator Proportional Integral (LQR PI) control scheme based on explicit model following scheme. Quadcopters in the other formation are controlled through LQR PI servomechanism control scheme. These two control schemes are compared in terms of their performance and control effort. Both formations are commanded by respective ground stations through virtual leaders. Quadcopters in formations are able to track desired trajectories as well as hovering at desired points for selected time duration. In case of communication loss between ground station and any of the quadcopters, the neighboring quadcopter provides the command data, received from the ground station, to the affected unit. Proposed control schemes have been validated through extensive simulations using MATLAB®/Simulink® that provided favorable results.
In recent history, normalized digital surface models (nDSMs) have been constantly gaining importance as a means to solve large-scale geographic problems. High-resolution surface models are precious, as they can provide detailed information for a specific area. However, measurements with a high resolution are time consuming and costly. Only a few approaches exist to create high-resolution nDSMs for extensive areas. This article explores approaches to extract high-resolution nDSMs from low-resolution Sentinel-2 data, allowing us to derive large-scale models. We thereby utilize the advantages of Sentinel 2 being open access, having global coverage, and providing steady updates through a high repetition rate. Several deep learning models are trained to overcome the gap in producing high-resolution surface maps from low-resolution input data. With U-Net as a base architecture, we extend the capabilities of our model by integrating tailored multiscale encoders with differently sized kernels in the convolution as well as conformed self-attention inside the skip connection gates. Using pixelwise regression, our U-Net base models can achieve a mean height error of approximately 2 m. Moreover, through our enhancements to the model architecture, we reduce the model error by more than 7%.
To deliver the best user experience (UX), the human-centered design cycle (HCDC) serves as a well-established guideline to application developers. However, it does not yet cover network-specific requirements, which become increasingly crucial, as most applications deliver experience over the Internet. The missing network-centric view is provided by Quality of Experience (QoE), which could team up with UX towards an improved overall experience. By considering QoE aspects during the development process, it can be achieved that applications become network-aware by design. In this paper, the Quality of Experience Centered Design Cycle (QoE-CDC) is proposed, which provides guidelines on how to design applications with respect to network-specific requirements and QoE. Its practical value is showcased for popular application types and validated by outlining the design of a new smartphone application. We show that combining HCDC and QoE-CDC will result in an application design, which reaches a high UX and avoids QoE degradation.
In many real world settings, imbalanced data impedes model performance of learning algorithms, like neural networks, mostly for rare cases. This is especially problematic for tasks focusing on these rare occurrences. For example, when estimating precipitation, extreme rainfall events are scarce but important considering their potential consequences. While there are numerous well studied solutions for classification settings, most of them cannot be applied to regression easily. Of the few solutions for regression tasks, barely any have explored cost-sensitive learning which is known to have advantages compared to sampling-based methods in classification tasks. In this work, we propose a sample weighting approach for imbalanced regression datasets called DenseWeight and a cost-sensitive learning approach for neural network regression with imbalanced data called DenseLoss based on our weighting scheme. DenseWeight weights data points according to their target value rarities through kernel density estimation (KDE). DenseLoss adjusts each data point’s influence on the loss according to DenseWeight, giving rare data points more influence on model training compared to common data points. We show on multiple differently distributed datasets that DenseLoss significantly improves model performance for rare data points through its density-based weighting scheme. Additionally, we compare DenseLoss to the state-of-the-art method SMOGN, finding that our method mostly yields better performance. Our approach provides more control over model training as it enables us to actively decide on the trade-off between focusing on common or rare cases through a single hyperparameter, allowing the training of better models for rare data points.
The concept of digital literacy has been introduced as a new cultural technique, which is regarded as essential for successful participation in a (future) digitized world. Regarding the increasing importance of AI, literacy concepts need to be extended to account for AI-related specifics. The easy handling of the systems results in increased usage, contrasting limited conceptualizations (e.g., imagination of future importance) and competencies (e.g., knowledge about functional principles). In reference to voice-based conversational agents as a concrete application of AI, the present paper aims for the development of a measurement to assess the conceptualizations and competencies about conversational agents. In a first step, a theoretical framework of “AI literacy” is transferred to the context of conversational agent literacy. Second, the “conversational agent literacy scale” (short CALS) is developed, constituting the first attempt to measure interindividual differences in the “(il) literate” usage of conversational agents. 29 items were derived, of which 170 participants answered. An explanatory factor analysis identified five factors leading to five subscales to assess CAL: storage and transfer of the smart speaker’s data input; smart speaker’s functional principles; smart speaker’s intelligent functions, learning abilities; smart speaker’s reach and potential; smart speaker’s technological (surrounding) infrastructure. Preliminary insights into construct validity and reliability of CALS showed satisfying results. Third, using the newly developed instrument, a student sample’s CAL was assessed, revealing intermediated values. Remarkably, owning a smart speaker did not lead to higher CAL scores, confirming our basic assumption that usage of systems does not guarantee enlightened conceptualizations and competencies. In sum, the paper contributes to the first insights into the operationalization and understanding of CAL as a specific subdomain of AI-related competencies.
Two-component systems (TCS) are short signalling pathways generally occurring in prokaryotes. They frequently regulate prokaryotic stimulus responses and thus are also of interest for engineering in biotechnology and synthetic biology. The aim of this study is to better understand and describe rewiring of TCS while investigating different evolutionary scenarios. Based on large-scale screens of TCS in different organisms, this study gives detailed data, concrete alignments, and structure analysis on three general modification scenarios, where TCS were rewired for new responses and functions: (i) exchanges in the sequence within single TCS domains, (ii) exchange of whole TCS domains; (iii) addition of new components modulating TCS function. As a result, the replacement of stimulus and promotor cassettes to rewire TCS is well defined exploiting the alignments given here. The diverged TCS examples are non-trivial and the design is challenging. Designed connector proteins may also be useful to modify TCS in selected cases.
DLTPulseGenerator: a library for the simulation of lifetime spectra based on detector-output pulses
(2018)
The quantitative analysis of lifetime spectra relevant in both life and materials sciences presents one of the ill-posed inverse problems and, hence, leads to most stringent requirements on the hardware specifications and the analysis algorithms. Here we present DLTPulseGenerator, a library written in native C++ 11, which provides a simulation of lifetime spectra according to the measurement setup. The simulation is based on pairs of non-TTL detector output-pulses. Those pulses require the Constant Fraction Principle (CFD) for the determination of the exact timing signal and, thus, the calculation of the time difference i.e. the lifetime. To verify the functionality, simulation results were compared to experimentally obtained data using Positron Annihilation Lifetime Spectroscopy (PALS) on pure tin.
A binary tanglegram is a drawing of a pair of rooted binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. For applications, for example, in phylogenetics, it is essential that both trees are drawn without edge crossings and that the inter-tree edges have as few crossings as possible. It is known that finding a tanglegram with the minimum number of crossings is NP-hard and that the problem is fixed-parameter tractable with respect to that number.
We prove that under the Unique Games Conjecture there is no constant-factor approximation for binary trees. We show that the problem is NP-hard even if both trees are complete binary trees. For this case we give an O(n 3)-time 2-approximation and a new, simple fixed-parameter algorithm. We show that the maximization version of the dual problem for binary trees can be reduced to a version of MaxCut for which the algorithm of Goemans and Williamson yields a 0.878-approximation.
Dynamic point cloud compression based on projections, surface reconstruction and video compression
(2021)
In this paper we will present a new dynamic point cloud compression based on different projection types and bit depth, combined with the surface reconstruction algorithm and video compression for obtained geometry and texture maps. Texture maps have been compressed after creating Voronoi diagrams. Used video compression is specific for geometry (FFV1) and texture (H.265/HEVC). Decompressed point clouds are reconstructed using a Poisson surface reconstruction algorithm. Comparison with the original point clouds was performed using point-to-point and point-to-plane measures. Comprehensive experiments show better performance for some projection maps: cylindrical, Miller and Mercator projections.
Experimental high-throughput analysis of molecular networks is a central approach to characterize the adaptation of plant metabolism to the environment. However, recent studies have demonstrated that it is hardly possible to predict in situ metabolic phenotypes from experiments under controlled conditions, such as growth chambers or greenhouses. This is particularly due to the high molecular variance of in situ samples induced by environmental fluctuations. An approach of functional metabolome interpretation of field samples would be desirable in order to be able to identify and trace back the impact of environmental changes on plant metabolism. To test the applicability of metabolomics studies for a characterization of plant populations in the field, we have identified and analyzed in situ samples of nearby grown natural populations of Arabidopsis thaliana in Austria. A. thaliana is the primary molecular biological model system in plant biology with one of the best functionally annotated genomes representing a reference system for all other plant genome projects. The genomes of these novel natural populations were sequenced and phylogenetically compared to a comprehensive genome database of A. thaliana ecotypes. Experimental results on primary and secondary metabolite profiling and genotypic variation were functionally integrated by a data mining strategy, which combines statistical output of metabolomics data with genome-derived biochemical pathway reconstruction and metabolic modeling. Correlations of biochemical model predictions and population-specific genetic variation indicated varying strategies of metabolic regulation on a population level which enabled the direct comparison, differentiation, and prediction of metabolic adaptation of the same species to different habitats. These differences were most pronounced at organic and amino acid metabolism as well as at the interface of primary and secondary metabolism and allowed for the direct classification of population-specific metabolic phenotypes within geographically contiguous sampling sites.
Effects of Acrophobic Fear and Trait Anxiety on Human Behavior in a Virtual Elevated Plus-Maze
(2021)
The Elevated Plus-Maze (EPM) is a well-established apparatus to measure anxiety in rodents, i.e., animals exhibiting an increased relative time spent in the closed vs. the open arms are considered anxious. To examine whether such anxiety-modulated behaviors are conserved in humans, we re-translated this paradigm to a human setting using virtual reality in a Cave Automatic Virtual Environment (CAVE) system. In two studies, we examined whether the EPM exploration behavior of humans is modulated by their trait anxiety and also assessed the individuals’ levels of acrophobia (fear of height), claustrophobia (fear of confined spaces), sensation seeking, and the reported anxiety when on the maze. First, we constructed an exact virtual copy of the animal EPM adjusted to human proportions. In analogy to animal EPM studies, participants (N = 30) freely explored the EPM for 5 min. In the second study (N = 61), we redesigned the EPM to make it more human-adapted and to differentiate influences of trait anxiety and acrophobia by introducing various floor textures and lower walls of closed arms to the height of standard handrails. In the first experiment, hierarchical regression analyses of exploration behavior revealed the expected association between open arm avoidance and Trait Anxiety, an even stronger association with acrophobic fear. In the second study, results revealed that acrophobia was associated with avoidance of open arms with mesh-floor texture, whereas for trait anxiety, claustrophobia, and sensation seeking, no effect was detected. Also, subjects’ fear rating was moderated by all psychometrics but trait anxiety. In sum, both studies consistently indicate that humans show no general open arm avoidance analogous to rodents and that human EPM behavior is modulated strongest by acrophobic fear, whereas trait anxiety plays a subordinate role. Thus, we conclude that the criteria for cross-species validity are met insufficiently in this case. Despite the exploratory nature, our studies provide in-depth insights into human exploration behavior on the virtual EPM.
Smart sensors and smartphones are becoming increasingly prevalent. Both can be used to gather environmental data (e.g., noise). Importantly, these devices can be connected to each other as well as to the Internet to collect large amounts of sensor data, which leads to many new opportunities. In particular, mobile crowdsensing techniques can be used to capture phenomena of common interest. Especially valuable insights can be gained if the collected data are additionally related to the time and place of the measurements. However, many technical solutions still use monolithic backends that are not capable of processing crowdsensing data in a flexible, efficient, and scalable manner. In this work, an architectural design was conceived with the goal to manage geospatial data in challenging crowdsensing healthcare scenarios. It will be shown how the proposed approach can be used to provide users with an interactive map of environmental noise, allowing tinnitus patients and other health-conscious people to avoid locations with harmful sound levels. Technically, the shown approach combines cloud-native applications with Big Data and stream processing concepts. In general, the presented architectural design shall serve as a foundation to implement practical and scalable crowdsensing platforms for various healthcare scenarios beyond the addressed use case.
Impaired decision-making leads to the inability to distinguish between advantageous and disadvantageous choices. The impairment of a person’s decision-making is a common goal of gambling games. Given the recent trend of gambling using immersive Virtual Reality it is crucial to investigate the effects of both immersion and the virtual environment (VE) on decision-making. In a novel user study, we measured decision-making using three virtual versions of the Iowa Gambling Task (IGT). The versions differed with regard to the degree of immersion and design of the virtual environment. While emotions affect decision-making, we further measured the positive and negative affect of participants. A higher visual angle on a stimulus leads to an increased emotional response. Thus, we kept the visual angle on the Iowa Gambling Task the same between our conditions. Our results revealed no significant impact of immersion or the VE on the IGT. We further found no significant difference between the conditions with regard to positive and negative affect. This suggests that neither the medium used nor the design of the VE causes an impairment of decision-making. However, in combination with a recent study, we provide first evidence that a higher visual angle on the IGT leads to an effect of impairment.
Background: Since the replication crisis, standardization has become even more important in psychological science and neuroscience. As a result, many methods are being reconsidered, and researchers’ degrees of freedom in these methods are being discussed as a potential source of inconsistencies across studies.
New Method: With the aim of addressing these subjectivity issues, we have been working on a tutorial-like EEG (pre-)processing pipeline to achieve an automated method based on the semi-automated analysis proposed by Delorme and Makeig.
Results: Two scripts are presented and explained step-by-step to perform basic, informed ERP and frequency-domain analyses, including data export to statistical programs and visual representations of the data. The open-source software EEGlab in MATLAB is used as the data handling platform, but scripts based on code provided by Mike Cohen (2014) are also included.
Comparison with existing methods: This accompanying tutorial-like article explains and shows how the processing of our automated pipeline affects the data and addresses, especially beginners in EEG-analysis, as other (pre)-processing chains are mostly targeting rather informed users in specialized areas or only parts of a complete procedure. In this context, we compared our pipeline with a selection of existing approaches.
Conclusion: The need for standardization and replication is evident, yet it is equally important to control the plausibility of the suggested solution by data exploration. Here, we provide the community with a tool to enhance the understanding and capability of EEG-analysis. We aim to contribute to comprehensive and reliable analyses for neuro-scientific research.
A centralized heterogeneous formation flight position control scheme has been formulated using an explicit model following design, based on a Linear Quadratic Regulator Proportional Integral (LQR PI) controller. The leader quadcopter is a stable reference model with desired dynamics whose output is perfectly tracked by the two wingmen quadcopters. The leader itself is controlled through the pole placement control method with desired stability characteristics, while the two followers are controlled through a robust and adaptive LQR PI control method. Selected 3-D formation geometry and static stability are maintained under a number of possible perturbations. With this control scheme, formation geometry may also be switched to any arbitrary shape during flight, provided a suitable collision avoidance mechanism is incorporated. In case of communication loss between the leader and any of the followers, the other follower provides the data, received from the leader, to the affected follower. The stability of the closed-loop system has been analyzed using singular values. The proposed approach for the tightly coupled formation flight of mini unmanned aerial vehicles has been validated with the help of extensive simulations using MATLAB/Simulink, which provided promising results.
Background: Natural language processing (NLP) is a powerful tool supporting the generation of Real-World Evidence (RWE). There is no NLP system that enables the extensive querying of parameters specific to multiple myeloma (MM) out of unstructured medical reports. We therefore created a MM-specific ontology to accelerate the information extraction (IE) out of unstructured text. Methods: Our MM ontology consists of extensive MM-specific and hierarchically structured attributes and values. We implemented “A Rule-based Information Extraction System” (ARIES) that uses this ontology. We evaluated ARIES on 200 randomly selected medical reports of patients diagnosed with MM. Results: Our system achieved a high F1-Score of 0.92 on the evaluation dataset with a precision of 0.87 and recall of 0.98. Conclusions: Our rule-based IE system enables the comprehensive querying of medical reports. The IE accelerates the extraction of data and enables clinicians to faster generate RWE on hematological issues. RWE helps clinicians to make decisions in an evidence-based manner. Our tool easily accelerates the integration of research evidence into everyday clinical practice.
Artificial Intelligence (AI) covers a broad spectrum of computational problems and use cases. Many of those implicate profound and sometimes intricate questions of how humans interact or should interact with AIs. Moreover, many users or future users do have abstract ideas of what AI is, significantly depending on the specific embodiment of AI applications. Human-centered-design approaches would suggest evaluating the impact of different embodiments on human perception of and interaction with AI. An approach that is difficult to realize due to the sheer complexity of application fields and embodiments in reality. However, here XR opens new possibilities to research human-AI interactions. The article’s contribution is twofold: First, it provides a theoretical treatment and model of human-AI interaction based on an XR-AI continuum as a framework for and a perspective of different approaches of XR-AI combinations. It motivates XR-AI combinations as a method to learn about the effects of prospective human-AI interfaces and shows why the combination of XR and AI fruitfully contributes to a valid and systematic investigation of human-AI interactions and interfaces. Second, the article provides two exemplary experiments investigating the aforementioned approach for two distinct AI-systems. The first experiment reveals an interesting gender effect in human-robot interaction, while the second experiment reveals an Eliza effect of a recommender system. Here the article introduces two paradigmatic implementations of the proposed XR testbed for human-AI interactions and interfaces and shows how a valid and systematic investigation can be conducted. In sum, the article opens new perspectives on how XR benefits human-centered AI design and development.
Background
Machine learning, especially deep learning, is becoming more and more relevant in research and development in the medical domain. For all the supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Especially gastroenterological data, which often involves endoscopic videos, are cumbersome to annotate. Domain experts are needed to interpret and annotate the videos. To support those domain experts, we generated a framework. With this framework, instead of annotating every frame in the video sequence, experts are just performing key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Subsequently, non-expert annotators supported by machine learning add the missing annotations for the frames in-between.
Methods
In our framework, an expert reviews the video and annotates a few video frames to verify the object’s annotations for the non-expert. In a second step, a non-expert has visual confirmation of the given object and can annotate all following and preceding frames with AI assistance. After the expert has finished, relevant frames will be selected and passed on to an AI model. This information allows the AI model to detect and mark the desired object on all following and preceding frames with an annotation. Therefore, the non-expert can adjust and modify the AI predictions and export the results, which can then be used to train the AI model.
Results
Using this framework, we were able to reduce workload of domain experts on average by a factor of 20 on our data. This is primarily due to the structure of the framework, which is designed to minimize the workload of the domain expert. Pairing this framework with a state-of-the-art semi-automated AI model enhances the annotation speed further. Through a prospective study with 10 participants, we show that semi-automated annotation using our tool doubles the annotation speed of non-expert annotators compared to a well-known state-of-the-art annotation tool.
Conclusion
In summary, we introduce a framework for fast expert annotation for gastroenterologists, which reduces the workload of the domain expert considerably while maintaining a very high annotation quality. The framework incorporates a semi-automated annotation system utilizing trained object detection models. The software and framework are open-source.
Automatic image reconstruction is critical to cope with steadily increasing data from advanced microscopy. We describe here the Fiji macro 3D ART VeSElecT which we developed to study synaptic vesicles in electron tomograms. We apply this tool to quantify vesicle properties (i) in embryonic Danio rerio 4 and 8 days past fertilization (dpf) and (ii) to compare Caenorhabditis elegans N2 neuromuscular junctions (NMJ) wild-type and its septin mutant (unc-59(e261)). We demonstrate development-specific and mutant-specific changes in synaptic vesicle pools in both models. We confirm the functionality of our macro by applying our 3D ART VeSElecT on zebrafish NMJ showing smaller vesicles in 8 dpf embryos then 4 dpf, which was validated by manual reconstruction of the vesicle pool. Furthermore, we analyze the impact of C. elegans septin mutant unc-59(e261) on vesicle pool formation and vesicle size. Automated vesicle registration and characterization was implemented in Fiji as two macros (registration and measurement). This flexible arrangement allows in particular reducing false positives by an optional manual revision step. Preprocessing and contrast enhancement work on image-stacks of 1nm/pixel in x and y direction. Semi-automated cell selection was integrated. 3D ART VeSElecT removes interfering components, detects vesicles by 3D segmentation and calculates vesicle volume and diameter (spherical approximation, inner/outer diameter). Results are collected in color using the RoiManager plugin including the possibility of manual removal of non-matching confounder vesicles. Detailed evaluation considered performance (detected vesicles) and specificity (true vesicles) as well as precision and recall. We furthermore show gain in segmentation and morphological filtering compared to learning based methods and a large time gain compared to manual segmentation. 3D ART VeSElecT shows small error rates and its speed gain can be up to 68 times faster in comparison to manual annotation. Both automatic and semi-automatic modes are explained including a tutorial.
Background
Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite of their importance for clinical trials. This work therefore aimed at development and evaluation of an information extraction component with a fine-grained terminology that enables to recognize almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg.
Methods
A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated according to documents with different layouts.
Results
The final system achieved state-of-the-art precision (micro average.996) and recall (micro average.961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with f 1=.989 (micro average) and f 1=.963 (macro average). As a result of keyword matching and restraint concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and documents with uncommon layout.
Conclusions
The developed terminology and the proposed information extraction system allow to extract fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.