004 Data processing; computer science
In many real-world settings, imbalanced data impedes the performance of learning algorithms such as neural networks, mostly for rare cases. This is especially problematic for tasks focusing on these rare occurrences. For example, when estimating precipitation, extreme rainfall events are scarce but important considering their potential consequences. While there are numerous well-studied solutions for classification settings, most of them cannot be applied to regression easily. Of the few solutions for regression tasks, barely any have explored cost-sensitive learning, which is known to have advantages compared to sampling-based methods in classification tasks. In this work, we propose a sample weighting approach for imbalanced regression datasets called DenseWeight and a cost-sensitive learning approach for neural network regression with imbalanced data called DenseLoss based on our weighting scheme. DenseWeight weights data points according to their target value rarities through kernel density estimation (KDE). DenseLoss adjusts each data point’s influence on the loss according to DenseWeight, giving rare data points more influence on model training compared to common data points. We show on multiple differently distributed datasets that DenseLoss significantly improves model performance for rare data points through its density-based weighting scheme. Additionally, we compare DenseLoss to the state-of-the-art method SMOGN, finding that our method mostly yields better performance. Our approach provides more control over model training as it enables us to actively decide on the trade-off between focusing on common or rare cases through a single hyperparameter, allowing the training of better models for rare data points.
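The idea of density-based sample weighting can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract only states that weights follow target rarity via KDE and that a single hyperparameter controls the trade-off. The concrete form used here (min-max scaling of the density, the term `1 - alpha * p`, clipping at `eps`, and normalizing to a mean weight of 1) and all names (`gaussian_kde`, `dense_weights`, `weighted_mse`, `alpha`, `bandwidth`) are assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1D Gaussian kernel density estimate as a function of y."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(
            math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples
        )

    return density


def dense_weights(targets, alpha=1.0, bandwidth=1.0, eps=1e-6):
    """Assumed weighting scheme: rare target values receive larger weights.

    alpha = 0 gives uniform weights; larger alpha emphasizes rare targets.
    """
    density = gaussian_kde(targets, bandwidth)
    dens = [density(y) for y in targets]
    lo, hi = min(dens), max(dens)
    span = hi - lo
    # Scale densities to [0, 1]; a constant density maps to 0 everywhere.
    scaled = [(d - lo) / span if span > 0 else 0.0 for d in dens]
    # Clip at eps so that no data point ends up with a non-positive weight.
    raw = [max(1.0 - alpha * p, eps) for p in scaled]
    mean_raw = sum(raw) / len(raw)
    # Normalize so that the average weight is 1.
    return [w / mean_raw for w in raw]


def weighted_mse(targets, predictions, weights):
    """Mean squared error in which each sample is scaled by its weight."""
    total = sum(
        w * (p - y) ** 2 for y, p, w in zip(targets, predictions, weights)
    )
    return total / len(targets)


targets = [0.0, 0.1, 0.2, 0.1, 0.0, 5.0]  # 5.0 is the rare case
weights = dense_weights(targets, alpha=1.0, bandwidth=1.0)
```

In this sketch the rare target 5.0 receives the largest weight, so an error on that point contributes more to `weighted_mse` than the same error on a common point.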
In the last decades, the classical Vehicle Routing Problem (VRP), i.e., assigning a set of orders to vehicles and planning their routes, has been intensively researched. Because assigning orders to vehicles and planning their routes is by itself already an NP-complete problem, algorithms applied in practice often fail to take into account the constraints and restrictions of real-world applications, the so-called rich VRP (rVRP), and are limited to single aspects. In this work, we incorporate the main relevant real-world constraints and requirements. We propose a two-stage strategy and a Timeline algorithm for time windows and pause times, and apply a Genetic Algorithm (GA) and Ant Colony Optimization (ACO) individually to the problem to find optimal solutions. Our evaluation of eight different problem instances against four state-of-the-art algorithms shows that our approach handles all given constraints in a reasonable time.
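To illustrate what a time-window constraint means for a single route, the following sketch checks whether a fixed sequence of stops can be served on time. It is not the Timeline algorithm from the abstract, whose details are not given here; the tuple layout `(travel_time, earliest, latest, service_time)` and the function name `schedule_route` are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical stop layout: (travel_time, earliest, latest, service_time)
Stop = Tuple[float, float, float, float]


def schedule_route(stops: List[Stop], start: float = 0.0) -> Optional[List[float]]:
    """Return the service start time per stop, or None if a window is missed.

    The vehicle waits when it arrives before a window opens; the route is
    infeasible as soon as service would begin after a window closes.
    """
    clock = start
    service_starts = []
    for travel, earliest, latest, service in stops:
        arrival = clock + travel
        begin = max(arrival, earliest)  # wait if the vehicle is early
        if begin > latest:
            return None  # time window violated
        service_starts.append(begin)
        clock = begin + service
    return service_starts


route = [(10, 0, 20, 5), (10, 30, 40, 5), (5, 0, 45, 5)]
starts = schedule_route(route)
```

A search method such as a GA or ACO could call a check of this kind to reject or penalize candidate routes, although how the authors combine these steps is not stated in the abstract.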
Smart sensors and smartphones are becoming increasingly prevalent. Both can be used to gather environmental data (e.g., noise). Importantly, these devices can be connected to each other as well as to the Internet to collect large amounts of sensor data, which leads to many new opportunities. In particular, mobile crowdsensing techniques can be used to capture phenomena of common interest. Especially valuable insights can be gained if the collected data are additionally related to the time and place of the measurements. However, many technical solutions still use monolithic backends that are not capable of processing crowdsensing data in a flexible, efficient, and scalable manner. In this work, an architectural design was conceived with the goal of managing geospatial data in challenging crowdsensing healthcare scenarios. It will be shown how the proposed approach can be used to provide users with an interactive map of environmental noise, allowing tinnitus patients and other health-conscious people to avoid locations with harmful sound levels. Technically, the presented approach combines cloud-native applications with Big Data and stream processing concepts. In general, the presented architectural design shall serve as a foundation to implement practical and scalable crowdsensing platforms for various healthcare scenarios beyond the addressed use case.
A bipartite graph G=(U,V,E) is convex if the vertices in V can be linearly ordered such that for each vertex u∈U, the neighbors of u are consecutive in the ordering of V. An induced matching H of G is a matching for which no edge of E connects endpoints of two different edges of H. We show that in a convex bipartite graph with n vertices and m weighted edges, an induced matching of maximum total weight can be computed in O(n+m) time. An unweighted convex bipartite graph has a representation of size O(n) that records for each vertex u∈U the first and last neighbor in the ordering of V. Given such a compact representation, we compute an induced matching of maximum cardinality in O(n) time. In convex bipartite graphs, maximum-cardinality induced matchings are dual to minimum chain covers. A chain cover is a covering of the edge set by chain subgraphs, that is, subgraphs that do not contain induced matchings of more than one edge. Given a compact representation, we compute a representation of a minimum chain cover in O(n) time. If no compact representation is given, the cover can be computed in O(n+m) time. All of our algorithms achieve optimal linear running time for the respective problem and model, and they improve and generalize the previous results in several ways: The best algorithms for the unweighted problem versions had a running time of O(n\(^{2}\)) (Brandstädt et al. in Theor. Comput. Sci. 381(1–3):260–265, 2007. https://doi.org/10.1016/j.tcs.2007.04.006). The weighted case has not been considered before.
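The definitions above can be made concrete with a small sketch. It builds the compact representation (first and last neighbor per vertex of U) from an adjacency list and a given ordering of V, and checks the induced-matching property directly from the definition. This is a plain verification helper, not the linear-time algorithm of the paper; the function names and the dictionary-based input format are assumptions.

```python
def compact_representation(adj, order):
    """Map each u in U to (first, last) neighbor positions in the ordering of V.

    Raises ValueError if some neighborhood is not consecutive, i.e. the
    given ordering does not witness convexity.
    """
    pos = {v: i for i, v in enumerate(order)}
    intervals = {}
    for u, nbrs in adj.items():
        if not nbrs:
            intervals[u] = None  # isolated vertex
            continue
        idx = sorted({pos[v] for v in nbrs})
        first, last = idx[0], idx[-1]
        if last - first + 1 != len(idx):
            raise ValueError(f"neighbors of {u!r} are not consecutive")
        intervals[u] = (first, last)
    return intervals


def has_edge(intervals, u, p):
    """True if u is adjacent to the vertex at position p of the ordering."""
    iv = intervals[u]
    return iv is not None and iv[0] <= p <= iv[1]


def is_induced_matching(intervals, matching):
    """Check a list of (u, position) edges against the definition.

    Every pair of matching edges must use distinct endpoints, and no graph
    edge may connect an endpoint of one to an endpoint of the other.
    """
    if not all(has_edge(intervals, u, p) for u, p in matching):
        return False
    for i, (u1, p1) in enumerate(matching):
        for u2, p2 in matching[i + 1:]:
            if u1 == u2 or p1 == p2:
                return False
            if has_edge(intervals, u1, p2) or has_edge(intervals, u2, p1):
                return False
    return True


adj = {"a": ["v1", "v2"], "b": ["v2", "v3"], "c": ["v4"]}
order = ["v1", "v2", "v3", "v4"]
intervals = compact_representation(adj, order)
```

The pairwise check runs in quadratic time in the size of the matching and serves only to make the definition tangible.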
The joint 1st Workshop on Evaluations and Measurements in Self-Aware Computing Systems (EMSAC 2019) and Workshop on Self-Aware Computing (SeAC) was held as part of the FAS* conference alliance in conjunction with the 16th IEEE International Conference on Autonomic Computing (ICAC) and the 13th IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO) in Umeå, Sweden, on 20 June 2019. The goal of this one-day workshop was to bring together researchers and practitioners from academia and industry to share their solutions, ideas, visions, and doubts in self-aware computing systems in general and in the evaluation and measurement of such systems in particular. The workshop aimed to enable discussions, partnerships, and collaborations among the participants. This special issue follows the theme of the workshop. It contains extended versions of workshop presentations as well as additional contributions.
Even today, the automatic digitisation of scanned documents in general, and the automatic optical music recognition (OMR) of historical manuscripts in particular, remains an enormous challenge, since both handwritten musical symbols and text have to be identified. This paper focuses on the so-called medieval square notation developed in the 11th–12th century, which is already composed of staff lines, staves, clefs, accidentals, and neumes, which are, roughly speaking, connected single notes. The aim is to develop an algorithm that captures the neumes and, in particular, their melody, which can be used to reconstruct the original writing. Our pipeline is similar to the standard OMR approach and comprises a novel staff line and symbol detection algorithm based on deep Fully Convolutional Networks (FCN), which perform pixel-based predictions for either staff lines or symbols and their respective types. Then, the staff line detection combines the extracted lines into staves and yields an F\(_1\)-score of over 99% for detecting both lines and complete staves. For the music symbol detection, we choose a novel approach that skips the step of identifying neumes and instead directly predicts note components (NCs) and their respective affiliation to a neume. Furthermore, the algorithm detects clefs and accidentals. Our algorithm predicts the symbol sequence of a staff with a diplomatic symbol accuracy rate (dSAR) of about 87%, which includes symbol type and location. If only the NCs (without their respective connection to a neume), clefs, and accidentals are of interest, the algorithm reaches a harmonic symbol accuracy rate (hSAR) of approximately 90%. In general, the algorithm recognises a symbol in the manuscript with an F\(_1\)-score of over 96%.
Mobile 3D fluoroscopes have become increasingly available in neurosurgical operating rooms. We recently reported their use for imaging cerebral vascular malformations and aneurysms. This study was conducted to evaluate various radiation settings for the imaging of cerebral aneurysms before and after surgical occlusion. Eighteen patients with cerebral aneurysms and an indication for surgical clipping were included in this prospective analysis. Before surgery, the patients were randomized into one of three scan protocols according to the default settings of the 3D fluoroscope: group 1: 110 kV, 80 mA (enhanced cranial mode); group 2: 120 kV, 64 mA (lumbar spine mode); group 3: 120 kV, 25 mA (head/neck settings). Prior to surgery, a rotational fluoroscopy scan (duration 24 s) was performed without contrast agent, followed by another scan with 50 ml of intravenous iodine contrast agent. The image files of both scans were transferred to an Apple PowerMac® workstation, subtracted, and reconstructed using OsiriX® MD 10.0 software. The procedure was repeated after clip placement. The image quality regarding preoperative aneurysm configuration and postoperative assessment of aneurysm occlusion and vessel patency was analyzed by two independent reviewers using a 6-grade scale. This technique quickly supplies images of adequate quality to depict intracranial aneurysms and distal vessel patency after aneurysm clipping. Regarding these features, a further optimization of our previous protocol seems possible by lowering the voltage and increasing the tube current. For quick intraoperative assessment, image subtraction does not seem necessary. Thus, a native scan without a contrast agent is not necessary. Further optimization may be possible using a different contrast injection protocol.
This study provides a systematic literature review of research (2001–2020) in the field of teaching and learning a foreign language and intercultural learning using immersive technologies. Based on 2507 sources, 54 articles were selected according to predefined selection criteria. The review is aimed at providing information about which immersive interventions are being used for foreign language learning and teaching and where potential research gaps exist. The papers were analyzed and coded according to the following categories: (1) investigation form and education level, (2) degree of immersion and technology used, (3) predictors, and (4) criteria. The review identified key research findings relating to the use of immersive technologies for learning and teaching a foreign language and intercultural learning at cognitive, affective, and conative levels. The findings revealed research gaps in the area of teachers as a target group and virtual reality (VR) as a fully immersive intervention form. Furthermore, the studies reviewed rarely examined behavior and implicit measurements related to inter- and transcultural learning and teaching. Inter- and transcultural learning and teaching in particular is an underrepresented subject of investigation. Finally, concrete suggestions for future research are given. The systematic review contributes to the challenge of interdisciplinary cooperation between pedagogy, foreign language didactics, and Human-Computer Interaction to achieve innovative teaching-learning formats and a successful digital transformation.
In this doctoral thesis, we cover the performance evaluation of next-generation data plane architectures, comprising complex software as well as programmable hardware components that allow fine-grained configuration. In the scope of the thesis, we propose mechanisms to monitor the performance of individual components and model key performance indicators of software-based packet processing solutions. We present novel approaches towards network abstraction that allow the integration of heterogeneous data plane technologies into a single network while maintaining total transparency between control and data plane. Finally, we investigate a full, complex system consisting of multiple software-based solutions and perform a detailed performance analysis. We employ simulative approaches to investigate overload control mechanisms that allow efficient operation under adverse conditions. The contributions of this work build the foundation for future research in the areas of network softwarization and network function virtualization.
The charged aerosol detector (CAD) is the latest representative of aerosol-based detectors that generate a response independent of the analytes' chemical structure. This study was aimed at accurately predicting the CAD response of homologous fatty acids under varying experimental conditions. Fatty acids from C12 to C18 were used as model substances due to their semivolatile characteristics, which caused non-uniform CAD behaviour. Considering both experimental conditions and molecular descriptors, mixed quantitative structure-property relationship (QSPR) modeling was performed using Gradient Boosted Trees (GBT). The ensemble of 10 decision trees (learning rate set at 0.55, maximal depth set at 5, and sample rate set at 1.0) was able to explain approximately 99% (Q\(^2\): 0.987, RMSE: 0.051) of the observed variance in CAD responses. Validation using an external test compound confirmed the high predictive ability of the established model (R\(^2\): 0.990, RMSEP: 0.050). With respect to the intrinsic attribute selection strategy, GBT used almost all independent variables during model building. Finally, it attributed the highest importance to the power function value, the flow rate of the mobile phase, the evaporation temperature, the content of the organic solvent in the mobile phase, and molecular descriptors such as molecular weight (MW), Radial Distribution Function-080/weighted by mass (RDF080m), and average coefficient of the last eigenvector from distance/detour matrix (Ve2_D/Dt). The identification of the factors most relevant to the CAD responsiveness has contributed to a better understanding of the underlying mechanisms of signal generation. An increased CAD response obtained for acetone as organic modifier demonstrated its potential to replace the more expensive and environmentally harmful acetonitrile.
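The reported figures of merit can be illustrated with a short sketch of how a coefficient of determination and a root mean squared error are computed from observed and predicted responses. The data below are made up for illustration and are not from the study. Q\(^2\) is commonly obtained by applying the same formula to cross-validated predictions; whether the study used exactly this definition is an assumption.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not measurements from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
error = rmse(observed, predicted)
```

A value of the score close to 1 means that the residual sum of squares is small relative to the total variance of the observed responses, which is the sense in which the model "explains" the observed variance.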