@phdthesis{Reinermann2023, author = {Reinermann, Sophie}, title = {Earth Observation Time Series for Grassland Management Analyses - Development and large-scale Application of a Framework to detect Grassland Mowing Events in Germany}, doi = {10.25972/OPUS-32273}, url = {http://nbn-resolving.de/urn:nbn:de:bvb:20-opus-322737}, school = {Universit{\"a}t W{\"u}rzburg}, year = {2023}, abstract = {Grasslands shape many landscapes of the earth as they cover about one-third of its surface. They are home to and provide a livelihood for billions of people and are mainly used as a source of forage for animals. However, grasslands fulfill many additional ecosystem functions besides fodder production, such as storage of carbon, water filtration, provision of habitats and cultural values. They play a role in climate change (mitigation) and in preserving biodiversity and ecosystem functions on a global scale. The degree to which these ecosystem functions are present within grassland ecosystems is largely determined by the management. Individual management practices and the use intensity influence the species composition as well as functions, like carbon storage, while higher use intensities (e.g. high mowing frequencies) usually show a negative impact. Especially in Central European countries, like in Germany, the determining influence of grassland management on its physiognomy and ecosystem functions leads to a large variability and small-scale alternations of grassland parcels. Large-scale information on the management and use intensity of grasslands is not available. Consequently, estimations of grassland ecosystem functions are challenging which, however, would be required for large-scale assessments of the status of grassland ecosystems and optimized management plans for the future. This thesis tackles this gap by investigating the major grassland management practice in Germany, which is mowing, for multiple years, in high spatial resolution and on a national scale.
Earth Observation (EO) has the advantage of providing information on the earth's surface at multiple time steps. An extensive literature review on the use of EO for grassland management and production analyses, which was part of this thesis, showed that in particular research on grasslands consisting of small parcels with a large variety of management and use intensity, as is common in Central Europe, is underrepresented. Especially the launch of the Sentinel satellites in the recent past now enables the analysis of such grasslands due to their high spatial and temporal resolution. The literature review specifically on the investigation of grassland mowing events revealed that most previous studies focused on small study areas, were exploratory, only used one sensor type and/or lacked a reference data set with a complete range of management options. Within this thesis a novel framework to detect grassland mowing events over large areas is presented which was applied and validated for the entire area of Germany for multiple years (2018-2021). The potential of both sensor types, optical (Sentinel-2) and Synthetic Aperture Radar (SAR) (Sentinel-1) was investigated regarding grassland mowing event detection. Eight EO parameters were investigated, namely the Enhanced Vegetation Index (EVI), the backscatter intensity and the interferometric (InSAR) temporal coherence for both available polarization modes (VV and VH), and the polarimetric (PolSAR) decomposition parameters Entropy, K0 and K1. An extensive reference data set was generated based on daily images of webcams distributed in Germany which resulted in mowing information for grasslands with the entire possible range of mowing frequencies - from one to six in Germany - and in 1475 reference mowing events for the four years of interest.
For the first time an observation-driven mowing detection approach including data from Sentinel-2 and Sentinel-1 and combining the two was developed, applied and validated at large scale. Based on a subset of the reference data (13 grassland parcels with 44 mowing events) from 2019 the EO parameters were investigated and the detection algorithm developed and parameterized. This analysis showed that a threshold-based change detection approach based on EVI captured grassland mowing events best, which only failed during periods of clouds. All SAR-based parameters showed a less consistent behavior to mowing events, with PolSAR Entropy and InSAR Coherence VH, however, revealing the highest potential among them. A second, combined approach based on EVI and a SAR-based parameter was developed and tested for PolSAR Entropy and InSAR VH. To avoid additional false positive detections during periods in which mowing events are already reliably detected using optical data, the SAR-based mowing detection was only initiated during long gaps within the optical time series (< 25 days). Application and validation of these approaches in a focus region revealed that only using EVI leads to the highest accuracies (F1-Score = 0.65) as combining this approach with SAR-based detection led to a strong increase in falsely detected mowing events resulting in a decrease of accuracies (EVI + PolSAR ENT F1-Score = 0.61; EVI + InSAR COH F1-Score = 0.61). The mowing detection algorithm based on EVI was applied for the entire area of Germany for the years 2018-2021. It was revealed that the largest share of grasslands with high mowing frequencies (at least four mowing events) can be found in southern/south-eastern Germany. Extensively used grassland (mown up to two times) is distributed within the entire country with larger shares in the center and north-eastern parts of Germany. These patterns stay constant in general, but small fluctuations between the years are visible.
Early mown grasslands can be found in southern/south-eastern Germany - in line with high mowing frequency areas - but also in central-western parts. The years 2019 and 2020 revealed higher accuracies based on the 1475 mowing events of the multi-annual validation data set (F1-Scores of 0.64 and 0.63), 2018 and 2021 lower ones (F1-Score of 0.52 and 0.50). Based on this new, unprecedented data set, potential influencing factors on the mowing dynamics were investigated. Therefore, climate, topography, soil data and information on conservation schemes were related to mowing dynamics for the year 2020, which showed a high number of valid observations and detection accuracy. It was revealed that there are no strong linear relationships between the mowing frequency or the timing of the first mowing event and the investigated variables. However, it was found that for intensive grassland usage certain climatic and topographic conditions have to be fulfilled, while extensive grasslands appear on the entire spectrum of these variables. Further, higher mowing frequencies occur on soils with influence of ground water and lower mowing frequencies in protected areas. These results show the complex interplay between grassland mowing dynamics and external influences and highlight the challenges of policies aiming to protect grassland ecosystem functions and their need to be adapted to regional circumstances.}, subject = {Gr{\"u}nland}, language = {en} }

@phdthesis{Zuefle2022, author = {Z{\"u}fle, Marwin Otto}, title = {Proactive Critical Event Prediction based on Monitoring Data with Focus on Technical Systems}, doi = {10.25972/OPUS-25575}, url = {http://nbn-resolving.de/urn:nbn:de:bvb:20-opus-255757}, school = {Universit{\"a}t W{\"u}rzburg}, year = {2022}, abstract = {The importance of proactive and timely prediction of critical events is steadily increasing, whether in the manufacturing industry or in private life.
In the past, machines in the manufacturing industry were often maintained based on a regular schedule or threshold violations, which is no longer competitive as it causes unnecessary costs and downtime. In contrast, the predictions of critical events in everyday life are often much more concealed and hardly noticeable to the private individual, unless the critical event occurs. For instance, our electricity provider has to ensure that we, as end users, are always supplied with sufficient electricity, or our favorite streaming service has to guarantee that we can watch our favorite series without interruptions. For this purpose, they have to constantly analyze what the current situation is, how it will develop in the near future, and how they have to react in order to cope with future conditions without causing power outages or video stalling. In order to analyze the performance of a system, monitoring mechanisms are often integrated to observe characteristics that describe the workload and the state of the system and its environment. Reactive systems typically employ thresholds, utility functions, or models to determine the current state of the system. However, such reactive systems cannot proactively estimate future events, but can only detect them as they occur. In the case of critical events, reactive determination of the current system state is futile, whereas a proactive system could have predicted this event in advance and enabled timely countermeasures. To achieve proactivity, the system requires estimates of future system states. Given the gap between design time and runtime, it is typically not possible to use expert knowledge to a priori model all situations a system might encounter at runtime. Therefore, prediction methods must be integrated into the system.
Depending on the available monitoring data and the complexity of the prediction task, either time series forecasting in combination with thresholding or more sophisticated machine and deep learning models have to be trained. Although numerous forecasting methods have been proposed in the literature, these methods have their advantages and disadvantages depending on the characteristics of the time series under consideration. Therefore, expert knowledge is required to decide which forecasting method to choose. However, since the time series observed at runtime cannot be known at design time, such expert knowledge cannot be implemented in the system. In addition to selecting an appropriate forecasting method, several time series preprocessing steps are required to achieve satisfactory forecasting accuracy. In the literature, this preprocessing is often done manually, which is not practical for autonomous computing systems, such as Self-Aware Computing Systems. Several approaches have also been presented in the literature for predicting critical events based on multivariate monitoring data using machine and deep learning. However, these approaches are typically highly domain-specific, such as financial failures, bearing failures, or product failures. Therefore, they require in-depth expert knowledge. For this reason, these approaches cannot be fully automated and are not transferable to other use cases. Thus, the literature lacks generalizable end-to-end workflows for modeling, detecting, and predicting failures that require only little expert knowledge. To overcome these shortcomings, this thesis presents a system model for meta-self-aware prediction of critical events based on the LRA-M loop of Self-Aware Computing Systems. Building upon this system model, this thesis provides six further contributions to critical event prediction. 
While the first two contributions address critical event prediction based on univariate data via time series forecasting, the three subsequent contributions address critical event prediction for multivariate monitoring data using machine and deep learning algorithms. Finally, the last contribution addresses the update procedure of the system model. Specifically, the seven main contributions of this thesis can be summarized as follows: First, we present a system model for meta self-aware prediction of critical events. To handle both univariate and multivariate monitoring data, it offers univariate time series forecasting for use cases where a single observed variable is representative of the state of the system, and machine learning algorithms combined with various preprocessing techniques for use cases where a large number of variables are observed to characterize the system's state. However, the two different modeling alternatives are not disjoint, as univariate time series forecasts can also be included to estimate future monitoring data as additional input to the machine learning models. Finally, a feedback loop is incorporated to monitor the achieved prediction quality and trigger model updates. We propose a novel hybrid time series forecasting method for univariate, seasonal time series, called Telescope. To this end, Telescope automatically preprocesses the time series, performs a kind of divide-and-conquer technique to split the time series into multiple components, and derives additional categorical information. It then forecasts the components and categorical information separately using a specific state-of-the-art method for each component. Finally, Telescope recombines the individual predictions. As Telescope performs both preprocessing and forecasting automatically, it represents a complete end-to-end approach to univariate seasonal time series forecasting. 
Experimental results show that Telescope achieves enhanced forecast accuracy, more reliable forecasts, and a substantial speedup. Furthermore, we apply Telescope to the scenario of predicting critical events for virtual machine auto-scaling. Here, results show that Telescope considerably reduces the average response time and significantly reduces the number of service level objective violations. For the automatic selection of a suitable forecasting method, we introduce two frameworks for recommending forecasting methods. The first framework extracts various time series characteristics to learn the relationship between them and forecast accuracy. In contrast, the other framework divides the historical observations into internal training and validation parts to estimate the most appropriate forecasting method. Moreover, this framework also includes time series preprocessing steps. Comparisons between the proposed forecasting method recommendation frameworks and the individual state-of-the-art forecasting methods and the state-of-the-art forecasting method recommendation approach show that the proposed frameworks considerably improve the forecast accuracy. With regard to multivariate monitoring data, we first present an end-to-end workflow to detect critical events in technical systems in the form of anomalous machine states. The end-to-end design includes raw data processing, phase segmentation, data resampling, feature extraction, and machine tool anomaly detection. In addition, the workflow does not rely on profound domain knowledge or specific monitoring variables, but merely assumes standard machine monitoring data. We evaluate the end-to-end workflow using data from a real CNC machine. The results indicate that conventional frequency analysis does not detect the critical machine conditions well, while our workflow detects the critical events very well with an F1-score of almost 91\%. 
To predict critical events rather than merely detecting them, we compare different modeling alternatives for critical event prediction in the use case of time-to-failure prediction of hard disk drives. Given that failure records are typically significantly less frequent than instances representing the normal state, we employ different oversampling strategies. Next, we compare the prediction quality of binary class modeling with downscaled multi-class modeling. Furthermore, we integrate univariate time series forecasting into the feature generation process to estimate future monitoring data. Finally, we model the time-to-failure using not only classification models but also regression models. The results suggest that multi-class modeling provides the overall best prediction quality with respect to practical requirements. In addition, we prove that forecasting the features of the prediction model significantly improves the critical event prediction quality. We propose an end-to-end workflow for predicting critical events of industrial machines. Again, this approach does not rely on expert knowledge except for the definition of monitoring data, and therefore represents a generalizable workflow for predicting critical events of industrial machines. The workflow includes feature extraction, feature handling, target class mapping, and model learning with integrated hyperparameter tuning via a grid-search technique. Drawing on the result of the previous contribution, the workflow models the time-to-failure prediction in terms of multiple classes, where we compare different labeling strategies for multi-class classification. The evaluation using real-world production data of an industrial press demonstrates that the workflow is capable of predicting six different time-to-failure windows with a macro F1-score of 90\%. When scaling the time-to-failure classes down to a binary prediction of critical events, the F1-score increases to above 98\%. 
Finally, we present four update triggers to assess when critical event prediction models should be re-trained during on-line application. Such re-training is required, for instance, due to concept drift. The update triggers introduced in this thesis take into account the elapsed time since the last update, the prediction quality achieved on the current test data, and the prediction quality achieved on the preceding test data. We compare the different update strategies with each other and with the static baseline model. The results demonstrate the necessity of model updates during on-line application and suggest that the update triggers that consider both the prediction quality of the current and preceding test data achieve the best trade-off between prediction quality and number of updates required. We are convinced that the contributions of this thesis constitute significant impulses for the academic research community as well as for practitioners. First of all, to the best of our knowledge, we are the first to propose a fully automated, end-to-end, hybrid, component-based forecasting method for seasonal time series that also includes time series preprocessing. Due to the combination of reliably high forecast accuracy and reliably low time-to-result, it offers many new opportunities in applications requiring accurate forecasts within a fixed time period in order to take timely countermeasures. In addition, the promising results of the forecasting method recommendation systems provide new opportunities to enhance forecasting performance for all types of time series, not just seasonal ones. Furthermore, we are the first to expose the deficiencies of the prior state-of-the-art forecasting method recommendation system. Concerning the contributions to critical event prediction based on multivariate monitoring data, we have already collaborated closely with industrial partners, which supports the practical relevance of the contributions of this thesis. 
The automated end-to-end design of the proposed workflows that do not demand profound domain or expert knowledge represents a milestone in bridging the gap between academic theory and industrial application. Finally, the workflow for predicting critical events in industrial machines is currently being operationalized in a real production system, underscoring the practical impact of this thesis.}, subject = {Prognose}, language = {en} }

@phdthesis{Uereyen2022, author = {{\"U}reyen, Soner}, title = {Multivariate Time Series for the Analysis of Land Surface Dynamics - Evaluating Trends and Drivers of Land Surface Variables for the Indo-Gangetic River Basins}, doi = {10.25972/OPUS-29194}, url = {http://nbn-resolving.de/urn:nbn:de:bvb:20-opus-291941}, school = {Universit{\"a}t W{\"u}rzburg}, year = {2022}, abstract = {The investigation of the Earth system and interplays between its components is of utmost importance to enhance the understanding of the impacts of global climate change on the Earth's land surface. In this context, Earth observation (EO) provides valuable long-term records covering an abundance of land surface variables and, thus, allowing for large-scale analyses to quantify and analyze land surface dynamics across various Earth system components. In view of this, the geographical entity of river basins was identified as particularly suitable for multivariate time series analyses of the land surface, as they naturally cover diverse spheres of the Earth. Many remote sensing missions with different characteristics are available to monitor and characterize the land surface. Yet, only a few spaceborne remote sensing missions enable the generation of spatio-temporally consistent time series with equidistant observations over large areas, such as the MODIS instrument.
In order to summarize available remote sensing-based analyses of land surface dynamics in large river basins, a detailed literature review of 287 studies was performed and several research gaps were identified. In this regard, it was found that studies rarely analyzed an entire river basin, but rather focused on study areas at subbasin or regional scale. In addition, it was found that transboundary river basins remained understudied and that studies largely focused on selected riparian countries. Moreover, the analysis of environmental change was generally conducted using a single EO-based land surface variable, whereas a joint exploration of multivariate land surface variables across spheres was found to be rarely performed. To address these research gaps, a methodological framework enabling (1) the preprocessing and harmonization of multi-source time series as well as (2) the statistical analysis of a multivariate feature space was required. For development and testing of a methodological framework that is transferable in space and time, the transboundary river basins Indus, Ganges, Brahmaputra, and Meghna (IGBM) in South Asia were selected as study area, having a size equivalent to around eight times the size of Germany. These basins largely depend on water resources from monsoon rainfall and High Mountain Asia which holds the largest ice mass outside the polar regions. In total, over 1.1 billion people live in this region and in parts largely depend on these water resources which are indispensable for the world's largest connected irrigated croplands and further domestic needs as well. With highly heterogeneous geographical settings, these river basins allow for a detailed analysis of the interplays between multiple spheres, including the anthroposphere, biosphere, cryosphere, hydrosphere, lithosphere, and atmosphere. 
In this thesis, land surface dynamics over the last two decades (December 2002 - November 2020) were analyzed using EO time series on vegetation condition, surface water area, and snow cover area being based on MODIS imagery, the DLR Global WaterPack and JRC Global Surface Water Layer, as well as the DLR Global SnowPack, respectively. These data were evaluated in combination with further climatic, hydrological, and anthropogenic variables to estimate their influence on the three EO land surface variables. The preprocessing and harmonization of the time series was conducted using the implemented framework. The resulting harmonized feature space was used to quantify and analyze land surface dynamics by means of several statistical time series analysis techniques which were integrated into the framework. In detail, these methods involved (1) the calculation of trends using the Mann-Kendall test in association with the Theil-Sen slope estimator, (2) the estimation of changes in phenological metrics using the Timesat tool, (3) the evaluation of driving variables using the causal discovery approach Peter and Clark Momentary Conditional Independence (PCMCI), and (4) additional correlation tests to analyze the human influence on vegetation condition and surface water area. These analyses were performed at annual and seasonal temporal scale and for diverse spatial units, including grids, river basins and subbasins, land cover and land use classes, as well as elevation-dependent zones. The trend analyses of vegetation condition mostly revealed significant positive trends. Irrigated and rainfed croplands were found to contribute most to these trends. The trend magnitudes were particularly high in arid and semi-arid regions. Considering surface water area, significant positive trends were obtained at annual scale. At grid scale, regional and seasonal clusters with significant negative trends were found as well. 
Trends for snow cover area mostly remained stable at annual scale, but significant negative trends were observed in parts of the river basins during distinct seasons. Negative trends were also found for the elevation-dependent zones, particularly at high altitudes. Also, retreats in the seasonal duration of snow cover area were found in parts of the river basins. Furthermore, for the first time, the application of the causal discovery algorithm on a multivariate feature space at seasonal temporal scale revealed direct and indirect links between EO land surface variables and respective drivers. In general, vegetation was constrained by water availability, surface water area was largely influenced by river discharge and indirectly by precipitation, and snow cover area was largely controlled by precipitation and temperature with spatial and temporal variations. Additional analyses pointed towards positive human influences on increasing trends in vegetation greenness. The investigation of trends and interplays across spheres provided new and valuable insights into the past state and the evolution of the land surface as well as on relevant climatic and hydrological driving variables. Besides the investigated river basins in South Asia, these findings are of great value also for other river basins and geographical regions.}, subject = {Multivariate Analyse}, language = {en} }

@phdthesis{Colditz2007, author = {Colditz, Rene Roland}, title = {Time Series Generation and Classification of MODIS Data for Land Cover Mapping}, url = {http://nbn-resolving.de/urn:nbn:de:bvb:20-opus-25908}, school = {Universit{\"a}t W{\"u}rzburg}, year = {2007}, abstract = {Processes of the Earth's surface occur at different scales of time and intensity. Climate in particular determines the activity and seasonal development of vegetation. These dynamics are predominantly driven by temperature in the humid mid-latitudes and by the availability of water in semi-arid regions.
Human activities are a modifying parameter for many ecosystems and can become the prime force in well-developed regions with an intensively managed environment. Accounting for these dynamics, i.e. seasonal dynamics of ecosystems and short- to long-term changes in land-cover composition, requires multiple measurements in time. With respect to the characterization of the Earth surface and its transformation due to global warming and human-induced global change, there is a need for appropriate data and methods to determine the activity of vegetation and the change of land cover. Space-borne remote sensing is capable of monitoring the activity and development of vegetation as well as changes of the land surface. In many instances, satellite images are the only means to comprehensively assess the surface characteristics of large areas. A high temporal frequency of image acquisition, forming a time series of satellite data, can be employed for mapping the development of vegetation in space and time. Time series allow for detecting and assessing changes and multi-year transformation processes of high and low intensity, or even abrupt events such as fire and flooding. The operational processing of satellite data and automated information-extraction techniques are the basis for consistent and continuous long-term product generation. This provides the potential for directly using remote-sensing data and products for analyzing the land surface in relation to global warming and global change, including deforestation and land transformation. This study aims at the development of an advanced approach to time-series generation using data-quality indicators. A second goal focuses on the application of time series for automated land-cover classification and update, using fractional cover estimates to accommodate for the comparatively coarse spatial resolution. 
Requirements of this study are the robustness and high accuracy of the approaches as well as the full transferability to other regions and datasets. In this respect, the developments of this study form a methodological framework, which can be filled with appropriate modules for a specific sensor and application. In order to attain the first goal, time-series compilation, a stand-alone software application called TiSeG (Time Series Generator) has been developed. TiSeG evaluates the pixel-level quality indicators provided with each MODIS land product. It computes two important data-availability indicators, the number of invalid pixels and the maximum gap length. Both indices are visualized in time and space, indicating the feasibility of temporal interpolation. The level of desired data quality can be modified spatially and temporally to account for distinct environments in a larger study area and for seasonal differences. Pixels regarded as invalid are either masked or interpolated with spatial or temporal techniques.}, subject = {Zeitreihe}, language = {en} }