TY - JOUR A1 - Dhillon, Maninder Singh A1 - Dahms, Thorsten A1 - Kuebert-Flock, Carina A1 - Rummler, Thomas A1 - Arnault, Joel A1 - Steffan-Dewenter, Ingolf A1 - Ullmann, Tobias T1 - Integrating random forest and crop modeling improves the crop yield prediction of winter wheat and oil seed rape JF - Frontiers in Remote Sensing N2 - The fast and accurate yield estimates with the increasing availability and variety of global satellite products and the rapid development of new algorithms remain a goal for precision agriculture and food security. However, the consistency and reliability of suitable methodologies that provide accurate crop yield outcomes still need to be explored. The study investigates the coupling of crop modeling and machine learning (ML) to improve the yield prediction of winter wheat (WW) and oil seed rape (OSR) and provides examples for the Free State of Bavaria (70,550 km2), Germany, in 2019. The main objectives are to find whether a coupling approach [Light Use Efficiency (LUE) + Random Forest (RF)] would result in better and more accurate yield predictions compared to results provided with other models not using the LUE. Four different RF models [RF1 (input: Normalized Difference Vegetation Index (NDVI)), RF2 (input: climate variables), RF3 (input: NDVI + climate variables), RF4 (input: LUE generated biomass + climate variables)], and one semi-empiric LUE model were designed with different input requirements to find the best predictors of crop monitoring. The results indicate that the individual use of the NDVI (in RF1) and the climate variables (in RF2) could not be the most accurate, reliable, and precise solution for crop monitoring; however, their combined use (in RF3) resulted in higher accuracies. Notably, the study suggested the coupling of the LUE model variables to the RF4 model can reduce the relative root mean square error (RRMSE) from −8% (WW) and −1.6% (OSR) and increase the R 2 by 14.3% (for both WW and OSR), compared to results just relying on LUE. Moreover, the research compares models yield outputs by inputting three different spatial inputs: Sentinel-2(S)-MOD13Q1 (10 m), Landsat (L)-MOD13Q1 (30 m), and MOD13Q1 (MODIS) (250 m). The S-MOD13Q1 data has relatively improved the performance of models with higher mean R 2 [0.80 (WW), 0.69 (OSR)], and lower RRMSE (%) (9.18, 10.21) compared to L-MOD13Q1 (30 m) and MOD13Q1 (250 m). Satellite-based crop biomass, solar radiation, and temperature are found to be the most influential variables in the yield prediction of both crops. KW - crop modeling KW - random forest KW - machine learning KW - NDVI KW - satellite KW - landsat KW - sentinel-2 KW - winter wheat Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-301462 SN - 2673-6187 VL - 3 ER -