WP1. Big Data framework: data management and processing.

To find the most appropriate architecture for the technical solution, many database technologies were analysed and compared. Different storage paradigms were also considered, together with the technologies needed to manage the data and to run the operations defined by the project's processing algorithms.

The final solution is based on a distributed file system designed to run on commodity hardware. Another important component is a unified analytics engine for large-scale data processing. The tool combination is completed with a geographic data processing engine for high-performance applications.

In addition, a geo-raster API has been developed to access the data uploaded to the storage unit. The stored data can then be downloaded in two different formats.

WP2. Data-mining and data-driven modelling.

The picture below shows the general methodology followed in WP2.

Final outcomes for WP2 are summarized as follows:

Hotspot detection using Earth Observation datasets and ground station data

        • The hotspot indicator uses moving windows of various lengths to analyse the resilience of vegetation to climate extremes in the short and long term.
        • The Standardized Precipitation Evapotranspiration Index (SPEI) and the Standardized Precipitation Index (SPI) are used for the spatial analysis of droughts over time, by identifying the percentage of area affected by extreme droughts.
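The percentage-of-area calculation described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name, the array layout (time, rows, cols), the use of NaN for cells outside the study area and the threshold of -2.0 (a commonly used cut-off for "extremely dry" SPI values) are all assumptions. It also assumes that every grid cell covers the same area, which does not hold exactly on a latitude/longitude grid.

```python
import numpy as np

def percent_area_in_extreme_drought(index_grid, threshold=-2.0):
    """Percentage of valid cells at or below `threshold`, per time step.

    index_grid: array of shape (time, rows, cols) with SPI or SPEI values;
    NaN marks cells outside the study area (assumed convention).
    """
    index_grid = np.asarray(index_grid, dtype=float)
    valid = ~np.isnan(index_grid)
    # Replace NaN by 0 before comparing so that invalid cells never count.
    extreme = valid & (np.nan_to_num(index_grid, nan=0.0) <= threshold)
    n_valid = valid.sum(axis=(1, 2))
    n_extreme = extreme.sum(axis=(1, 2))
    # Time steps without any valid cell yield NaN instead of dividing by zero.
    share = 100.0 * n_extreme / np.maximum(n_valid, 1)
    return np.where(n_valid > 0, share, np.nan)

# Tiny illustrative grid: two time steps on a 2 x 2 domain.
grid = np.array([
    [[-2.5, -1.0], [np.nan, 0.3]],
    [[-2.0, -2.1], [-3.0, 1.0]],
])
drought_share = percent_area_in_extreme_drought(grid)
```

For a grid in geographic coordinates, each cell could be weighted by the cosine of its latitude before summing.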

Vulnerability and Resilience from Earth Observation datasets

        • Indicators for Vulnerability and Resilience are generated from the short-term responses of NDVI to anomalies in temperature and moisture.
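One possible reading of "short-term responses of NDVI to anomalies" is a regression of NDVI anomalies on temperature and moisture anomalies. The sketch below illustrates that reading only. The function names, the monthly time step, the standardisation per calendar month and the use of ordinary least squares are assumptions, not the indicator definition used in the project.

```python
import numpy as np

def monthly_anomalies(series, period=12):
    """Standardised anomalies relative to each calendar month's own mean."""
    series = np.asarray(series, dtype=float)
    out = np.empty_like(series)
    for m in range(period):
        vals = series[m::period]
        std = vals.std()
        # A month with no variability gets zero anomaly.
        out[m::period] = (vals - vals.mean()) / std if std > 0 else 0.0
    return out

def ndvi_sensitivity(ndvi, temperature, moisture):
    """Least-squares coefficients of NDVI anomalies on climate anomalies.

    Returns (temperature_coefficient, moisture_coefficient); larger absolute
    values mean a stronger short-term response of vegetation.
    """
    X = np.column_stack([
        monthly_anomalies(temperature),
        monthly_anomalies(moisture),
    ])
    y = monthly_anomalies(ndvi)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs
```

Applied per pixel, the two coefficients give a map of how strongly vegetation reacts to each driver, which could then be classified into vulnerability or resilience classes.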

Data-driven machine learning for crop yield prediction and forecasting

        • Satellite data such as evapotranspiration and vegetation indices (normalized difference vegetation index, leaf area index), together with climate indicators such as rainfall, growing degree days, and maximum, minimum and mean temperature, were used to build machine learning models for predicting winter wheat yield in Southern Spain. Satellite crop status data were found to enhance the predictive performance of the crop yield models, especially during extreme climatic conditions.

Vegetation dynamics using NDVI

        • Data-driven modelling techniques were used to analyse and model NDVI time series within a non-irrigated crop region in the province of Soria, chosen as representative of dryland cropping systems in Spain.
        • The most significant seasonality pattern was observed at the annual horizon, indicating that there is cultivation every year. Two and three intra-annual cycles were also identified. The proposed hypothesis is that this pattern could be due to a mix of crops or management practices.
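A simple way to look for annual and intra-annual cycles such as those described above is a periodogram. The sketch below uses a plain FFT as a stand-in technique; the text does not say which method the project used. The function name and the `samples_per_year` parameter are hypothetical, and the series is assumed to be regularly sampled and free of gaps.

```python
import numpy as np

def dominant_cycles_per_year(ndvi, samples_per_year, n_peaks=3):
    """Return the `n_peaks` strongest frequencies as (cycles_per_year, power)."""
    ndvi = np.asarray(ndvi, dtype=float)
    centred = ndvi - ndvi.mean()  # remove the mean so it does not dominate
    power = np.abs(np.fft.rfft(centred)) ** 2
    # With d in years, rfftfreq returns frequencies in cycles per year.
    freqs = np.fft.rfftfreq(ndvi.size, d=1.0 / samples_per_year)
    # Rank all bins except the zero-frequency one, strongest first.
    order = np.argsort(power[1:])[::-1][:n_peaks] + 1
    return [(float(freqs[i]), float(power[i])) for i in order]

# Synthetic example: ten years, 24 composites per year, with one, two and
# three cycles per year of decreasing amplitude.
t = np.arange(240) / 24.0
synthetic = (0.5 + 0.3 * np.sin(2 * np.pi * t)
             + 0.1 * np.sin(4 * np.pi * t)
             + 0.05 * np.sin(6 * np.pi * t))
peaks = dominant_cycles_per_year(synthetic, samples_per_year=24)
```

A peak at one cycle per year corresponds to the annual pattern; peaks at two or three cycles per year correspond to intra-annual cycles.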

WP3. Historical Process-based models using EO data.

In WP3, the modelling schemes for evapotranspiration (ET) and Gross Primary Productivity (GPP) were improved using MODIS satellite products.

The image below shows the conceptual scheme of the modelling approach for actual evapotranspiration and gross primary productivity. It shows how both are downregulated from their potential values by the same environmental stresses, and how they are linked through the vegetation component, since transpiration and CO2 assimilation are both regulated by leaf stomata.
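The idea of downregulating a potential value by shared environmental stresses is often written as a product of constraint factors between 0 and 1. The sketch below shows that general form only. The function name, the example stress factors and the multiplicative combination are assumptions, not the project's actual model equations.

```python
import numpy as np

def downregulate(potential, stress_factors):
    """Scale a potential flux by the product of stress factors in [0, 1].

    A factor of 1 means no stress; a factor of 0 shuts the flux down.
    """
    potential = np.asarray(potential, dtype=float)
    combined = np.ones_like(potential)
    for factor in stress_factors:
        combined = combined * np.clip(factor, 0.0, 1.0)
    return potential * combined

# Hypothetical stress factors shared by both fluxes (for example a
# temperature constraint and a moisture constraint), for two grid cells.
shared_stresses = [np.array([0.8, 1.0]), np.array([0.5, 0.25])]

potential_et = np.array([5.0, 4.0])    # illustrative potential ET values
potential_gpp = np.array([10.0, 8.0])  # illustrative potential GPP values

actual_et = downregulate(potential_et, shared_stresses)
actual_gpp = downregulate(potential_gpp, shared_stresses)
```

Because both fluxes use the same factors, the ratio of actual to potential is identical for ET and GPP in each cell, which mirrors the stomatal coupling described above.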

The improved models were then used to produce gridded output variables of daily ET estimates, Net Radiation and Evaporative Fraction (EF) for the study region (Iberian Peninsula) for the 2001-2017 period. In addition, NPP outputs for the long-term dataset (1982-2015) were obtained. In contrast with the time resolution of other models, daily outputs were produced, because they are crucial for understanding drought responses; this considerably increases the dataset size.

The images below show monthly series of Actual Evapotranspiration and Evaporative drought index.

WP4. Integration and analysis of results.

WP4 can be considered the meeting point of all the efforts and achievements of the previous work packages.
First, the capabilities of the Big Data platform were successfully checked, and then some project indicators were implemented. These indicators take spatial data (images) as inputs for their algorithms. From a technological point of view, the framework treats image handling as an ETL (Extract, Transform, Load) process.
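The three ETL stages can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the nodata value, the scale factor, the function names and the use of plain dictionaries in place of the project's storage and processing engines, which the text does not name.

```python
import numpy as np

NODATA = -9999.0  # hypothetical nodata marker in the raw images

def extract(source):
    """Extract: read raw image arrays from a source (here, a plain dict)."""
    for name, raw in source.items():
        yield name, np.asarray(raw, dtype=float)

def transform(image, scale=0.0001):
    """Transform: mask nodata cells and apply an assumed scale factor."""
    masked = np.where(image == NODATA, np.nan, image)
    return masked * scale

def load(store, name, image):
    """Load: write the processed image to the target store (here, a dict)."""
    store[name] = image

def run_etl(source, store):
    """Run the three stages for every image in the source."""
    for name, raw in extract(source):
        load(store, name, transform(raw))
    return store

raw_images = {"tile_a": [[5000, -9999], [2500, 0]]}
processed = run_etl(raw_images, {})
```

In a distributed setting the same three stages would typically read from and write to the distributed file system, with the transform step running in parallel across tiles.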

Decreases in ecohydrological function (possibly associated with losses of resilience) were also quantified using ecosystem function variables such as evapotranspiration and land surface water deficit indicators. We detected and interpreted how changes in the demand and supply for evapotranspiration have affected the ecohydrological function across all the bioclimatic regions in Spain over the past 15 years. In addition, some key indicators were generated for the non-irrigated wheat-growing areas in Cordoba (Spain) during the growing season.

The conclusions show a general increase in atmospheric water demand, with greening in several regions. Even though the evapotranspiration ratio might increase slightly, this is not enough to compensate for such increases in evaporative demand. This indicates that a very large part of Mediterranean ecosystems is under stronger water stress and that watersheds are experiencing reductions in water yield (potential runoff), except those in the North, which has a more temperate climate.

Milestones update

November 2020

In WP5, progress was made on milestone M5.2 (Published at least 5 international papers and attended 5 conferences), although only 2 conferences were attended instead of the minimum of 5. DTU and UPM are leading the submission of scientific articles on the integration of results, and new dissemination activities are planned after the project execution.

Also in WP5, M5.3 (Secured at least 1 additional valorisation opportunity) was reached, as DTU secured a valorisation opportunity, in the form of new spin-off projects for this institution, in which all partners can participate. It relates to the modelling of droughts and the developed datasets, which have raised the interest of IF (Chinawatersense) and DANIDA, resulting in participation in new innovation projects.

May 2019

In WP4, milestone M4.1 (Implementation of WP2 and WP3 techniques in the Big Data framework) was tackled by comprehensively testing the capabilities of the deployed framework through the implementation of some preliminary calculations. Afterwards, the algorithms composing one of the ecohydrological indicators of the project were fully implemented in the platform.

Also in WP4, milestone M4.2 (Assembled data and configured framework for scenario simulations) was reached, as a framework built on top of the technologies selected in WP1 was properly deployed. For this purpose, a few open source tools were used to define a set of automation workflows for deploying Big Data software processes. A substantial hardware infrastructure was required to deploy the whole package of technological tools selected in the project.

November 2018

In WP2, milestone M2.3 (Preliminary set of data-driven modelling techniques implemented) was reached by applying different prediction algorithms to the historical wheat yields from the Cordoba province and evaluating them. The most appropriate ones were then selected according to interpretability, efficiency and generalizability criteria.

May 2018

In WP1, milestone M1.2 (Indicators defined) was partially completed through the definition and description of a list of indicators. The list is not closed, as the implementation of these indicators in the framework and the results obtained could later give rise to more complex and advanced indicators. Also in this WP, milestone M1.3 (Architecture deployed) was reached, despite the unexpected technical difficulties found during the architecture deployment.

In WP3, milestone M3.1 (Improved protocol to estimate ET and WUE with EO data) and milestone M3.2 (Geospatial database with input and output EO variables) were both achieved. DTU improved the modelling scheme for evapotranspiration (ET) and Gross Primary Productivity (GPP) using MODIS satellite vegetation indices and surface temperature, which resulted in an improved protocol to estimate evapotranspiration with EO data. Prior to that, a geospatial database was created for gridded input variables, which took most of the working time during this period.

January 2018

In WP2, milestone M2.2 (Semi-finalized set of data mining techniques developed) was achieved. To meet the FORWARD objectives, anomaly detection algorithms for time series data, clustering techniques for grouping locations with similar anomalies, and regression techniques for capturing relationships between ecohydrological variables and climate data were tested for the Iberian Peninsula.

October 2017

In WP1, milestone M1.1 (Data sources defined and integrated) was successfully completed with the contribution of all the members and laborious research into data sources and structures.

August 2017

In WP5, milestone M5.1 (Set-up website) was achieved with the development of the present website.

July 2017

In WP2, milestone M2.1 (All tools/techniques/data inventorized/collected) was achieved. To define the data mining and modelling context, a detailed literature review of data mining techniques was carried out. The core data mining techniques discussed include clustering, association, classification, anomaly detection and regression.

Work plan

The FORWARD project is divided into five work packages:

WP1. Big Data framework: data management and processing.

This WP will focus on data gathering, pre-processing and integration. Novel integration approaches based on Big Data technologies and not used by the traditional scientific community will be created according to the volume, velocity and variability of the sources of information.

WP2. Data-mining and data-driven modelling.

Data mining and data-driven modelling techniques will be explored to quantify variability, extremes and resilience in space and time.

WP3. Historical Process-based models using EO data.

This WP covers historical process-based modelling using EO time series. Satellite and other spatial climatic databases compiled in WP1 will be used as input to a process-based model to estimate evapotranspiration and water use efficiency (WUE).

WP4. Integration and analysis of results.

A final integrated analysis will be performed, studying and validating the sensitivities across regions, regimes and vegetation types. After completing all WPs, an integrated framework will emerge indicating how extreme and variable climate conditions affect the eco-hydrological behaviour of ecosystems, highlighting spatial differences across regions and response changes due to local factors.

WP5. Communication and dissemination.

This WP aims to ensure that the objectives and approach of FORWARD address the stakeholders’ needs, to maximize dissemination of the outcomes of FORWARD, and to create the potential for the FORWARD tools to become an industry standard in resilience management and Big Data.


“The authors would like to thank the EU and the Centre for the Development of Industrial Technology (CDTI), Innovation Fund Denmark (IFD) and Flanders Innovation & Entrepreneurship (VLAIO) for funding, in the frame of the collaborative international consortium FORWARD financed under the ERA-NET Cofund WaterWorks2015 Call. This ERA-NET is an integral part of the 2016 Joint Activities developed by the Water Challenges for a Changing World Joint Programme Initiative (Water JPI).”