Saturday, October 28, 2023

CARITRO Project: Snow droughts and green water: how climate change modifies the hydrological cycle in the Alpine Region.

Due to the impact of climate change, the Alpine region is experiencing a dual effect: a decrease in snowfall leading to snow droughts, and an increase in water losses through evapotranspiration, also known as green water. These changes have significant implications for the sustainable management of water resources and the preservation of ecosystems. This project, funded by the CARITRO foundation, aims to address these challenges by developing innovative models to accurately quantify snow melt and evapotranspiration losses. The ultimate goal is to provide practitioners with user-friendly calculation tools that are more advanced than traditional lumped models but less complex than intricate "process-based" 3D models. Initially proposed by Niccolò Tubini, the project has been taken up by John Mohd Wani with minimal modifications.  


The complete project plan can be found here

Friday, October 27, 2023

Open Science by Design

In the framework of the meeting "Community over Commercialization: Open Science, Intellectual Property and Data", I was graciously invited by professor Roberto Caso to talk about my experience with developing open-source models and promoting open science. I tried to raise various topics: the transmission of science in a university environment, why open-source coding, why open science, and which methodologies can be used.


The presentation can be found @ https://osf.io/798vu and, if a video recording becomes available, I will share it.


Friday, October 20, 2023

Identifying Snowfall Elevation Patterns by Assimilating Satellite-Based Snow Depth Retrievals

Precipitation in mountain regions is highly variable and poorly measured, posing important challenges to water resource management. Traditional methods to estimate precipitation include in-situ gauges, Doppler weather radars, satellite radars and radiometers, numerical modeling, and reanalysis products. None of these methods adequately captures complex orographic precipitation. Here, we propose a novel approach to characterize orographic snowfall over mountain regions. We use a particle batch smoother to leverage satellite information from Sentinel-1-derived snow depth retrievals and to correct various gridded precipitation products. This novel approach is tested using a simple snow model for an alpine basin located in Trentino Alto Adige, Italy. We quantify the precipitation biases across the basin and find that the assimilation method (i) corrects for snowfall biases and uncertainties, (ii) leads to cumulative snowfall elevation patterns that are consistent across precipitation products, and (iii) results in overall improved basin-wide snow variables (snow depth and snow cover area) and basin streamflow estimates.



The analysis of the spatial characteristics of the snowfall elevation patterns indicates that the proposed assimilation scheme results in more accurate spatial patterns of the snowfall distribution across the entire basin. The derived snowfall orographic patterns contribute to a comprehensive improvement of mountain hydrologic variables such as snow depth, snow cover area, and streamflow. The most significant enhancements in streamflow are observed during the spring and summer months, when peak flow observations align more accurately with the posterior cases than with the prior ones. These results primarily stem from the fact that the assimilation of Sentinel-1 assigns less snowfall to the lower-elevation regions of the basin, while higher rates are assigned to the higher elevations. As summer approaches, water is released via snowmelt from the higher elevations more slowly than in the prior case, which aligns better with observations. The assimilation of Sentinel-1 effectively downscales coarser-resolution precipitation products. While the prior cumulative snowfall elevation pattern has a small gradient across elevation bands, after the assimilation of snow depth retrievals these patterns become consistent across elevations and precipitation products. In conclusion, this study provides a framework for correcting snowfall orographic patterns across other seasonally snow-dominated mountain areas of the world, especially where in-situ data are scarce. The full paper can be found by clicking on the Figure above.
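For readers curious about the machinery, a particle batch smoother essentially reweights an ensemble of model runs (here, runs driven by perturbed precipitation) according to how well each reproduces the assimilated observations. Below is a minimal numpy sketch of that general idea; the Gaussian likelihood, variable names, and array shapes are illustrative assumptions, not the exact configuration of the paper.

```python
import numpy as np

def particle_batch_smoother_weights(snow_depth_pred, snow_depth_obs, obs_std):
    """Weight an ensemble of model runs by how well each reproduces the
    observed snow-depth series (Gaussian likelihood assumed, illustrative).

    snow_depth_pred : (n_particles, n_times) predictions, one row per
                      perturbed precipitation scenario
    snow_depth_obs  : (n_times,) satellite snow-depth retrievals
    obs_std         : observation error standard deviation
    """
    residuals = snow_depth_pred - snow_depth_obs
    log_lik = -0.5 * np.sum((residuals / obs_std) ** 2, axis=1)
    log_lik -= log_lik.max()                 # numerical stability
    weights = np.exp(log_lik)
    return weights / weights.sum()

def posterior_snowfall(prior_snowfall, weights):
    """Posterior (bias-corrected) snowfall as the weighted mean of the prior
    ensemble of perturbed snowfall inputs, shape (n_particles, n_times)."""
    return weights @ prior_snowfall
```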
Reference


Girotto, Manuela, Giuseppe Formetta, Shima Azimi, Claire Bachand, Marianne Cowherd, Gabrielle De Lannoy, Hans Lievens, et al. 2023. “Identifying Snowfall Elevation Patterns by Assimilating Satellite-Based Snow Depth Retrievals.” The Science of the Total Environment, September, 167312. https://doi.org/10.1016/j.scitotenv.2023.167312.

Thursday, October 19, 2023

Water4All - WaMA-WaDiT project

The WaMA-WaDiT project (Water Management and Adaptation based on Watershed Digital Twins) was financed in the Water4All call and therefore we will be able to start a new and exciting adventure with some challenges.

This proposal aims to understand the impact of extreme climate events such as droughts and floods on water management systems, with the goal of developing optimized management strategies that maximize water security under both current and future climate change conditions. The knowledge gained will be used to create a watershed digital twin framework, applicable to various watersheds with different water-related issues. A guide will be published detailing the process of building digital twins for specific watersheds and problems.



The proposal, which you can find in its complete form by clicking on the logo above, pursues three main objectives: the scientific, the practical, and the product objectives. The scientific objective focuses on improving our understanding of how droughts and floods affect water management systems, and how optimal strategies can mitigate these effects. This involves several sub-objectives, such as determining the best databases for modeling water management problems, analyzing systematic errors in climate and hydrologic predictions, improving the inclusion of groundwater dynamics models, incorporating complex snow dynamics, assessing the effect of long-term forecasts of extreme events on reservoir management, and improving the parameterization of individual hydrological processes.

The practical objective is to create a methodology that systematizes the proposal and assessment of adaptation measures in reservoirs. This methodology will provide a clear guide on how to develop decision frameworks based on the most robust numerical models or digital twins of the watershed. It will also tackle how to manage hydroclimatic extremes like floods and droughts, emphasizing dynamic management of safety margins to maximize water availability and ways to reduce the impact of persistent droughts.

The product objective is to implement this methodology in a free, open-source software tool that simplifies the use of scientific knowledge for decision-makers and reservoir managers. This tool aims to be robust and scalable, providing a first-order approximation to any problem. It will encourage end-users to adopt optimal tools for their needs by demonstrating the power of the approach.

Tuesday, October 10, 2023

Notes about the dynamic nature of the GEOframe-Po Project

Below you can find some provisional notes, to be improved in the coming days, about our deployment of the GEOframe system to the Po river for the Po River Basin Authority.

Basin extraction

It is not a straightforward operation. In fact, it has never been done systematically all over Italy. It serves two opposing needs: to be objective and to align with the official grid provided by basin Authorities and Regions. The initial phase relies mainly on slope analysis and requires processing digital terrain data, which have become available only in recent years, especially if we refer to data produced with laser altimetry. The starting point is the Digital Elevation Models (DEMs) provided by the regions, which have been reprojected and standardized to the correct reference systems. The initiation of the hydrographic network is determined by an area threshold, while sub-basins, for the Po river, are delineated to have an average area of 10 km2. These procedures have been standardized in geographic information systems (GIS) over the last twenty years, but for this specific task the Horton Machine library developed by the University of Trento and HydroloGIS was used (with Abera et al., 2016, serving as reference), incorporating some innovative elements: a parser to aggregate smaller basins into adjacent larger ones, and the handling of certain topological situations, especially in flat areas, for the subsequent use with GEOframe.
The resulting tool was named GEOframeInputBuilder.
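The area-threshold rule for network initiation is simple to state. As a pure illustration (the actual extraction was carried out with the Horton Machine tools wrapped in GEOframeInputBuilder), here is a numpy sketch of the criterion, assuming a flow-accumulation raster is already available; the threshold value and cell size are hypothetical placeholders.

```python
import numpy as np

def channel_mask(flow_accumulation, cell_area_km2, threshold_km2=0.5):
    """Mark as 'channel' every cell whose upslope contributing area
    exceeds the chosen initiation threshold.

    flow_accumulation : 2D array, number of upslope cells draining into each cell
    cell_area_km2     : area of a single DEM cell in km^2
    threshold_km2     : contributing-area threshold for channel initiation
    """
    contributing_area = flow_accumulation * cell_area_km2
    return contributing_area >= threshold_km2

# Example with synthetic accumulation values and a hypothetical cell size.
acc = np.random.randint(0, 5000, size=(100, 100))
channels = channel_mask(acc, cell_area_km2=0.0009, threshold_km2=0.5)
```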

The extraction of lakes, particularly the large Lombard lakes and Lake Garda, required special attention and made the process less automated. Visual analysis reveals a differentiated geometry between mountain basins and lowland inter-basins, a distinction noted since the early years of fluvial geomorphology but now objectively observed. The database, now available, enables statistical analyses of their geometry and topology, which previously relied on more qualitative cartographic analysis. Initiating the network with an area threshold is functional to the hydrological modelling, but the reader should be aware that channel initiation is a very active research topic in hydrology, especially in connection with the work by Gianluca Botter and coworkers [insert CITATION].

The grid, as currently constructed, will be distributed for free use and will serve as a fundamental standard for further cartographic-digital and hydrological analyses and developments.

Photo by Luigi Ghirri



Interpolation

Interpolation techniques saw significant development between the 1980s and 1990s [insert citation], and geostatistical methods in particular have slowly made their way into the practice of digital analysis of the meteorological forcings of the hydrological cycle. These methods require the definition of a model of the correlation between measurements, known as a variogram, whose robustness is fundamental to the reliability of the result.
The starting database is made up of measurements collected by the ground stations of the regional entities operating on the Po basin. These data have been analyzed, cleaned, and subsequently interpolated, currently at the centroid of each sub-basin identified in the first phase of the work. The interpolation was carried out for precipitation and temperature at the daily scale, as a first step towards producing hourly or sub-hourly interpolations at any point of a suitable one-kilometer grid.
The interpolation technique used was kriging with drift, to account for orographic effects, especially on temperature. For fitting the experimental variogram, a (linear? exponential? to be specified) theoretical model was used, relying on the interpolators implemented in GEOframe.
The interpolation covered the entire period from 1990 to today, and the data are stored in CSV files in folders containing the data for each individual sub-basin.
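For readers unfamiliar with kriging with drift, the numpy sketch below shows the core idea: the covariance between stations is derived from a variogram model, elevation enters as a linear drift term, and the augmented linear system is solved for the interpolation weights. This is an illustration of the technique only, not the GEOframe (Java/OMS) components used in the project, and the variogram parameters are placeholders.

```python
import numpy as np

def exponential_cov(h, sill=1.0, range_=20_000.0):
    """Covariance implied by an exponential variogram (placeholder parameters)."""
    return sill * np.exp(-h / range_)

def kriging_with_drift(xy_sta, elev_sta, obs, xy_tgt, elev_tgt):
    """Kriging with external drift (elevation) at a single target point.

    xy_sta   : (n, 2) station coordinates [m]
    elev_sta : (n,)   station elevations, used as the drift variable
    obs      : (n,)   observed values (e.g. daily temperature)
    xy_tgt   : (2,)   target coordinates (e.g. a sub-basin centroid)
    elev_tgt : float  target elevation
    """
    n = len(obs)
    d = np.linalg.norm(xy_sta[:, None, :] - xy_sta[None, :, :], axis=2)
    C = exponential_cov(d)
    F = np.column_stack([np.ones(n), elev_sta])        # constant + elevation drift
    A = np.block([[C, F], [F.T, np.zeros((2, 2))]])    # universal kriging system
    d0 = np.linalg.norm(xy_sta - xy_tgt, axis=1)
    b = np.concatenate([exponential_cov(d0), [1.0, elev_tgt]])
    w = np.linalg.solve(A, b)[:n]                      # kriging weights
    return w @ obs
```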

It is clear that the procedure is a first approximation that will serve as the basis for future improvements. For example, extending the interpolation to the one-kilometer grid is one aspect. A further improvement could be to introduce high-resolution reanalysis data, combining geostatistical techniques with simulations of atmospheric circulation and with any data coming from radars and satellites. Convergent research comes from atmospheric physics and meteorology, whose models are now reaching resolutions useful for hydrology. Some work should be done to better connect the two communities.

Setup

GEOframe-NewAGE allows numerous configurations, as various components are available for the same phenomenon. For the basic configuration of each single Hydrologic Response Unit (HRU), the one already partially tested in [insert citation], called the Embedded Reservoir Model (ERM), was chosen; its description can be found in the cited bibliography or in the linked videos. In summary, the ERM model is composed of a component for interception, one for snow (when present), a fast surface runoff separator based on the Hymod model, a nonlinear reservoir describing the root zone, and a second nonlinear reservoir for groundwater. Structurally, it is not much different from the HBV model [insert citation]. In the basic configuration, flood propagation is neglected.
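To fix ideas, here is a minimal sketch of the nonlinear-reservoir building block at the core of such a configuration; the exponents, coefficients, explicit-Euler time stepping, and synthetic forcing are illustrative choices, not the calibrated GEOframe components.

```python
import numpy as np

def nonlinear_reservoir(inflow, c, m, s0=0.0, dt=1.0):
    """Single nonlinear reservoir: dS/dt = inflow - Q, with Q = c * S**m,
    integrated here with a simple explicit Euler step (illustrative only)."""
    s = s0
    q_out = np.zeros_like(inflow, dtype=float)
    for t, q_in in enumerate(inflow):
        q = c * s ** m
        s = max(s + dt * (q_in - q), 0.0)
        q_out[t] = q
    return q_out

# A crude ERM-like cascade: the root-zone reservoir feeds the groundwater one.
forcing = np.random.gamma(1.0, 2.0, size=365)            # synthetic net input, mm/day
recharge = nonlinear_reservoir(forcing, c=0.05, m=1.5)   # root-zone outflow
baseflow = nonlinear_reservoir(recharge, c=0.01, m=1.0)  # groundwater outflow
```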
For the evapotranspiration part, a simple Priestley-Taylor model was used, where, however, the radiation is provided by a rather accurate model [insert citations].
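For reference, the Priestley-Taylor formula itself is short; here is a sketch in which net radiation is assumed to be given, whereas in GEOframe it comes from the dedicated radiation components.

```python
import numpy as np

def priestley_taylor_et(t_air_c, rn, g=0.0, alpha=1.26):
    """Daily Priestley-Taylor evapotranspiration [mm/day].

    t_air_c : air temperature [degC]
    rn, g   : net radiation and ground heat flux [MJ m-2 day-1]
    alpha   : Priestley-Taylor coefficient (1.26 is the classical value)
    """
    es = 0.6108 * np.exp(17.27 * t_air_c / (t_air_c + 237.3))   # saturation vapor pressure, kPa
    delta = 4098.0 * es / (t_air_c + 237.3) ** 2                # slope of the curve, kPa/degC
    gamma = 0.066                                                # psychrometric constant, kPa/degC
    lam = 2.45                                                   # latent heat of vaporization, MJ/kg
    return alpha * delta / (delta + gamma) * (rn - g) / lam

# Example: an 18 degC day with 15 MJ m-2 of net radiation gives about 5 mm of ET.
print(priestley_taylor_et(18.0, 15.0))
```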
Each of these ERM models was then connected to the others through the Net3 infrastructure [insert citation] to form a directed acyclic graph in which each node represents an HRU. Potentially, each HRU can be characterized not only by its own topographic and meteorological data, but also by its own models.
In the basic configuration, however, the same model structure is usually used for all HRUs while the values of the model parameters are obtained by subsequent calibration with spatially differentiated parameters, if the available data allow it.
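To illustrate how such a directed acyclic graph of HRUs can be traversed, here is a toy sketch with networkx; it only mimics the idea of Net3 (per-HRU models plus routing in topological order) and is not the OMS/Net3 implementation. Node names and the placeholder runoff function are invented for the example.

```python
import networkx as nx

# Each node is an HRU with its own (possibly different) runoff model;
# edges point downstream, so the graph is a directed acyclic graph (DAG).
g = nx.DiGraph()
g.add_edges_from([("HRU-1", "HRU-3"), ("HRU-2", "HRU-3"), ("HRU-3", "outlet")])

def local_runoff(hru, t):
    """Placeholder for the per-HRU model (ERM or any alternative)."""
    return 1.0  # mm, hypothetical

def route(graph, t):
    """Accumulate discharge from headwaters to the outlet in topological order."""
    q = {}
    for node in nx.topological_sort(graph):
        upstream = sum(q[u] for u in graph.predecessors(node))
        q[node] = upstream + (local_runoff(node, t) if node != "outlet" else 0.0)
    return q

print(route(g, t=0))
```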
The potential setup variants are numerous, encompassing at least three options for snow modeling, three for evapotranspiration modeling, and an array of choices for reservoir modeling. The inclusion or exclusion of flow propagation modules, as well as the potential elimination or addition of compartments to be modeled and their diverse connections, further expand the possibilities. An overview of potential topological configurations is presented, for instance, in [insert MaRmot citation]. As even a novice reader can comprehend, the possible combinations multiply very rapidly with the number of connected Hydrological Response Units (HRUs), which can, in turn, be linked in various manners. This complexity underscores why our comprehensive study on the Po River needs to be distributed and further refined by others, to enhance the precision of the results and better align them with local needs, something that cannot be achieved by a single, however productive, team. In turn, this opens the question of how re-analyses performed by external researchers or teams can be accepted and inserted back into the main project.

The analysis of multiple configurations is therefore entrusted to later phases of the project.

Calibration

Among the phases of a simulation, the calibration phase is the most time-consuming. It essentially consists of a large number of attempts to combine the model parameters to reproduce the measured data as faithfully as possible. The space of possible parameters is generally very large, even for a single simulation HRU. Therefore, the tools for calibration try to use intelligent strategies (including ML) to quickly guess which are the best parameter configurations.

The goodness of fit of the simulated values to the measured ones is usually quantified through goodness-of-fit (GOF) metrics. In our case, these are generally the KGE [insert citation] or the NS [insert citation]. An analysis of the various GOFs can be found in [insert citation], and the result can be further detailed, in the validation phase (see below), with additional indicators such as those presented, for example, in Addor et al., 2017. Another, much more refined, post-hoc method for analyzing the goodness of the simulations is the one presented in [insert Shima work citation]. The latter can also serve as a bias corrector of the final result, and it will be systematically applied to the results of the Po project.
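For completeness, the two GOF metrics mentioned above are easy to write down; below is a plain numpy sketch, assuming the standard Gupta et al. (2009) formulation of the KGE.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(sim, obs):
    """Kling-Gupta efficiency (Gupta et al., 2009 formulation)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]          # linear correlation
    alpha = sim.std() / obs.std()            # variability ratio
    beta = sim.mean() / obs.mean()           # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```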

From an algorithmic point of view, the calibration carried out in the project is based on LUCA [insert citation], which is a sophisticated implementation of SCEM-UA [insert citation], but a particle swarm [insert citation] could also be used. The calibration procedure follows some standards. Given a set of data on which to base the calibration, the data are usually divided into two subsets, one used for calibration and another for the so-called validation phase. In the former, having both input and output data available, the parameters (or models) are determined in a way similar to what is done in standard ML techniques (which, for this purpose, could probably be used profitably). In the latter, the performance of the calibrated model is evaluated on data not used for parameter determination (and which should be "independent" of the former). As already mentioned, in the validation phase additional GOF indicators can be used to better discern the performance of the adopted solution.
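The split-sample logic itself is model-agnostic. Here is a hedged sketch that uses plain random search in place of LUCA/SCEM-UA, with the kge() function defined above; the model interface, the parameter bounds, and the 70/30 split are illustrative assumptions.

```python
import numpy as np

def calibrate_split_sample(model, bounds, forcing, q_obs, n_trials=2000, split=0.7):
    """Pick the parameter set maximizing KGE on the calibration period,
    then report its performance on the held-out validation period.

    model   : callable model(forcing, **params) -> simulated discharge array
    bounds  : dict {param_name: (low, high)} of sampling bounds
    """
    rng = np.random.default_rng(42)
    n_cal = int(split * len(q_obs))
    best_params, best_score = None, -np.inf
    for _ in range(n_trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        q_sim = model(forcing, **params)
        score = kge(q_sim[:n_cal], q_obs[:n_cal])   # kge() from the sketch above
        if score > best_score:
            best_params, best_score = params, score
    q_sim = model(forcing, **best_params)
    return best_params, best_score, kge(q_sim[n_cal:], q_obs[n_cal:])
```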

A note concerns the word "validation": this is the customary term, but it does not carry any ontological meaning about the nature of the truth described by the model, only a practical meaning related to the reliability of the model in predicting a certain sequence of numbers.
The calibration/validation procedure can be implemented for a single variable, in our case usually the discharge at a section of the hydrographic network, or for more variables, for example snow cover, soil water content, or evapotranspiration, if these measurements are available. These latter measures, however, have a different character from discharge: while discharge is an aggregate variable, resulting from the concentration at a single point of the water fallen over the watershed area, the others remain spatially distributed variables before being aggregated for the purposes of the watershed budget. Therefore, the methods for assessing how well the measured data are reproduced follow more articulated, if not more complex, paths. The good thing is that GEOframe allows you to calibrate the various quantities separately, as each of them is modeled by different components that can be used independently of the overall model. So far, this use case requires quite a lot of manual intervention and could be made more automatic.

In any case, if the target variables are more than one, we speak of multi-objective calibration, while if there are variables measured at multiple sites, we speak of multi-site calibration [insert citation].

I would further like to suggest an enhancement to our analysis: moving from the daily to the hourly time scale. This is particularly crucial for understanding processes within smaller watersheds, approximately at the 1 km² scale, where many significant phenomena show sub-daily dynamics.


Simulation / Analysis / ECP

The validation phase is already a simulation stage (with predetermined parameters) and represents the normal completion of operations in a production phase. This production phase is usually understood in the hydrological literature as hindcasting, that is, as functional to the understanding of past events for which an explanation is sought in a quantitative framework. This involves the use of more accurate analyses and indicators than those used in the calibration/validation phase, which requires a certain speed. One of these is the analysis through empirical conditional distributions (ECPs), as illustrated in Azimi et al., 2023. These analyses can eventually lead to a rethinking of the setup and calibration/validation phases in order to obtain more accurate results. As shown in Azimi et al. (2023, 2024), ECPs can also be used as bias correctors and improve the overall statistical performance of the model's results, at least if the system shows a certain stationarity of temporal behavior, that is, if, for example, the effects attributable to global warming do not significantly impact the structure of the model (including its parameters). The determination of the "reliability" of the models is then a key concept in the development of digital twins of the hydrological system (Rigon et al., 2022).
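As a purely conceptual illustration of how an empirical conditional approach can act as a bias corrector, the sketch below bins the simulated discharge over a hindcast period and replaces each new simulated value with the median of the observations that co-occurred with that bin. This is a generic sketch of the idea, not the specific procedure of Azimi et al.; the number of bins and the use of the median are assumptions.

```python
import numpy as np

def fit_ecp_corrector(q_sim_hindcast, q_obs_hindcast, n_bins=20):
    """Learn, for each quantile bin of simulated discharge, the median of the
    observations that occurred when the simulation fell in that bin."""
    edges = np.quantile(q_sim_hindcast, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, q_sim_hindcast, side="right") - 1, 0, n_bins - 1)
    medians = np.array([np.median(q_obs_hindcast[idx == b]) for b in range(n_bins)])
    return edges, medians

def apply_ecp_corrector(q_sim_new, edges, medians):
    """Bias-correct new simulations with the learned conditional medians."""
    n_bins = len(medians)
    idx = np.clip(np.searchsorted(edges, q_sim_new, side="right") - 1, 0, n_bins - 1)
    return medians[idx]
```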

Another matter, much less frequented by hydrologists, is that of forecasting future events. These future events, obviously, have to do with the time series input to hydrological models and therefore require forecasts of precipitation, temperature, wind speed, and air humidity. It is known that the meteorological system (global and local) suffers from a lack of predictability due to deterministic chaos effects [insert citation]. To date, weather predictions are reliable, with respect to weather categories, for a few days; they are able to predict atmospheric temperatures, but they are still very imprecise in determining the amount of precipitation. In essence, they can be used to predict the hydrological future, but with questionable quantitative value. The theoretical reason for this shortcoming has already been mentioned, but there are also others, for example the heterogeneity of ground conditions and the absence of a description of soil-atmosphere feedbacks, both conditions not described in meteorological models. Hydrological forecasts can therefore only be of a statistical nature and produce scenarios [insert citation], which are not devoid of meaning and practical value. In this area between hydrology and meteorology, the search for common ground is mandatory for any further evolution. In GEOframe, however, the treatment/modelling of input data is quite well separated from the hydrological computation, and any new source of data can be easily (but not without person-months of work) included.

Distribution of results and participatory science

A fundamental aspect, already widely discussed in Rigon et al., 2022, is to understand how the results of a model can be shared with various users, but also how the model and its setup (including a very expensive phase of DEM analysis, weather data interpolation, and calibration/validation) can be shared, saving other researchers time. GEOframe is built in such a way that this is possible (shareability is built into the design of the informatics), and some experiences have already been made in this sense: some within the Trento working group, others with external research groups at the University of Milan (whose work is to be incorporated) and the Polytechnic of Turin, where the basic data and models already pre-digested by the University of Trento served for further developments and analyses on some already-processed parts of the Po basin.
The question of how to preserve and make use of multiple contributions to code, data, simulation configurations, and simulations is, however, still open.
It should be clarified that the GEOframe system is not only a set of data and models, but also a library of analysis tools, mostly developed as Python Notebooks and often documented through a series of slides and video lessons [add the links here] and Schools [https://abouthydrology.blogspot.com/2021/10/the-geoframe-schools-index.html]. Although this system can be improved and automated, it has allowed the group from the Polytechnic of Turin to dramatically shorten the modeling times for a series of basins in Piedmont, and it will allow, for the moment at the planning stage, the sharing of the setup and analysis of the Lombard area of the large Alpine lakes. Other analyses, developed in parallel on areas such as Friuli by the University of Udine, can easily be inserted into a possible national system covering all of Italy, even though they were developed separately.
From the informatics point of view, organizing all of this information through appropriate repositories will be mandatory in the future for an efficient use of resources.

Conclusions

The GEOframe-Po project is more than just a collection of models; it envisions a comprehensive system that encompasses a variety of input and output datasets, model configurations, and the flexibility to operate on diverse platforms such as laptops, servers, and the cloud (leveraging the OMS/CSIP platform). The interfaces, as evidenced by the available self-instruction materials, can range from simple text-based designs to more sophisticated visual tools, including augmented reality devices.
The system is designed for continuous improvement and customization, with the ability to implement changes with minimal overhead. This was a strategic requirement pursued at various levels of the information technology aspect of the project [insert citations]. The current models can be broadly categorized as physically based, with the majority of the implementation comprising what is referred to in the literature as "lumped" models. However, the system is designed to accommodate extensions to more distributed models, a possibility that has already been partially realized in some research lines of the Trento group.
The integration of machine learning techniques into the system is also possible [insert citation], even though they have not been utilized to date. The design of the GEOframe-Po project, therefore, represents a flexible, adaptable, and forward-thinking approach to modeling and data analysis.