Saturday, March 26, 2022

GEOframe essentials

GEOframe is a system for doing hydrology by computer that aims to implement the DARTHs paradigm [Rigon et al., 2022]. By saying that it is a system, we emphasize that it is not a model but an infrastructure that can contain many differentiated modelling solutions (some tens of them) built upon model components [Argent, 2004]. This is because GEOframe leverages the Object Modeling System framework (v3) [David et al., 2013], which allows modelling components to be connected to solve a specific hydrological problem, with many alternatives available for its mathematical/numerical description. This infrastructure allows adapting the tools to the problems and not vice versa [Rigon et al., 2022]. In GEOframe, particular attention has been dedicated to allowing enhancements and additions while writing the least code possible. The core code has been designed to be open to additions and closed to modifications [Gamma et al., 1995], thus allowing stability of the code base over time.

GEOframe contains tens of components that cover rainfall-runoff [Formetta et al., 2011], snow modelling [Formetta et al., 2014], evaporation and transpiration [Bottazzi et al., 2021], infiltration [Tubini and Rigon, 2022], terrain analysis tools [Abera et al., 2014], interpolation models [Bancheri et al., 2018], calibration tools [David et al., 2013], and so on. Every modelling paradigm is included, for instance process-based modelling [Tubini and Rigon, 2022], lumped modelling [Formetta et al., 2014b], and machine learning [Serafin et al., 2021], or can be included by adding appropriate components [Serafin, 2019]. Spatially disjoint catchments can be modelled separately and joined together in a bigger model by using a Groovy-based domain-specific language. GEOframe has been applied to hydrological simulations from the point scale, to Alpine catchments [Abera et al., 2017], to large catchments such as the Blue Nile [Abera et al., 2016], and, among those, it is being deployed to the Po river.
GEOframe is open source and built with open source tools including Eclipse, OpenJDK by Adoptium, Gradle, and GitHub. Literate computing is pursued by extensively using Jupyter Notebooks for preparing the input data and analysing the output.
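
As a rough illustration of what "open to additions and closed to modifications" can mean for a component-based system, the following Python sketch registers two alternative descriptions of the same process and runs them through a core function that never changes. It is not GEOframe or OMS3 code: the registry, the `register` decorator and the two reservoir components are hypothetical names invented for this example.

```python
from typing import Callable, Dict

# Hypothetical component type: a function from named inputs to named outputs.
Component = Callable[[Dict[str, float]], Dict[str, float]]

# Hypothetical registry of available components, keyed by name.
REGISTRY: Dict[str, Component] = {}


def register(name: str):
    """Add a component to the registry without touching the core code."""
    def decorator(func: Component) -> Component:
        REGISTRY[name] = func
        return func
    return decorator


@register("linear_reservoir")
def linear_reservoir(inputs: Dict[str, float]) -> Dict[str, float]:
    # Illustrative description: discharge proportional to storage, Q = k * S.
    return {"discharge": inputs["k"] * inputs["storage"]}


@register("nonlinear_reservoir")
def nonlinear_reservoir(inputs: Dict[str, float]) -> Dict[str, float]:
    # Alternative description of the same process: Q = k * S**b.
    return {"discharge": inputs["k"] * inputs["storage"] ** inputs["b"]}


def run(name: str, inputs: Dict[str, float]) -> Dict[str, float]:
    """Core code: it stays the same whenever a new component is registered."""
    return REGISTRY[name](inputs)


linear = run("linear_reservoir", {"k": 0.5, "storage": 10.0})
nonlinear = run("nonlinear_reservoir", {"k": 0.5, "storage": 4.0, "b": 2.0})
```

Adding a third description of the process only requires writing and registering a new function; `run` and the existing components are left untouched.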

At present GEOframe has three main branches:
  • GEOframe-NewAGE [Formetta et al., 2014] for the modelling of hydrology as a set of systems of systems of ordinary differential equations called Hydrological Dynamical Systems [CITE]; 
  • WHETGEO [Tubini and Rigon, 2022], which solves the Richards, heat and transport equations in soil and groundwater;
  • GEOSPACE, which deals with soil-plant-atmosphere interactions.

Many of the components, however, are shared among the various branches, and "mixed" modelling solutions can be envisioned by choosing components from one or the other. For instance, GEOSPACE is built upon WHETGEO and GEOframe ET components, with the addition of a broker component that transmits and receives data from the two component subsets. For any of the sub-branches please refer to the respective blog pages.

References

Abera, W., A. Antonello, S. Franceschi, and G. Formetta. 2014. “The uDig Spatial Toolbox for Hydro-Geomorphic Analysis.” In Geomorphological Techniques (Online Edition), edited by British Society for Geomorphology. British Society for Geomorphology.

Abera, Wuletawu, Giuseppe Formetta, Luca Brocca, and Riccardo Rigon. 2017. “Modeling the Water Budget of the Upper Blue Nile Basin Using the JGrass-NewAge Model System and Satellite Data.” Hydrology and Earth System Sciences 21 (6): 3145–65.

Abera, Wuletawu, Giuseppe Formetta, Marco Borga, and Riccardo Rigon. 2017. “Estimating the Water Budget Components and Their Variability in a Pre-Alpine Basin with JGrass-NewAGE.” Advances in Water Resources 104 (June): 37–54.

Argent, Robert M. 2004. “An Overview of Model Integration for Environmental Applications —components, Frameworks and Semantics.” Environmental Modelling and Software 19 (3): 219–34.

Bancheri, Marialaura, Francesco Serafin, Michele Bottazzi, Wuletawu Abera, Giuseppe Formetta, and Riccardo Rigon. 2018. “The Design, Deployment, and Testing of Kriging Models in GEOframe with SIK-0.9.8.” Geoscientific Model Development 11 (6): 2189–2207.

Bottazzi, Michele, Marialaura Bancheri, Mirka Mobilia, Giacomo Bertoldi, Antonia Longobardi, and Riccardo Rigon. 2021. “Comparing Evapotranspiration Estimates from the GEOframe-Prospero Model with Penman–Monteith and Priestley-Taylor Approaches under Different Climate Conditions.” Water 13 (9): 1221.

David, O., J. C. Ascough II, W. Lloyd, T. R. Green, K. W. Rojas, G. H. Leavesley, and L. R. Ahuja. 2013. “A Software Engineering Perspective on Environmental Modeling Framework Design: The Object Modeling System.” Environmental Modelling & Software 39 (c): 201–13.


Formetta, G., S. K. Kampf, O. David, and R. Rigon. 2014. “Snow Water Equivalent Modeling Components in NewAge-JGrass.” Geoscientific Model Development 7 (3): 725–36.

Formetta, G., A. Antonello, S. Franceschi, O. David, and R. Rigon. 2014. “Hydrological Modelling with Components: A GIS-Based Open-Source Framework.” Environmental Modelling & Software 55 (May): 190–200.

Gamma, Erich, Richard Helm, Ralph Johnson, and John Vlissides. 1995. Design Patterns: Elements of Reusable Object-Oriented Software. Pearson Deutschland GmbH.

Rigon, Riccardo, Giuseppe Formetta, Marialaura Bancheri, Niccolò Tubini, Concetta D’Amato, Olaf David, and Christian Massari. 2022. “HESS Opinions: Participatory Digital Earth Twin Hydrology Systems (DARTHs) for Everyone: A Blueprint for Hydrologists.” Hydrology and Earth System Sciences Discussions, 1–38.

Serafin, Francesco. 2019. “Enabling Modeling Framework with Surrogate Modeling Capabilities and Complex Networks.” Edited by Riccardo Rigon And. Ph.D., University of Trento.

Serafin, Francesco, Olaf David, Jack R. Carlson, Timothy R. Green, and Riccardo Rigon. 2021. “Bridging Technology Transfer Boundaries: Integrated Cloud Services Deliver Results of Nonlinear Process Models as Surrogate Model Ensembles.” Environmental Modelling and Software 146 (105231): 105231.

Monday, March 14, 2022

Nature-based Solutions for climate adaptation and flood resilience: Modelling and Data available for an optimal Water Management and Nature-Based Solution implementation

I was involved by TAIEX in discussing and supporting the search for Nature-Based Solutions in the Samaria Park in Crete. With my talk I tried to refocus on some aspects:

  • We have to understand what kind of Nature we have to deal with
  • We have to understand the phenomena we want to contain


To address these points we need data, and I dealt with some of the data requirements for solving hydrological problems. Collecting data and making them available is relatively less expensive than carrying out works and can save a lot of effort and money. Furthermore, there are social aspects of the question that cannot be overlooked. Our experience with LIFE FRANCA opened our view on this topic. Clicking on the figure above, you can find the presentation. At least, I think it is a nice overview covering some recent studies on the island. From them you can get more detailed information.

Other presentations and information can be found at the TAIEX website.

Wednesday, March 2, 2022

Models Classification according to their interfaceability (MaaA, MaaT, MaaS, MaaR, MaaC)

In the recently submitted manuscript about DARTHs (Digital eArth Twins of Hydrology) we delineated five categories of models, in order of increasing suitability to be part of a DARTH or of a DARTH component:

  • MaaA, Model as an Application
  • MaaT, Model as a Tool
  • MaaS, Model as a Service
  • MaaR, Model as a Resource 
  • MaaC, Model as a Commodity

I'll try to explain below what the acronyms mean. It should be remarked that the characteristics listed are not connected to the domain science content of the models but to their software architecture characteristics and requirements.


MaaA - Model as an Application 

MaaA are full-fledged models with a closed architecture that includes the data formats and the visualization tools. What follows for MaaA is taken with little modification from Rizzoli et al. (2006).

  1. A MaaA bundles the data, the algorithms and the graphical user interface of a model in one application. This makes the model very hard to re-use outside its original context. Most MaaAs are also monoliths composed of hundreds of thousands of lines of code.
  2. A MaaA works on just one operating system: MS Windows, Mac OS or Linux.
On MaaA, Knoben et al. (2021) provide the following description:

"These tools are typically provided as self-contained packages. Packages tend to be easy to use for their intended purpose but take time to understand and do not necessarily provide much flexibility to deviate from their intended purpose. Layering additional functions on top of an existing package or modifying a package’s source code is certainly possible, but can be outside the comfort zone of many users."

Other usual characteristics of MaaA are:
  • The evolution of the application is totally in the hands of the original developers. This is a good thing for intellectual property rights and in a commercial environment, but it is absolutely a bad thing for science and the way it is supposed to progress. Independent revisions and third-party contributions are nearly impossible.
  • MaaA often do not come with associated data sets for testing. Moreover, the adoption of object-oriented programming, while a good thing for model reusability and portability, makes testing more complex because of a number of problems, such as observability in virtual method calls and the state-dependent behaviour of objects.
  • The way they are coded (as monolithic entities) displays a strong level of internal cohesion and, if a modeller is interested in reusing a particular function within a bigger model, they can find it very hard to isolate and extract it, given the strong dependencies existing among the parts of the source code.
  • Their data formats do not come from a community agreement: their developers typically decide to have output data in a format relevant to their own application, which may not be a format that is widely used by others. It is cumbersome for developers to have their tools ingest multiple different data formats, and such functionality is therefore somewhat rare (slightly modified from Knoben et al., 2021).
It could be observed that a MaaA could be evolved to eliminate the various characteristics in the bullet list. In fact, there exists a variety of MaaA that, especially recently, have pursued this goal (modular code, openness to common data formats, separation between the graphical model interface and the rest of the code).

MaaT - Model as a Tool (Mainly From Nativi et al., 2021)

In MaaT, differently from MaaA, there are at least two levels of abstraction: the interface is abstracted from the model, in a client-server way, and the model is loosely coupled to the data. The interaction tool is distinct from the model itself and can eventually be changed.

However, a given implementation of the model runs on a specific server, and users interact with the model through the user interface. In a MaaT the models are preloaded on a specific machine. Besides, it is not possible to modify the interaction between the server and the client, which is kept fixed by the user interface.
Benefits include a strong control of the model use and execution (which could be useful to control what happens in an operational service). There are limitations on the usability and flexibility of the model, as well as on its scalability, due to the limitations of the specific server. Machine-to-machine interoperability (chaining capabilities) is not allowed. Knoben et al. (2021), without knowing the acronym, define MaaT well when talking of some web-based services: "... several of these tools are provided as web-based services. This can be appealing because, for example, data can be pre-downloaded to speed up model configuration and model simulations can be easily shared. The advantage of such approaches is that they can be combined with some form of server-side data transformations (e.g., subsetting or averaging), which minimizes data transfers. Storing the inputs for and outputs of large-domain simulations can, however, be cumbersome, and keeping pre-downloaded data up-to-date and sufficient for all user needs takes sustained, long-term effort. A further complication is that it is regrettably common that such web-based services require some form of manual interaction with the webpage, limiting opportunities to automate data acquisition tasks".
In a MaaT a certain level of abstraction is implemented to make the data and the models loosely coupled, but MaaTs do not necessarily provide tools for the automation of data acquisition. However, it can be said that the models' core in a MaaT is agnostic with respect to the data source and formats (for a detailed explanation see Knoben et al., 2021).
In an open MaaT:
  • model evolution should be in the hands of a community;
  • models should come with an appropriate set of tests, both for the informatics and the physics;
  • a modular structure for the code should be the rule;
  • tools for data brokering should be available.

MaaS - Model as a Service  (Mainly From Nativi et al., 2021 and David et al, 2014)

A “Model-as-a-Service” provides the capability to execute simulation models as a service. As Wikipedia reports: "In the contexts of software architecture, service-orientation and service-oriented architecture, the term service refers to a software functionality or a set of software functionalities (such as the retrieval of specified information or the execution of a set of operations) with a purpose that different clients can reuse for different purposes, together with the policies that should control its usage (based on the identity of the client requesting the service, for example)."
As in the previous case of MaaT, a given implementation of the analytical model runs on a specific server, but this time APIs are exposed for interacting with the model. Therefore, interoperability consists of machine-to-machine interaction through a published API, e.g., for run configuration and execution. Nevertheless, it is not possible to move the model and make it run on a different machine (without having to "manually" install the model and its managing software on that machine). Concerns remain about limited flexibility and possible scalability issues (depending on the server capacity). Note also that, this time, there may be concerns about reduced control over the model (re-)use.

There are two main usage patterns: (i) the model can be pre-deployed, has a well-known service endpoint, and is supported by supplemental data services, which is quite common for operational models used in a production environment; (ii) the model can be dynamically deployed from the client before execution (implying that a MaaS is made up of a pool of modelling components that can be linked just before run time with a scripting language), a behaviour needed when developing model services for research purposes. The two approaches address different workflows and different needs for availability and security. A certain model execution method may also be specified in such a service.
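
The machine-to-machine interaction and the two deployment patterns can be sketched in a few lines of Python. This is only a toy, in-process stand-in for a service endpoint, with no networking: it is not the CSIP API, and `ModelService`, `deploy`, `handle` and the `bucket` model are hypothetical names chosen for the example. Requests and replies are exchanged as JSON strings to mimic a published API.

```python
import json


class ModelService:
    """Toy in-process stand-in for a MaaS endpoint (assumption: no networking)."""

    def __init__(self):
        self._models = {}

    def deploy(self, name, model):
        # Pattern (i): called once at start-up for a pre-deployed model.
        # Pattern (ii): called by the client just before execution.
        self._models[name] = model

    def handle(self, request_json):
        """Run the requested model and return a JSON reply."""
        request = json.loads(request_json)
        model = self._models.get(request["model"])
        if model is None:
            return json.dumps({"status": "error", "message": "unknown model"})
        result = model(**request["parameters"])
        return json.dumps({"status": "ok", "result": result})


def bucket(rain_mm, capacity_mm):
    # Illustrative model: runoff is the rain exceeding the storage capacity.
    return max(0.0, rain_mm - capacity_mm)


service = ModelService()
service.deploy("bucket", bucket)

request = json.dumps(
    {"model": "bucket", "parameters": {"rain_mm": 30.0, "capacity_mm": 20.0}}
)
reply = json.loads(service.handle(request))
missing = json.loads(service.handle(json.dumps({"model": "none", "parameters": {}})))
```

The client only needs to know the request format, not how the model is implemented, which is what makes chaining services possible. The model, however, still runs wherever the service lives.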

MaaR - Model as a Resource 

The interoperability level resembles the patterns used for any other shared digital resource, like a dataset.
  • This time, the model itself (and not a given implementation) is accessed through a resource-oriented interface, i.e., an API.
  • A software infrastructure layer manages (with some constraints whose invasiveness should not be relevant) a set of compliant models.
  • This allows the model to be effectively moved and run on the machine that performs best for a specific use case.
Cloud services can distribute the model runs on various architectures (like cloud services, high performance computing machines, multicore machines, clusters of computers), dynamically adapting the request for resources to the demand.
There are clear benefits in terms of flexibility, scalability, and interoperability. The main concerns, perhaps, are about the sound utilization of the model.

MaaC - Model as a Commodity

MaaC are MaaS or MaaR that, in addition, have some controls on the science and its explanation.

Differently from the previous classifications, the Model as a Commodity definition does not involve information technology issues but programming and science issues related to how DARTHs work. Are MaaC models a mass-produced, unspecialized product (the meaning of commodity)? Obviously not, but, once they are produced, their use inside Digital Earth Twins will be as if they were. Most of the people who access them will do so to take decisions on other aspects of science and social life. Therefore hydrological modelling in DARTHs has to acquire some features that make its use safer and less prone to introducing false information to the public. This is envisioned as providing DARTHs with error (uncertainty) estimations for all the quantities hindcasted and forecasted. The topic is a difficult one, with a large amount of literature (e.g. Beven, 2016) that is often difficult and obscure (e.g. Nearing et al., 2016). The requirement here of having a quantification of uncertainty does not enter the dispute on the origin of errors; it rests instead on the Cox (1946) statement that "Purely empirically, probability and statistics can, of course, describe anything from observations to model residuals regardless of the actual sources of uncertainty as an expression of our reasonable expectations" (taken from Beven, 2016), that is, that at least an empirical estimation of the error on the basis of recorded data is possible.
Because modelling requires a first phase of training/calibration on past data, the modelling error must include an analysis of the model's performance over those past data. Therefore a MaaC is a MaaS or a MaaR provided with error estimations on any of the quantities hindcasted or forecasted and with a warning for the use of any quantity, which implies a major effort for modellers.
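
One very simple way to obtain such an empirical error estimate is sketched below in Python, under stated assumptions: the residuals between observed and simulated values over the calibration period are used to build an interval around a new forecast. It is not a method prescribed by the DARTHs manuscript; the function names, the choice of empirical quantiles and the numbers are all illustrative.

```python
def quantile(sorted_values, q):
    """Empirical quantile with linear interpolation between ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def forecast_with_error(observed, simulated, forecast, coverage=0.9):
    """Attach an empirical error band, built from past residuals, to a forecast."""
    # Residuals of the model over the past (training/calibration) data.
    residuals = sorted(o - s for o, s in zip(observed, simulated))
    tail = (1.0 - coverage) / 2.0
    return {
        "value": forecast,
        "lower": forecast + quantile(residuals, tail),
        "upper": forecast + quantile(residuals, 1.0 - tail),
        "coverage": coverage,
    }


# Made-up numbers, for illustration only.
past_observed = [10.0, 12.0, 9.0, 11.0, 13.0]
past_simulated = [9.0, 12.0, 10.0, 10.0, 12.0]
result = forecast_with_error(past_observed, past_simulated, forecast=20.0, coverage=0.5)
```

The sketch says nothing about where the errors come from; it only summarizes how far the model was from the observations in the past, which is the minimal information a MaaC is asked to deliver together with each quantity.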

The MaaCs inherit from MaaS and MaaR their composable structure, however with a purpose. Components are self-contained building blocks, modules or units of code. Each well designed component usually implements a single modeling concept. Multiple algorithms can be implemented within the same component or in various components, and inserted in modeling solutions as alternatives, thus opening the way to comparing different approaches inside the same chain of tools. This also responds to a science requirement, i.e. the idea that models should be used in DARTHs as hypotheses to be tested among various possibilities (Clark et al., 2011). This flexibility will usually not be directly available to the less aware end-users, but will certainly be useful for scientists to provide more reliable modelling. Therefore MaaC requires tools for supporting the workflow of hypothesis testing. These tools are usually provided as "literate computing" workflows, such as those explained in Rigon et al. (2022).
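
A minimal Python sketch of this hypothesis-testing workflow follows: two alternative components are run on the same forcing and ranked against observations. The Nash-Sutcliffe efficiency is used here only as one common goodness-of-fit score, not as the metric required by DARTHs, and the two "hypotheses" and the data are made up for illustration.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_observed = sum(observed) / len(observed)
    squared_errors = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - squared_errors / variance


# Two hypothetical, interchangeable components for the same process.
def linear_runoff(rain):
    return [0.5 * r for r in rain]


def constant_runoff(rain):
    return [2.5 for _ in rain]


def rank_hypotheses(hypotheses, forcing, observed):
    """Score every alternative on the same data and return the best one."""
    scores = {
        name: nash_sutcliffe(observed, model(forcing))
        for name, model in hypotheses.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up forcing and observations, for illustration only.
rain = [2.0, 4.0, 6.0, 8.0]
observed_runoff = [1.0, 2.0, 3.0, 4.0]
best, scores = rank_hypotheses(
    {"linear": linear_runoff, "constant": constant_runoff}, rain, observed_runoff
)
```

Because the alternatives share the same inputs and outputs, swapping one for another does not require changing the rest of the chain, which is the property the composable structure is meant to guarantee.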

DARTHs in their essentials are modeling infrastructures that deploy the MaaC paradigm.

References

Beven, Keith. 2016. “Facets of Uncertainty: Epistemic Uncertainty, Non-Stationarity, Likelihood, Hypothesis Testing, and Communication.” Hydrological Sciences Journal 61 (9): 1652–65.

Clark, Martyn P., Dmitri Kavetski, and Fabrizio Fenicia. 2011. “Pursuing the Method of Multiple Working Hypotheses for Hydrological Modeling.” Water Resources Research 47 (9). https://doi.org/10.1029/2010wr009827.

Cox, R. T. 1946. “Probability, Frequency and Reasonable Expectation.” American Journal of Physics 14: 1–13. https://doi.org/10.1119/1.1990764.

David, Olaf, Wes Lloyd, Ken Rojas, Mazdak Arabi, Frank Geter, James Ascough, Tim Green, G. Leavesley, and Jack Carlson. 2014. “Modeling-as-a-Service (MaaS) Using the Cloud Services Innovation Platform (CSIP).” In International Congress on Environmental Modelling and Software. scholarsarchive.byu.edu. https://scholarsarchive.byu.edu/iemssconference/2014/Stream-A/30/.

Knoben, Wouter Johannes Maria, Martyn P. Clark, Jerad Bales, Andrew Bennett, S. Gharari, Christopher B. Marsh, Bart Nijssen, et al. 2021. “Community Workflows to Advance Reproducibility in Hydrologic Modeling: Separating Model-Agnostic and Model-Specific Configuration Steps in Applications of Large-Domain Hydrologic Models.” Earth and Space Science Open Archive. https://doi.org/10.1002/essoar.10509195.1.

Nativi, Stefano, Paolo Mazzetti, and Max Craglia. 2021. “Digital Ecosystems for Developing Digital Twins of the Earth: The Destination Earth Case.” Remote Sensing 13 (11): 2119.

Nearing, Grey S., Yudong Tian, Hoshin V. Gupta, Martyn P. Clark, Kenneth W. Harrison, and Steven V. Weijs. 2016. “A Philosophical Basis for Hydrological Uncertainty.” Hydrological Sciences Journal 61 (9): 1666–78.

Rigon, Riccardo, Giuseppe Formetta, Marialaura Bancheri, Niccolò Tubini, Concetta D’Amato, Olaf David, and Christian Massari. 2022. “HESS Opinions: Participatory Digital Earth Twin Hydrology Systems (DARTHs) for Everyone: A Blueprint for Hydrologists.” Hydrology and Earth System Sciences Discussions, 1–38.

Rizzoli, A. E., M. G. E. Svensson, E. Rowe, M. Donatelli, R. M. Muetzelfeldt, T. van der Wal, F. K. van Evert, and F. Villa. 2006. “Modelling Framework (SeamFrame) Requirements.” SEAMLESS.