Tuesday, September 3, 2013

The JGrass-NewAGE informatics

As I tried to convey in previous posts, for more than five years I have been working on the idea of building a hydrological model by components. Well, this should have been the first paper in the series but, as often happens, it is the fourth to have been submitted.

The paper, by Formetta et al., contains the main ideas behind this type of modelling and shows that the system actually works: it is not just a matter of speculation. We did it. At the moment it is at a late stage of review at Environmental Modelling & Software, and we will eventually ask for it to be open access.

The rationale of the paper is expressed in its introduction:

"Scientists increasingly demand the availability of simulation models’ source code, since it has become a key factor for the understanding, validation, and advancement of science (e.g. Ince et al. [2012]). However, this is not enough: even if the source code is available, the growing complexity of modelling code makes model development challenging to understand and manage. In fact, while model code distribution is a matter of policies (e.g. Annan et al. [2013]), external inspection and analysis of models, improvement and contribution are difficult or even impossible when the software is inadequately engineered. The implementation of many intimately interlinked environmental processes (such as snow, runoff production and evapotranspiration, in the hydrological case) is usually difficult to understand per se, but writing models in traditional monolithic forms, as defined in Rizzoli et al. [2005], makes their implementations overwhelmingly hard to follow, and the models themselves practically impossible to verify (Quesnel et al. [2009]). As a matter of fact, traditional modelling practices preclude easy understanding, rapid reuse and improvement of the source code, and eventually hinder the seamless advancement of science.

Part of models’ obscurity has its foundation in bad documentation practices, and many research communities that rely on computational methods and techniques as part of their day-to-day activities have proposed shared protocols, like the so-called Overview, Design concepts, Details method (ODD, e.g. Grimm et al. [2006]), to improve the effectiveness of documentation. However, “reproducible-research systems” (RRSs), which make it easier to document every step of research, from model implementation and data preparation to output analyses, would greatly help the adoption of correct policies. Indeed, from an RRS we would expect, besides the sharing of model code, tools that allow researchers, on the one hand, to repeat simulations under the same conditions and, on the other hand, to spend more time on their own science.

Many software infrastructures or modelling frameworks (MF) were actually designed and built to streamline the process of sound scientific production (e.g. Wesseling et al. [1996], Argent [2005], Rizzoli et al. [2005]). Among those that specifically target the support of hydrological modelling are the Spatial Modelling Environment (SME, Maxwell and Costanza [1997]), The Invisible Modelling Environment (TIME, http://www.toolkit.net.au/Tools/TIME) and hydrological derivative tools like E2 (Argent et al. [1999]), OpenMI (http://www.openmi.org/, Moore and Tindall [2005]), the Object Modelling System (OMS, David et al. [2002, 2013]), the Common Component Architecture (CCA, Bramley et al. [2000]) and the Earth System Modeling Framework (ESMF, Hill et al. [2004]).

However, most of the above MF involve quite a steep learning curve that not all scientists, even proficient modellers, are willing to climb. Therefore, in order to ease the transition into modern programming environments, some modelling efforts and projects have recently focused on providing code generation support and on reducing the invasiveness of frameworks (Lloyd, 2010) into the model. In particular, the third version of OMS and the BioMA project (BioMA, 2012) reveal promising perspectives.

An RRS would not be complete without data visualization. Gardner and Manduchi (2007), among others, emphasize that, in order to optimize scientific productivity, an RRS infrastructure should include not only the computational cores but also the visualization and data-processing tools necessary to synthesize knowledge from high volumes of inputs and outputs.

Indeed, Geographic Information Systems (GIS) have long been the tools of choice for the visualization of hydrological processes (Maidment [1993]; Grayson et al. [1992]). However, traditional GIS are usually designed for managing static, non-temporal information layers. They are not designed to interact with dynamic modelling (e.g. Burrough et al. [1998]; Wesseling et al. [1996]). In fact, the interaction between models and GIS can be described as “off-line”, and it is performed with integration strategies that affect either the functionality of GIS tools or the usability of models.

Instead, the MF listed above offer a proper abstraction to streamline the interaction with a GIS. They promote the separation of the model into modules or components, each with a well-defined way to interact with the others through specified interfaces. Through their interfaces the modules can communicate and exchange data.

Therefore it is also timely for GIS and hydrological model components to constitute a pool of interoperable tools that can be blended together to create software that is accurately tailored to geosciences."
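
To give a flavour of what "modelling by components" means in the passage above, here is a minimal sketch in Python. It is not the OMS or JGrass-NewAGE API: the `Component` class, the `execute` function, the variable names and the toy formulas are all hypothetical, made up only to show components that exchange data exclusively through declared inputs and outputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical component: a process plus its declared interface."""
    name: str
    inputs: List[str]    # names of the variables the component reads
    outputs: List[str]   # names of the variables the component writes
    run: Callable[[Dict[str, float]], Dict[str, float]]


def execute(components: List[Component], data: Dict[str, float]) -> Dict[str, float]:
    """Run components in order; they see only their declared inputs."""
    state = dict(data)
    for c in components:
        missing = [k for k in c.inputs if k not in state]
        if missing:
            raise KeyError(f"{c.name} is missing inputs: {missing}")
        result = c.run({k: state[k] for k in c.inputs})
        state.update({k: result[k] for k in c.outputs})
    return state


# Toy processes with illustrative (not physically calibrated) formulas.
snowmelt = Component(
    "snowmelt", ["temperature", "snowpack"], ["melt"],
    lambda x: {"melt": min(x["snowpack"], max(0.0, 2.0 * x["temperature"]))},
)
runoff = Component(
    "runoff", ["rain", "melt"], ["runoff"],
    lambda x: {"runoff": 0.5 * (x["rain"] + x["melt"])},
)

result = execute(
    [snowmelt, runoff],
    {"temperature": 3.0, "snowpack": 10.0, "rain": 4.0},
)
print(result["runoff"])
```

Because each component only declares what it reads and writes, the toy snowmelt formula could be swapped for another one without touching the runoff component, which is the kind of reuse the paper argues for.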

The revised version of the paper prior to publication, in PDF format, can be retrieved from here (or by clicking below the picture). The final version is here instead.

2 comments:

  1. I would be very interested in reading this, though unfortunately the link leads to a Dropbox 404 error, at least it does for me!
    Could you recheck?

    Greetings from Germany,
    Laura

  2. I fixed the link. Now the paper is downloadable. Please remember that everything about OMS and JGrass-NewAGE will be covered at the school announced at: http://abouthydrology.blogspot.it/2013/08/a-school-to-lear-hot-to-model-with.html
