Tuesday, July 26, 2011

Quantifying uncertainty

I saw on EOS an announcement regarding a new website, QUEST (Quantifying Uncertainty in Ecosystem Studies). Hydrology, too, needs an effort to quantify uncertainties, but the issue is usually ignored.

I have been circling around this issue for quite a few years, and next year I will probably try to dig a little deeper into the literature.

There are at least three sources of uncertainty in hydrological modeling:

- the input data (which can derive from chaotic dynamics)
- the approximations contained in the equations
- the parameterization of constants that are in fact heterogeneous (highly variable, if not random, in space)
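To see how these sources combine, a minimal Monte Carlo sketch can help: perturb both the input and a parameter of a toy rainfall-runoff model and look at the spread of the output. The linear-reservoir response and the error magnitudes below are purely illustrative assumptions, not any model actually used in hydrology.

```python
import random
import statistics

def runoff(rain, k):
    """Toy linear-reservoir response: runoff proportional to rainfall
    (illustrative only)."""
    return k * rain

random.seed(42)
rain_obs = 10.0   # measured rainfall [mm], assumed uncertain
k_mean = 0.6      # runoff coefficient, assumed heterogeneous in space

samples = []
for _ in range(10000):
    rain = random.gauss(rain_obs, 2.0)   # input-data uncertainty
    k = random.gauss(k_mean, 0.1)        # parameter uncertainty
    samples.append(runoff(rain, k))

print(statistics.mean(samples))   # close to 0.6 * 10 = 6 mm
print(statistics.stdev(samples))  # combined spread from both error sources
```

Even in this caricature, the output is a distribution rather than a number, which is the whole point of quantifying uncertainty.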

When thinking of inputs, the paradigm is rainfall. It is usually estimated at just a few points in the domain, with large errors. The local estimates then need to be interpolated and extrapolated in space, introducing further errors. Rainfall itself is very irregular in time and space at all scales, which means that you can capture only the statistics of its behavior (and this leaves you out in the cold with yet other errors).
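The interpolation step alone already injects a modeling choice. Here is a minimal inverse-distance-weighting sketch; the gauge locations and rainfall values are made up for illustration:

```python
def idw(x, y, gauges, power=2.0):
    """Inverse-distance-weighted rainfall estimate at point (x, y).
    gauges: list of (gx, gy, rain) tuples."""
    num = den = 0.0
    for gx, gy, rain in gauges:
        d2 = (x - gx) ** 2 + (y - gy) ** 2
        if d2 == 0.0:
            return rain  # exactly on a gauge: return its value
        w = d2 ** (-power / 2.0)
        num += w * rain
        den += w
    return num / den

# Three hypothetical rain gauges [mm/h]
gauges = [(0.0, 0.0, 5.0), (10.0, 0.0, 8.0), (0.0, 10.0, 2.0)]
print(idw(5.0, 5.0, gauges))  # estimate between the gauges
```

Changing the `power` exponent changes the estimate everywhere except at the gauges themselves: the "measured" field depends on an arbitrary choice.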

When looking at flows, i.e. at their mathematical description by equations, one has to remember that these descriptions are eminently thermodynamical products, in which some fluctuations must be neglected and described through suitably averaged properties, which may not always be possible (or significant).
Besides, the system described is usually made up of many non-linearly connected subsystems, and in practical implementations the nonlinearities and the feedbacks are simplified or even neglected. Moreover, the equations need to be discretized on a grid, which itself introduces approximations.

Finally, when saying that some processes are governed by heterogeneities, we are also stating that the information they contain is algorithmically incompressible (e.g. in the sense of Chaitin): there is no way to represent it with short strings. The latter syndrome is the one well described by Borges in "On Exactitude in Science", but also in Noam Chomsky's book Rules and Representations, where at page 8 he cites Steven Weinberg and goes deep in asking whether we can really know reality, and what that means.

In any case, the hydrological community started to take care of this long ago (here is a recent abstract with, hopefully, a good literature review, and here the work by Beven, Gupta and Wagener), but the effort focused especially on the assessment of parameter uncertainty (even if GLUE claims a more general validity). A recent assessment is also in this work by Goetzinger and Bardossy, which can provide access to further concepts and bibliography.

However, many hydrological models produce just time series, and therefore assessing uncertainty reduces to understanding (and often comparing) a pair of time series: the measured one and the modeled one. Good hydrological models are those that reproduce the measured series with good agreement. This is quantified, often but not always, with the use of indexes: the mean square error or its root, the Nash-Sutcliffe efficiency, the minimax objective function, the mean absolute percentage error, the index of agreement, and the coefficient of determination are a few of them.
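Two of the indexes just mentioned, sketched for a pair of toy series (the discharge numbers are invented):

```python
def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return (sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)) ** 0.5

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model
    is no better than predicting the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

obs = [1.0, 3.0, 2.0, 5.0, 4.0]   # invented discharge values
sim = [1.2, 2.7, 2.3, 4.6, 4.1]
print(rmse(obs, sim))
print(nash_sutcliffe(obs, sim))
```

Note that both indexes condense the whole comparison into a single number, which is exactly the narrowness discussed next.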

This is certainly a narrow perspective on the topic. Both the measured and the simulated series are, in fact, affected by errors, and therefore one should not compare the two series directly, but rather the series together with their errors. I believe this would amount to adopting a Bayesian perspective on the problem (e.g. D'Agostini 2003, Bayesian Reasoning in Data Analysis: A Critical Introduction) and would turn into data assimilation (e.g. Kalnay, Atmospheric Modeling, Data Assimilation and Predictability, 2003), with the defect that, at this point, data and models become so entangled that it would be difficult to extricate them (but not impossible, I guess).

We can also observe that a model usually produces more than a single time series. So "a prediction" becomes "predictions", and the uncertainty spreads across all of them.

Besides, we did not even mention spatial patterns: before making claims about their uncertainty, we have to recognize that we should first quantify the patterns themselves. How can we do it? And, by extension, are we able to identify spatio-temporal patterns? And can we then decide whether two of these patterns are the same (neglecting noise)? Indicators of statistical equality would probably give miserable scores if applied to two- or three-dimensional fields.
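A tiny illustration of why cell-by-cell indicators fail on fields: shift a "rain band" by one grid cell and a pointwise score collapses, even though a human would call the two patterns the same. The fields are invented.

```python
def field_nse(obs, sim):
    """Cell-by-cell Nash-Sutcliffe efficiency applied to a 2D field
    (the field is simply flattened and compared pointwise)."""
    o = [v for row in obs for v in row]
    s = [v for row in sim for v in row]
    mean_o = sum(o) / len(o)
    num = sum((a - b) ** 2 for a, b in zip(o, s))
    den = sum((a - mean_o) ** 2 for a in o)
    return 1.0 - num / den

# A vertical "rain band" and the same band shifted one column to the right
band         = [[0, 9, 0, 0] for _ in range(4)]
shifted_band = [[0, 0, 9, 0] for _ in range(4)]

print(field_nse(band, band))          # perfect score: 1.0
print(field_nse(band, shifted_band))  # miserable, despite an identical shape
```

The one-cell shift gives a strongly negative efficiency, i.e. "worse than predicting the mean", which is clearly not how we would judge the two patterns by eye.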

Does anyone have ideas?

Wednesday, July 13, 2011

New GEOtop presentations given at the Summer School on Surface Hydrology in Marsico Nuovo

Thanks to Salvatore Manfreda for organizing this event, which I found fruitful and interesting. General information on the Summer School can be found here. Reported below is just the sequence of my seminars: actually a five-part seminar given in four hours.

The first hour covers the motivation behind GEOtop and its structure in terms of grids, equations, and boundary conditions.

The second hour covers snow modeling (almost all of it, excluding snow compaction), with its equations and boundary conditions.

The third and fourth hours cover the Richards equation (above), its extension to deal with saturated and freezing soils, and some material regarding landslide modeling with (and without) GEOtop (below).

The material is far from complete, but it is better to have it now, a little broken, than never. The authors ask people using this material to cite the appropriate GEOtop papers, which appeared in the Journal of Hydrometeorology, Hydrological Processes, and recently in The Cryosphere. The GEOtop user manual will be available soon. Keep watching the blog!

Monday, July 11, 2011

uDig got a spatial toolbox

I think of it as a milestone to which I contributed a little over the years, even if the full merit goes to Andrea Antonello and Silvia Franceschi: "the Hydrologis".

The idea was to give a transparent (encapsulated) way to add spatial models to a GIS. As told in previous posts, the way was found by following the OMS3 framework ideas, after having tried hard with OpenMI. Andrea did more: using OMS3 annotations, he automatically generates the input-output interface, and manual-like help, for any OMS3-compliant module. All the information can be found at the uDig spatial toolbox (a.k.a. OMS3box) page. However, do not use the 0.7.1 code: use the more recent one that can be found at the jgrasstools download page.

I do not know if Andrea and Silvia fully realize the importance of what they created. It is a big jump towards a new type of GIS, in which the usual paradigms for connecting models, data, and visualization are suddenly changed.

Researchers can now program their models following the OMS3 guidelines and have them fully endowed with graphical I/O and help, without taking care of the details of implementing any of it.

Certainly, programming a jgrasstools module is still a challenge for novices, and much work has to be done to smooth the learning curve. Especially writing manuals ;-)

Great work, Andrea and Silvia: congratulations!