Thursday, February 23, 2012

Which hydrological model (is better)? - Q & A

A few months ago, I received an email in which the following questions were posed, in the context of simulating the discharge of many (small) catchments in Spain. The author planned, and I agree with him, to generate sequences of meteorological forcings with some models he did not specify^1, producing time series of hydro-meteorological data with assigned statistics, and then to use them to drive a hydrological model. In this respect he asked:

"Q1. Would you suggest, from the point of view of computational time, using distributed and continuous models (like SHE), given that we plan to use weather time series thousands of years long? Personally I see the danger of being overwhelmed by data, and by computational times so long that we will not be able to perform all the analyses we require with adequate rigor (sensitivity analysis, and so on ...)."

A1 - Different people have different ideas of what a distributed model is. Kampf and Burges (2007) offered a review a few years ago. However, taking as a reference our GEOtop, which is probably one of the more complex existing hydrological models, we can observe that it runs, on our laptop, a year-long simulation for a 10-20 square kilometer basin at 10 m resolution in, say, a day. So, simulating 1000 years would require approximately 3 years of computing, which is clearly too long for any project. Using a faster machine would probably cut the time by a factor of two. GEOtop is not parallelized, so, after an investment in rewriting the code, we could probably cut the simulation time by a factor of 100 by also using large parallel computers. Each year of computing would then shrink to 3-4 days, which could be feasible. But this is obviously wishful thinking. Other models, like SHETRAN, DHSVM or tRibs, are probably already faster than GEOtop, but possibly less adequate than GEOtop for simulating some of the processes. Besides, the timing above does not include the (wo)men/months required for characterizing the basin and for collecting and organizing the data, which would take additional time. So, at present, it would be impossible to use GEOtop for such a task; maybe some other optimized model, on a multiprocessor server, could do it. It remains, however, a long-term objective for us to pursue. ^2
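The arithmetic behind this estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the one day of computing per simulated year and the speed-up factors of 2 and 100 are the rough guesses given in the answer, and the function name is purely illustrative.

```python
# Rough wall-clock estimate for a long continuous simulation.
# All figures are the order-of-magnitude guesses from the answer above.
DAYS_PER_SIMULATED_YEAR = 1.0   # about one day of computing per simulated year
SIMULATED_YEARS = 1000


def wall_clock_days(simulated_years, days_per_year=DAYS_PER_SIMULATED_YEAR, speedup=1.0):
    """Computing time in days, given an overall speed-up factor."""
    return simulated_years * days_per_year / speedup


laptop = wall_clock_days(SIMULATED_YEARS)                  # plain laptop run
faster = wall_clock_days(SIMULATED_YEARS, speedup=2)       # faster machine
parallel = wall_clock_days(SIMULATED_YEARS, speedup=100)   # hypothetical parallel code

for label, days in [("laptop", laptop), ("faster machine", faster), ("parallel", parallel)]:
    print(f"{label}: {days:.0f} days of computing ({days / 365.25:.2f} years)")
```

Note that this counts a single run only: a sensitivity analysis or a calibration multiplies these figures by the number of runs.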

Q2. Would it be more feasible to consider a lumped or semi-distributed model? In this case, consider that the interest lies only in describing the hydrological behavior during floods. Which kind of infiltration scheme would you suggest? Would you suggest a model like the Sacramento Soil Moisture Accounting model? Would it be accurate enough in its forecasts? Or would it be better to use some model based on the solution of the Richards equation, maybe using a space-averaged characteristic curve?

A2 - I already gave part of the answer in responding to Q1: given that any model is, in a sense, a conceptualization, you need to choose an approach lighter than GEOtop. I also addressed this question in the presentation I gave in Montpellier. So, you need to use a semi-distributed model.
Using a solver of the Richards equation would essentially take you back to the option of Q1, with its problems, unless you think of a 1-D solver coupled with a 2-D groundwater equation. In any case you would also need a module for runoff and channel routing to patch together with the subsurface flow modules. Unless you can build it yourself, you need to find a model that already does it. tRibs or TopoFlow, for instance, should fall in this category. We are working toward a similar solution with our integrator of the Boussinesq equation, but it is still "work in progress".
Thinking of semi-distributed models for solving your problem, you need to look at a model like our JGrass-NewAGE. There, however, "the processes at hillslope scale are strongly conceptualized, where representation of the physics is minimised, and the resolution in space and time is maximized, and the focus is upon predicting emergent behaviors rather than system details" [Lanni, 2012]. Other models of reference are Topkapi, and those used in the Distributed Model Intercomparison Project (DMIP) [Reed et al. 2004]. I do not know how the Sacramento model really performs, since I have never used it. But my inclination is for other models of more modern conception.^3

Q3. In relation to Q2, do you think that the theory by Reggiani et al. on the REW concept (Representative Elementary Watershed) is mature enough to be used outside academia?

A3 - Obviously Paolo would say yes, it is mature, but you have to read his literature to understand whether it is. One limit I found in Paolo, Siva and Majid's original paper was that the identification of the REWs was left unresolved. I know that Paolo also deployed a model (which I do not know in detail), but the conditions of its availability are not clear. Similar to the REW idea, though probably formulated in a less fancy way, is the concept of Hydrologic Response Units (HRUs), for which you can find references in the JGrass-NewAGE paper, and which has many implementations in the work, for instance, of Peter Krause and Daniel Viviroli. Roger Moussa's MITHAS, although in a different context, also follows the same idea.

In any case you have to decide the time step at which you want your response. I was assuming that you were interested in relatively small catchments, for which it is mandatory to have hourly or sub-hourly discharges. If you are interested in a more aggregated time response, i.e. daily discharges, other models could work. Personally I have prejudices against this kind of "physical" model, in the sense that, I believe, the physics of flood generation in small catchments works at time scales shorter than a day. However, many models, SWAT being one, seem to work^4.

Q4. In case I decide to use semi-distributed models, which IUH method and which infiltration method would you use?

A4 - Using the IUH is a different game altogether. See the reference here, for instance. IUH and GIUH are models of flow peaks, in which many assumptions are made and taken for granted. However, the theory, as you noticed, leaves out the determination of infiltration. We have an IUH model, called Peakflow, and there we use a provocative Dunnian saturation excess scheme for generating runoff. Indeed, in Peakflow one can theoretically use whatever method s/he wants (and you are left with the problem of choosing one), even classical bucket-type models. But, so far, we have not implemented them. Some friends use SCS, but they take the curve numbers from a calibration process, and not from the tables of the original system. Others use a bucket model (which they call Green-Ampt if they regulate the filling of the bucket with the Green-Ampt scheme). These last methods can be directly applied to cut the rainfall and to produce an effective precipitation. In any case, I would prefer a continuous method for your modeling, and use GIUH or IUH for checking the results. A definitive guide on GIUH is this paper.
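To make the idea of "cutting the rainfall" and then routing it concrete, here is a minimal sketch, not the Peakflow implementation. It assumes the standard SCS curve number formula with an initial abstraction of 0.2 S, and it replaces the (G)IUH with the simplest possible unit hydrograph, that of a single linear reservoir. The storm, the curve number of 80 and the reservoir constant are hypothetical values; in the spirit of the answer, the curve number would come from calibration.

```python
import math


def scs_cn_effective_rainfall(rain_mm, curve_number, ia_ratio=0.2):
    """Incremental effective rainfall (mm per step) from the SCS curve number method."""
    s = 25400.0 / curve_number - 254.0   # potential maximum retention, mm
    ia = ia_ratio * s                    # initial abstraction, mm
    increments, cum_p, cum_q_prev = [], 0.0, 0.0
    for p in rain_mm:
        cum_p += p
        # The SCS relation holds for cumulative depths, so runoff per step
        # is the increment of cumulative runoff.
        cum_q = (cum_p - ia) ** 2 / (cum_p - ia + s) if cum_p > ia else 0.0
        increments.append(cum_q - cum_q_prev)
        cum_q_prev = cum_q
    return increments


def linear_reservoir_uh(k_steps, n_steps):
    """Discrete unit hydrograph of one linear reservoir, u(t) = exp(-t/k) / k.

    Each ordinate is the integral of u over one time step; the truncated
    series is renormalized so that volume is conserved.
    """
    weights = [math.exp(-i / k_steps) - math.exp(-(i + 1) / k_steps) for i in range(n_steps)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve(pe_series, uh):
    """Discrete convolution of effective rainfall with the unit hydrograph."""
    out = [0.0] * (len(pe_series) + len(uh) - 1)
    for i, pe in enumerate(pe_series):
        for j, u in enumerate(uh):
            out[i + j] += pe * u
    return out


rain = [0.0, 5.0, 20.0, 35.0, 10.0, 0.0]                       # hypothetical hourly storm, mm
effective = scs_cn_effective_rainfall(rain, curve_number=80)   # CN is a calibrated guess
uh = linear_reservoir_uh(k_steps=2.0, n_steps=12)              # hypothetical 2-hour reservoir
runoff = convolve(effective, uh)                               # runoff depth per step, mm

print([round(q, 2) for q in runoff])
```

The output is a runoff depth per time step; multiplying by the catchment area and dividing by the time step gives a discharge. Swapping the curve number function for a bucket filled with a Green-Ampt rule would leave the routing part untouched.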

Q5. Would a model based on the topographic index or the TOPMODEL assumptions be suitable for this type of analysis?

A5 - The topographic index scheme is a runoff production scheme^5 more than a rainfall-runoff model. It comes with its own limits. It was turned early on into a rainfall-runoff model [e.g. Beven, 2000, Franchini et al., 1996], but the core theory does not deal with flood wave propagation and aggregation (the GIUH does). Indubitably, it works^4 for producing the runoff volume, and I myself used the topographic index to assess the initial wetness condition in a catchment in the Peakflow paper (and model) and in D'Odorico and Rigon [2003]. The reason it works is that, for small basins, as we argue in some of our papers, the residence time of water in channels is negligible compared to the residence time in hillslopes. Therefore, when you account properly for the timing of runoff production, and since the production of discharge is an aggregation process in which, in the end, just the total "effective volume" counts, you have treated most of the information you need to accomplish the task of predicting discharges. In any case, to be really usable, the topographic index needs to be integrated with other features. In the past some did it, but I would not push it further. I would use, at least, a dynamic topographic index, as suggested by Barling, Moore and Grayson [1994] and improved by many others.
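For readers who have never computed it, the static topographic index in its usual form is ln(a / tan β), where a is the upslope contributing area per unit contour length and β the local slope. The sketch below only shows this static index on two hypothetical cells (the values are made up); it is not the dynamic index recommended above, and it says nothing about routing.

```python
import math


def topographic_index(specific_area_m, slope_rad):
    """Static topographic index ln(a / tan(beta)).

    specific_area_m: upslope contributing area per unit contour length (m)
    slope_rad: local slope angle in radians (must be positive)
    """
    if slope_rad <= 0.0:
        # Flat cells need a minimum slope to be imposed before using the index.
        raise ValueError("slope must be positive")
    return math.log(specific_area_m / math.tan(slope_rad))


# Two hypothetical cells: a gentle convergent hollow and a steep ridge.
hollow = topographic_index(500.0, math.radians(3.0))
ridge = topographic_index(20.0, math.radians(25.0))

# The larger the index, the more prone the cell is to saturation.
print(f"hollow: {hollow:.2f}, ridge: {ridge:.2f}")
```

Ranking the cells of a catchment by this index is what allows one to guess which part of the basin saturates first, which is the sense in which the index is a runoff production scheme.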

Finally, do not forget that, in mountain areas, snow accumulation (as a source of abstraction of precipitation) and snow melting (as a source of discharge) are important issues to resolve. This complicates the picture, and makes the many models that do not account for snow not really usable.

If you read this post, you are probably interested also in the Reservoirology one.


^1 - That of weather generators is indeed an interesting topic, which I will cover sooner or later.

^2 - Another issue regards the calibration of GEOtop. Even if it is a distributed model, the parameters in its equations are certainly effective parameters (e.g. Beven, 1989), and therefore a certain amount of calibration is needed for it to properly reproduce fluxes. Calibration is a time-consuming process, which could also be overwhelming in the context of distributed models like GEOtop. In fact, experts on calibration issues often use very unrealistic models in their papers, which cannot be taken seriously by those who want to use hydrological models operationally.

^3 - Each of my colleagues has his own model, which is indeed not a proof of the maturity of our community: many of them work fairly well, but few of them are really available, and even fewer are usable at operational level. So, around the world, everybody uses HEC-HMS. What we are trying to do with our involvement with OMS3 is to change this situation (e.g. the posts on: Going beyond the present state-of-art, Reproducible Research).

^4 Klemes in his "Dilettantism in Hydrology: Transition or Destiny", argued that: "MODELS THAT WORK WELL are THE GREATEST DANGER TO PROGRESS IN HYDROLOGY. For a good mathematical model it is not enough to work well. It must work well for the right reasons ..."

^5 Or, if you see it from the soil point of view, it is a subsurface flow model.


  1. C. S., who posed the questions, wrote to me again; I translate and post:


    Thank you so much for the answers. We will finish our studies in June (first results are encouraging), and I will tell you which decisions we made (we used a semi-distributed model) etc.

    Doing this work, I tried to rewrite a program I wrote during my studies on hydrology. Right now, I am solving the hydrodynamics (using the kinematic and/or parabolic approximation of the flood wave, by linearizing the de Saint-Venant equations). As soon as I find a little time, I will devote myself to the problem of infiltration.

    In the meantime I am sending some considerations that derive from my personal experience with programming, simulating and analysing the physical phenomena under study.

    1. I have seen many people using Excel for doing complicated things. I analyzed spreadsheets so obscure that it was impossible to decipher them. If someone does not want to program, I would use R, OCTAVE/SCILAB, and GNUPLOT. Programming in Python would be another option, which easily generates reusable code (for minor problems).

    2. Writing programs to analyse physical phenomena requires a lot of time. For this reason, when you do it, it is necessary to write understandable (commented?) and reusable code: I have seen allegedly professional codes using just global variables and obsolete programming techniques. If one wants to write "useful" software (which could be used more than once), the effort of writing readable code is worthwhile. Nowadays, all programming languages allow for writing good code. Even in FORTRAN it is possible to write elegant, reusable, object-oriented code!

    3. Lately I spent a lot of time looking for information on the internet, mainly about infiltration. I came to the conclusion that, if one counts one's own time in the budget, buying a book can save money!

    4. Among other things, I read a lot of literature, and specifically about REW. My conclusion is that the theory is very elegant, but quite complicated to apply to real cases, especially with the data we usually have. My doubt is: how can we define a REW, and how can we check that the fluxes among the REWs are reasonable, and extensible to other cases? ….

    5. In my opinion, the phrase "For any complex problem, there is always a simple solution. Which is wrong" fits the Curve Number method perfectly.

    6. IMHO, with good data and a little practice it is actually not so difficult to reproduce rainfall-runoff behaviors. However, a model gains credibility only from its ability to forecast events that were not included in the observations from which the model was built.

    7. For the above reason it is always necessary to know deeply the hypotheses, explicit or implicit, that a model makes, and always to ask ourselves whether we are using the most adequate tool, so as to avoid falling into the situation described by: "If you only have a hammer, you tend to see every problem as a nail".

    For this reason, when I start to solve a new practical problem, I always try to read up on it and exchange opinions with more expert people. Thank you for answering, I can tell you that your answers were very useful.

    Best regards,

    C. S

    1. Dear C.S., let me add a few comments on your reflections:

      1 - Writing programs oneself is a useful exercise. But sometimes it would be better to choose programs already made .... but flexible enough to be used for our specific tasks. For this reason we are making a huge effort with JGrass-NewAGE (see the related posts) to provide the end user not with models but with "components".

      2 - Commenting code is necessary and can also be a nice thing to do. Literate programming can be a pleasure.

      3 - I really love the example of the hammer and the nail: I know so many cases that fall into this pattern. Sometimes, even the best science.