AboutHydrology: Meledrio, or a simple reflection on Hydrological modelling - Part VI

Tuesday, October 31, 2017

Meledrio, or a simple reflection on Hydrological modelling - Part VI - A little about calibration

The normal calibration strategy is to split the data we want to reproduce into two sets:

one for the calibration phase
one for the "validation" phase

Let's assume that we have an automatic calibrator. It usually:

generates a set of model parameters,
estimates the discharges with the rainfall-runoff hydrological model for that set of parameters,
compares what is computed with what is measured by using a goodness-of-fit indicator,
keeps the set of parameters that gives the best performance,
repeats the operation a huge number of times (and uses some heuristics to search for the best set overall).

This set of parameters is the one used for "forecasting", and it is now run against the validation set to check its performance.
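The loop described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the one-parameter linear reservoir, the Nash-Sutcliffe efficiency as goodness-of-fit indicator, the plain random search and the synthetic data are all my choices for the example, not the model or the calibrator actually used for Meledrio.

```python
import random

def linear_reservoir(rain, k):
    """Toy rainfall-runoff model (assumption): storage drains at rate k per step."""
    storage, discharge = 0.0, []
    for p in rain:
        storage += p
        q = k * storage
        storage -= q
        discharge.append(q)
    return discharge

def nse(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, lower is worse."""
    mean_obs = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def score(k, rain, observed, start, stop):
    """Simulate from the beginning, but judge the fit only on [start, stop)."""
    sim = linear_reservoir(rain[:stop], k)
    return nse(sim[start:stop], observed[start:stop])

def calibrate(rain, observed, start, stop, n_trials=1000, seed=0):
    """Plain random search: generate, simulate, compare, keep the best."""
    rng = random.Random(seed)
    best_k, best_score = None, float("-inf")
    for _ in range(n_trials):
        k = rng.uniform(0.01, 0.99)           # generate a parameter set
        s = score(k, rain, observed, start, stop)
        if s > best_score:                    # keep the best so far
            best_k, best_score = k, s
    return best_k, best_score

# Synthetic "measurements" (assumption): known k plus a little noise.
rng = random.Random(42)
rain = [rng.expovariate(1.0) if rng.random() < 0.3 else 0.0 for _ in range(400)]
true_k = 0.2
observed = [q + rng.gauss(0.0, 0.02) for q in linear_reservoir(rain, true_k)]

# Two-set strategy: first half for calibration, second half for validation.
half = len(rain) // 2
best_k, cal_score = calibrate(rain, observed, 0, half)
val_score = score(best_k, rain, observed, half, len(rain))
print(f"best k = {best_k:.3f}, calibration NSE = {cal_score:.3f}, "
      f"validation NSE = {val_score:.3f}")
```

A real calibrator would replace the random draw with a smarter search heuristic, but the generate, simulate, compare, keep structure stays the same.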

However, my experience (with my students, who usually perform it) is that the best parameter set in the calibration procedure is usually not the best in the validation procedure. So I suggest, at least as a trial and for further investigation, to:

separate the initial data set into 3 parts (one for first calibration, one for selection, and one for validation).
Select the best-performing 1% (or x%, where x is left to your decision) of the parameter sets in the calibration phase (this is called the behavioural set). Then sieve it further by keeping the best-performing 1% of the behavioural set in the selection phase, which leaves one in 10^4 of the original parameter sets.
Use this one in ten thousand in the validation phase.

The hypothesis to test is that this three-step way of calibrating usually returns better performance in validation than the original two-step one.
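To make the three-step sieve concrete, here is a hedged sketch. Everything in it is an assumption made for illustration: the toy linear reservoir, the Nash-Sutcliffe efficiency, the uniform random candidates, the synthetic data and the helper names (`top_fraction`, `three_step`). With 10^4 candidates and 1% kept at each step, exactly one parameter set survives to validation. The two-step result is computed on the same validation part for comparison.

```python
import random

def simulate(rain, k):
    """Toy linear reservoir (assumption): storage drains at rate k per step."""
    storage, out = 0.0, []
    for p in rain:
        storage += p
        q = k * storage
        storage -= q
        out.append(q)
    return out

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, lower is worse."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def segment_score(k, rain, obs, start, stop):
    """Simulate from the beginning, score only the segment [start, stop)."""
    sim = simulate(rain[:stop], k)
    return nse(sim[start:stop], obs[start:stop])

def top_fraction(candidates, scores, fraction):
    """Best-scoring `fraction` of the candidates, best first (at least one)."""
    n_keep = max(1, round(len(candidates) * fraction))
    order = sorted(range(len(candidates)), key=scores.__getitem__, reverse=True)
    return [candidates[i] for i in order[:n_keep]]

def three_step(rain, obs, n_candidates=10_000, fraction=0.01, seed=0):
    rng = random.Random(seed)
    n = len(rain)
    a, b = n // 3, 2 * n // 3      # calibration | selection | validation
    candidates = [rng.uniform(0.01, 0.99) for _ in range(n_candidates)]

    # Step 1: calibration part, keep the behavioural set.
    cal = [segment_score(k, rain, obs, 0, a) for k in candidates]
    behavioural = top_fraction(candidates, cal, fraction)

    # Step 2: selection part, sieve the behavioural set again.
    sel = [segment_score(k, rain, obs, a, b) for k in behavioural]
    selected = top_fraction(behavioural, sel, fraction)

    # Step 3: validation part, for the survivor and for the two-step choice
    # (the single best set in calibration, i.e. behavioural[0]).
    return {
        "n_behavioural": len(behavioural),
        "k_three_step": selected[0],
        "val_three_step": segment_score(selected[0], rain, obs, b, n),
        "k_two_step": behavioural[0],
        "val_two_step": segment_score(behavioural[0], rain, obs, b, n),
    }

# Synthetic data (assumption): known k = 0.2 plus measurement noise.
rng = random.Random(1)
rain = [rng.expovariate(1.0) if rng.random() < 0.3 else 0.0 for _ in range(300)]
obs = [q + rng.gauss(0.0, 0.02) for q in simulate(rain, 0.2)]

result = three_step(rain, obs)
for name, value in result.items():
    print(name, value)
```

A single synthetic run like this proves nothing about the hypothesis: testing it would mean repeating the comparison over many catchments, data splits and models, and with a one-parameter toy the two strategies will often pick nearly the same value.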

1 comment:

wuletawu, November 1, 2017 at 8:10 AM
I have a feeling that this is more of a problem for semi-distributed models, but it would be useful to test it by comparing fully-distributed and semi-distributed models. In semi-distributed models, important variables such as vegetation and land cover may not be included as time-series data, but they do change significantly within a few years (e.g. in 10 years), with a significant impact on the hydrological processes (on evapotranspiration and vegetation). So, obviously, the parameters used 10 years earlier cannot be the same 10 years later.