Tuesday, August 30, 2016

Difference between a blog post and a scientific paper

I got it from my student Aaron Iemma. I think it summarises many things, and it is worth reading.
Actually, when I read it for the first time, I was paying attention not to the title of the diagram but to the type of scientist it depicts. I am not of that type, and I would modify the first three points of the diagram. In fact, it conveys a restricted way of doing science (and, in a sense, it is not completely correct). It suggests that science is about collecting data, running some statistics on them, and writing a paper. The hypothetico-deductive paradigm is completely forgotten, as is the idea of models, the Galilean idea that everything is written in mathematical language (but I would be less demanding, asking at least for a model formalised in some unambiguous statement in ordinary language). So the graphic above is an example of how data science works, and it is certainly true when it emphasises the many iterations and the sieve any parcel of scientific knowledge has to pass through. I, however, do not collect data. I build mathematical models of reality, and use data to validate or disprove them. My models require data but, vice versa, data collection requires having a model in mind, having defined the quantities to be measured, and having built instruments to measure them. That is intrinsically slow and strenuous but, in the end, it is the most effective way to proceed. The blog procedure is easy, but with no rewards and no effective results. This is a blog, however 😉, but this blog, from another point of view, provocatively asserts that there are at least ten reasons why a blog post is better than a paper.

Thursday, August 25, 2016

Who is a good programmer ?

I took it verbatim from Cristopher Burke.
  • Programmer: anyone who can write working programs to solve problems, given a sufficiently detailed problem statement. I have no use for programmers. 
  • Good programmer: a programmer who collaborates with others to create maintainable, elegant programs suitable for use by the customer, on time and with low defect rates, with little or no interpersonal drama. I can never get enough good programmers. 
  • Great programmer: a good programmer who understands algorithms and architectures intuitively, can build self-consistent large systems with little supervision, can invent new algorithms, can refactor live systems without breaking them, can communicate effectively and cogently with non-technical staff on technical and non-technical issues, understands how to keep his or her ego in check, and can teach his or her skills to others. I need a few great programmers on a team, but they're overkill for many programming tasks. 
The path of becoming a great programmer is to start by being a programmer, then develop the skills needed to be a good programmer, then practice those skills until you master them, then develop the skills needed to be a great programmer, then practice those skills until you master them.

The amount of time this takes depends on your personal skills, personality, and training. It also depends on the experience and opportunities that you have during your career, and how you react to them.

I would add a further category:
  • Outstanding programmer: a great programmer who also discovers new ways of using programming languages, finds new algorithms and new patterns, pushes program design forward, and/or uses programming to perform new tasks and solve new problems in a specific domain. We need some of them, because they give reasons for programming.
Finally, some opinions on programming that Cosma Shalizi shares with the students of his "Introduction to Statistical Computing" course. Let's say they should be used to become a programmer. The first statement is great:

"Programming is expression: take a personal, private, intuitive, irreproducible series of acts of thought, and make it public, objective, shared, explicit, repeatable, improvable. This resembles both writing and building a machine: communicative like writing, but with the impersonal, it all-must-fit check of the machine. All the other principles follow from this fact, that it is turning an act of individual thought into a shared artifact — reducing intelligence to intellect."

Friday, August 19, 2016

Estimating the water budget components and their variability in a Pre-Alpine basin with JGrass-NewAGE

This paper shows the estimation of the whole water budget in a catchment of roughly one hundred square kilometres in the Prealps, the Posina catchment. The challenge here is to close the water budget with a coarse-grained model. The basin has all the complexities such a basin can have, and snow has to be taken into account.
The available input data are discharges measured at several gauges and hydrometeorological data measured at twelve meteorological stations. However, evapotranspiration is not explicitly measured, and neither are the water table levels.

The paper starts by carefully treating the water inputs and, choosing Kriging as the interpolation method, details how to do it with the new tools offered by JGrass-NewAGE (all the posts here). It also discusses the separation of precipitation into snow and rainfall, and illustrates a way to do it by using MODIS data. Closing the water budget further requires some trick for separating water storage in soils from evapotranspiration. The paper devises a method for doing it by assuming that the change in water storage vanishes over an appropriate number of years, called Budyko's time. The final outcome is a reasonable estimate of each of the components of the hydrological cycle at various time scales, spatially distributed according to a partition of the catchment into hydrologic response units (HRUs), such as those depicted in the figure above. In any case, you can get the details of the story by reading the paper itself, by clicking on the figure or here.
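
To fix ideas, the closure argument can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure implemented in JGrass-NewAGE or in the paper: the function names and the yearly numbers are hypothetical, and the only assumption taken from the text is that the storage change averages to zero over the chosen period.

```python
def mean_evapotranspiration(precip, discharge):
    """Long-term mean ET as mean(P) - mean(Q).

    Assumes (as a hypothesis, not a measured fact) that the change in
    storage averages to zero over the whole period covered by the series.
    """
    if not precip or len(precip) != len(discharge):
        raise ValueError("precip and discharge must be non-empty and equally long")
    n = len(precip)
    return sum(precip) / n - sum(discharge) / n


def storage_series(precip, discharge, et, s0=0.0):
    """Integrate dS/dt = P - Q - ET step by step, starting from storage s0."""
    storage = [s0]
    for p, q, e in zip(precip, discharge, et):
        storage.append(storage[-1] + p - q - e)
    return storage


# Hypothetical yearly totals in mm, chosen only to illustrate the bookkeeping.
precip = [1200.0, 900.0, 1500.0]
discharge = [700.0, 600.0, 800.0]

et_mean = mean_evapotranspiration(precip, discharge)
storage = storage_series(precip, discharge, [et_mean] * len(precip))
```

With these hypothetical totals the sketch gives a mean evapotranspiration of 500 mm per year, and the reconstructed storage returns to its initial value at the end of the third year, which is what the closure assumption requires.
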
Information related to this paper can also be found in the post dedicated to Abera's Ph.D. defense.
The preprint has been submitted to Advances in Water Resources.

Update

The paper was revised. The answers to the reviewers, the complementary material and the new paper itself can be found on Zenodo.

Update 2 - The paper was accepted for publication in Advances in Water Resources. You can find it here.

Thursday, August 18, 2016

Reservoirology

The current idea in many modern modelling systems (at catchment scale) is that the hydrology of control volumes can be reduced to a set of interconnected reservoirs (Fenicia et al., 2008; Tague et al., 2010; Clark et al., 2008; Hrachowitz et al., 2013). Each one of these reservoirs can be thought of as “well mixed”, meaning that each water reservoir acts like a chemical reactor where everything that comes in is uniformly distributed across the reservoir and mixes perfectly (and instantaneously too) with everything already present. Some of these reservoirs have a geographical identification (as happens in the geomorphologic unit hydrograph, e.g. Rinaldo and Rodriguez-Iturbe, 1996); others have a functional reason and we could call them “embedded”, or without geographical reference. They are used especially to disentangle processes, to attribute the right travel time to water and, more prosaically, to get the right quantitative adjustments to the various outputs, discharge first of all.
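
A single well-mixed reservoir can be sketched as follows. This is a minimal illustration under assumptions added here, not a piece of any of the cited models: the outflow is taken as linear in the storage (Q = S / tau), the time stepping is a plain explicit Euler scheme, and all names and numbers are hypothetical. The "well mixed" hypothesis enters in one line only: the water leaving the reservoir carries the same tracer concentration as the water stored in it.

```python
def step_well_mixed(storage, tracer_mass, inflow, conc_in, tau, dt):
    """Advance a well-mixed linear reservoir by one explicit Euler step.

    storage      water volume in the reservoir
    tracer_mass  mass of a passive tracer dissolved in the stored water
    inflow       water input rate over the step
    conc_in      tracer concentration of the incoming water
    tau          mean residence time (assumed linear law Q = S / tau)
    """
    # Well-mixed hypothesis: outflow concentration equals storage concentration.
    conc = tracer_mass / storage if storage > 0 else 0.0
    outflow = storage / tau
    new_storage = storage + dt * (inflow - outflow)
    new_tracer = tracer_mass + dt * (inflow * conc_in - outflow * conc)
    return new_storage, new_tracer, outflow, conc


# Hypothetical run: steady water balance, tracer switched on at time zero.
storage, tracer = 10.0, 0.0
for _ in range(2000):
    storage, tracer, outflow, conc = step_well_mixed(
        storage, tracer, inflow=2.0, conc_in=1.0, tau=5.0, dt=0.1
    )
```

In this run the storage stays at its steady value (inflow times tau), while the outflow concentration relaxes towards the input concentration on a time scale tau, which is the signature of perfect and instantaneous mixing.
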

This way of working is quite necessary (but remember Todini’s adage: first they had a linear reservoir, Sherman, 1932; then they introduced a sequence of linear reservoirs, Nash, 1957; after that, they made the reservoirs non-linear, Dooge, 1959; finally it was a mess, Sugawara, 1967), but the separation (or the composition, depending on the point of view) of the domain into parts is something that remains to be demonstrated.
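
The steps of the adage can be made concrete with a toy cascade. The sketch below is an illustration under stated assumptions, not a reproduction of any of the cited formulations: it chains identical reservoirs with the storage-discharge law Q = k S^b, so that b = 1 gives a sequence of linear reservoirs and b different from 1 gives non-linear ones. The function name, the parameters and the explicit Euler stepping are hypothetical choices made here.

```python
def route_cascade(inflow, n_reservoirs, k, dt, exponent=1.0):
    """Route an inflow series through a chain of identical reservoirs.

    Each reservoir releases Q = k * S**exponent (assumed law); the outflow
    of one reservoir is the inflow of the next. Returns the outflow series
    of the last reservoir and the final storages.
    """
    storages = [0.0] * n_reservoirs
    outflow = []
    for j in inflow:
        q_in = j
        for i in range(n_reservoirs):
            q_out = k * storages[i] ** exponent
            # Never release more water than is available during the step.
            q_out = min(q_out, storages[i] / dt + q_in)
            storages[i] = max(0.0, storages[i] + dt * (q_in - q_out))
            q_in = q_out
        outflow.append(q_in)
    return outflow, storages


# Hypothetical unit pulse routed through three linear reservoirs.
dt = 0.1
pulse = [1.0] + [0.0] * 199
q, s = route_cascade(pulse, n_reservoirs=3, k=0.5, dt=dt)
```

Two features are worth checking on the output: water is conserved (what went in equals what came out plus what is still stored), and the peak of the outflow is delayed with respect to the input pulse, the more so the longer the chain.
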

In using these simplifications, none of the researchers mentioned goes through a direct simplification of the coupled (hydrological) equations at the finer scale and coarse-grains them^1; instead, they directly adopt the paradigm of reservoirs and validate it, nowadays by technological means, using a set of goodness-of-fit (GOF) indicators.
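
Which GOF indicators each group uses is not specified above. As an assumption, the sketch below implements two that are very common in hydrology, the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE); the function names and the sample series are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is as good as the mean of obs."""
    mo = mean(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def kge(sim, obs):
    """Kling-Gupta efficiency from correlation, variability ratio and bias ratio."""
    ms, mo = mean(sim), mean(obs)
    ss, so = pstdev(sim), pstdev(obs)
    cov = mean([(s - ms) * (o - mo) for s, o in zip(sim, obs)])
    r = cov / (ss * so)      # linear correlation
    alpha = ss / so          # ratio of standard deviations
    beta = ms / mo           # ratio of means
    return 1.0 - sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Hypothetical discharge series, only to exercise the functions.
observed = [1.0, 3.0, 2.0, 5.0, 4.0]
simulated = [1.2, 2.7, 2.1, 4.6, 4.3]
scores = {"NSE": nse(simulated, observed), "KGE": kge(simulated, observed)}
```

Both indicators only compare simulated and observed discharge, which is exactly why a good score says little about whether the internal partition into reservoirs is right.
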

This system is, IMO, intrinsically weak, but I, like the others, adopted it. A statistical theory is missing, and I believe that it will arise from the travel time theories.

In the revamp of these reservoir theories, attention to the spatial distribution of the reservoirs should find its place again. Distinguishing, “vertically”, reservoirs such as canopy, surface, vadose zone and groundwater storage is a necessity that is widely recognised and should be deployed. The “lateral” aggregation of reservoirs (by using convolutions or following new paradigms) is the follow-up (see also Rigon et al., 2016a). Embedded chains of reservoirs should also be necessary, as invoked by those who study small catchments experimentally (e.g. Birkel, 2011). All should be investigated carefully, and complexity added after implementing “ad hoc” experiments, or using appropriate datasets.

This picture would not be complete if these models limited themselves to forecasting discharges. They should also convey a reasonable set of processes to close the water budget. The next step would be to include appropriate simplifications of the energy budget, a necessary companion. The latter, however, has never been tried so far in coarse-grained models.

Notes

^1 Perhaps Paolo Reggiani et al., 1998 tried it in a generic way, building on Gray’s previous work, and Todini did it his own way with TOPKAPI, i.e. Liu and Todini, 2002; see also Todini, 2007 and my talk here. Paolo's work is certainly to be reconsidered.

References

Birkel, C., Soulsby, C., & Tetzlaff, D. (2014). Developing a consistent process-based conceptualization of catchment functioning using measurements of internal state variables. Water Resources Research, 50(4), 3481–3501. http://doi.org/10.1002/2013WR014925

Clark, M. P., A. G. Slater, D. E. Rupp, R. A. Woods, J. A. Vrugt, H. V. Gupta, T. Wagener, and L. E. Hay (2008), Framework for Understanding Structural Errors (FUSE): A modular framework to diagnose differences between hydrological models, Water Resour. Res., 44, W00B02, doi:10.1029/2007WR006735.

Dooge J,  (1959) - This reference was suggested by Ezio Todini, but I did not find it (the one talking of non-linear reservoirs).

Fenicia F, Savenije HHG, Matgen P, Pfister L, 2008. Understanding catchment behavior through stepwise model concept improvement. Water Resour. Res. 44(1): W01402. ISSN 0043-1397. doi:10.1029/2006WR005563. 

Gray, W., Leijnse, A., Kolar, R.L., Blain, C.A., Mathematical tools for changing spatial scales in the analysis of physical systems, CRC Press, Boca Raton, 1994

Hrachowitz, M., Savenije, H., Bogaard, T. A., Tetzlaff, D., & Soulsby, C. (2013). What can flux tracking teach us about water age distribution patterns and their temporal dynamics? Hydrology and Earth System Sciences, 17(2), 533–564. http://doi.org/10.5194/hess-17-533-2013

Liu and Todini (2002), Towards a comprehensive physically-based rainfall-runoff model, Hydrology and Earth System Sciences, 6(5), 859–881

Nash, J.E., 1957, The form of the instantaneous unit hydrograph, IUGG General Assembly of Toronto, Vol III, IAHS pub. no. 45, 114-121.

Reggiani, P., M. Sivapalan, and S. M. Hassanizadeh (1998), A unifying

Rinaldo A & Rodríguez-Iturbe I, 1996. Geomorphological theory of the hydrological response. Hydrol. Process. 10(6): 803–829. ISSN 1099-1085. doi:10.1002/(SICI)1099-1085(199606)10:6<803::AID-HYP373>3.0.CO;2-N.

Rigon R., Bancheri M., Formetta G., & de Lavenne, A. (2015). The geomorphological unit hydrograph from a historical-critical perspective. Earth Surface Processes and Landforms, n/a–n/a. http://doi.org/10.1002/esp.3855

Rigon R., Bancheri M, Green T., Age-ranked hydrological budgets and a travel time description of catchment hydrology,Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-210, in review, 2016.

Sherman, L. K., Streamflow from rainfall by the unit hydrograph method, Eng. News-Record 108, 501-505, 1932. 

Sugawara, 1967, The flood forecasting by a series storage type model, IAHS Publication no. 85, 1-6

Tague, C., & Dugger, A. L. (2010). Ecohydrology and Climate Change in the Mountains of the Western USA - A Review of Research and Opportunities. Geography Compass, 4(11), 1648–1663. http://doi.org/10.1111/j.1749-8198.2010.00400.x


Todini, E. (2007). Hydrological catchment modelling: past, present and future. Hydrology and Earth System Sciences, 11(1), 468–482.

Sunday, August 7, 2016

Sparse thoughts (on the foundations of a Thermodynamics of Hydrological Systems)

Assume you have a control volume in which you include a natural landscape. Fluxes of energy and mass across the boundary of the control volume are more or less understood. What is not understood is the arrangement these fluxes can have inside the control volume, and how to describe their complexity (would entropy be their measure ?).

Try this thought experiment. Think that inside the volume there is only bare soil (it is in itself an extraordinary simplification to think that soil is simple), and think of another system where you have grass growing on the soil. Make it simpler ! Just have a single substance inside the control volume, and air (which is still very complicated because of turbulence).

The problem at this level is primarily descriptive. How to describe the content of the control volume ?

According to physics, we have (1) to define the way the internal stuff interacts with the incoming fluxes (of mass, momentum and energy) and (2) to understand, or make some assumptions on, the stability and form of what is inside.

Stability certainly has to do with the phase of the system, which assumes that matter has a certain organisation. We have a weak knowledge of it. We are used to specifying the organisation as phases of matter, but how complex patterns of organisation can be described in physics (chemistry ?, biology ?) is not clear.

Without any doubt, we have to add the complex concepts of the temperature of the system, its pressure, its volume, its chemical potential, and how the fluxes reorganise what is inside.
It is also certainly true that form counts and we have to distinguish the surfaces of separation between substances. Even if we are talking of inanimate substances, so far.

But if we introduce growing systems (which can still be inanimate), this becomes more and more complicated; if we introduce self-reproducing systems (life ?), the path of knowledge becomes even steeper and more challenging.

Is this way of thinking reductionism ? It could be, but it is not. 

Antireductionists claim that a system is more than its parts; formally this is like saying that 1 + 1 ≠ 2, and it seems nonsense. For me this means that a system is its parts plus the interactions that the parts can establish among themselves and with the outer boundaries.

It is elementary mathematics that the number of possible connections between n points grows as n(n-1)/2, and that the number of possible patterns of interaction among them grows more than exponentially, so that the possible wiring patterns of even a small brain outnumber the protons in the known universe.
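
The arithmetic behind this statement can be checked with a few lines of Python. The sketch rests on assumptions made here for illustration: parts are treated as points, an interaction is simply a link that is either present or absent, and the number of protons in the known universe is taken as the usual order-of-magnitude figure of 10^80. All names are hypothetical.

```python
from math import comb

# Order-of-magnitude estimate, assumed here only for the comparison.
PROTONS_IN_UNIVERSE = 10 ** 80


def pairwise_links(n):
    """Number of distinct pairs among n parts: n * (n - 1) / 2."""
    return comb(n, 2)


def interaction_patterns(n):
    """Number of ways of switching each possible link on or off."""
    return 2 ** pairwise_links(n)


def smallest_system_exceeding(limit):
    """Smallest number of parts whose interaction patterns exceed limit."""
    n = 2
    while interaction_patterns(n) <= limit:
        n += 1
    return n
```

Under these assumptions, `smallest_system_exceeding(PROTONS_IN_UNIVERSE)` returns 24: two dozen parts already admit more interaction patterns than the assumed number of protons, while the number of links among them is only 276.
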

So one quality of the system, besides the matter it is composed of, is given by the number of interactions it comprises.

Even if we limit our arguments to inanimate matter, these interactions are usually not computable, and this makes the system unknowable and destroys, in my view, the naive idea that I am a reductionist.

Anyway, leaving philosophy and parachuting back to earth, this means that we need something new to describe interactions among parts. Being a hydrologist, I would be interested in knowing how to approach organisation in hydrological systems.

Further reading: this post on Information theory, this post on Thermodynamics, this post on "What is life" (and my professorship talk), graph theories. Ruddell's papers. Kleidon's book. Azimuth's posts.



Wednesday, August 3, 2016

Is Navier-Stokes equation enough ?

This is just a placeholder at the moment, to be better developed in the future. The question is how to generalise the Navier-Stokes (NS) equation in the presence of temperature gradients and the triggering of convection, a topic I am interested in for understanding how evapotranspiration works.

The standard view of the phenomenon is that NS is enough, and it is sufficient to couple it with the energy budget. An equation of state connecting density and temperature is necessary to close the set of equations. 
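
One common way of writing this standard view down is the Oberbeck-Boussinesq approximation. Choosing it is an assumption made here for illustration, not something the standard view necessarily implies in every application. In the notation assumed below, u is the velocity, p' the pressure deviation from the hydrostatic reference state, T the temperature, rho_0 and T_0 the reference density and temperature, nu the kinematic viscosity, kappa the thermal diffusivity, beta_T the thermal expansion coefficient, g the magnitude of gravity and z-hat the upward unit vector.

```latex
\begin{align}
\nabla \cdot \mathbf{u} &= 0 \\
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u}
  &= -\frac{1}{\rho_0} \nabla p' + \nu \nabla^2 \mathbf{u}
     + \beta_T (T - T_0)\, g\, \hat{\mathbf{z}} \\
\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T &= \kappa \nabla^2 T \\
\rho &= \rho_0 \left[ 1 - \beta_T (T - T_0) \right]
\end{align}
```

The first two lines are mass and momentum conservation (NS with a buoyancy term), the third is the energy budget written for temperature, and the last is the linearised equation of state that closes the set. In this form temperature acts on the motion only through density.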

This understanding corresponds to the perceptual model in which air is heated "slowly" without creating motion, and subsequently differences in density trigger the convection.

However, as far as I know, it might not be like this. According to Onsager's theories, there should be air movements triggered directly by temperature gradients. Mass fluxes, I mean. Therefore, which process is dominant at the beginning of convection (i.e. the density gradient or the temperature gradient) could be a question of scale. I think it could be an interesting issue to investigate theoretically, numerically, and experimentally. If any reader knows literature on the subject, s/he is welcome to leave a comment.