Tuesday, February 26, 2019

Recent advances in big data machine learning in Hydrology

Chaopeng Shen (GS) of Penn State is organizing a series of Cyberseminars for CUAHSI about Machine Learning in Hydrology.

Recently, big data machine learning has led to substantial changes across many areas of study. In Hydrology, the introduction of big data and machine learning methods has substantially improved our ability to address existing challenges and encouraged novel perspectives and new applications. These advances present new opportunities for methods that aid scientific discovery, data discovery, and predictive modeling. This series covers new techniques and findings that have emerged in Hydrology during the previous year, with a focus on catchment and land surface hydrology.

The announcement on the CUAHSI site is here.

All talks take place on Fridays at 1:00 p.m. ET. Registration is free, but you must register for the series in order to attend. To register, click here.

This is the foreseen schedule:
  • March 29, 2019: Machine Learning & Information Theory for Land Model Benchmarking & Process Diagnostics | Grey Nearing, University of Alabama
  • April 5, 2019: Long Short-Term Memory (LSTM) networks for rainfall-runoff modeling | Frederik Kratzert, Johannes Kepler University
  • April 12, 2019: Use deep convolutional neural nets to learn patterns of mismatch between a land surface model and GRACE satellite | Alex Sun, University of Texas at Austin
  • April 19, 2019: Long-term projections of soil moisture using deep learning and SMAP data with aleatoric and epistemic uncertainty estimates | Chaopeng Shen, Pennsylvania State University
  • April 26, 2019: Exploring deep neural networks to retrieve rain and snow in high latitudes using multi-sensor and reanalysis data | Guoqiang Tang, Tsinghua University
  • May 3, 2019: Multioutput neural networks for estimating flow-duration curves in ungaged catchments | Scott Worland, Cornell University and USGS
  • May 10, 2019: Remote sensing precipitation using artificial neural networks and machine learning methods | Kuolin Hsu, University of California, Irvine

Thursday, February 21, 2019

Warredoc Winter School on Data Rich Hydrology

Last month, Fernando Nardi of the Università per Stranieri (University for Foreigners) in Perugia and Warredoc organised a nice and successful Winter School on Data Rich Hydrology.
Many colleagues participated and gave very nice presentations. Below I reproduce verbatim the content of the School's pages, with links to the PDFs of the lectures.


Monday, January 28

Rafael L. Bras, The Era of Data Rich Hydrology
Stefan Uhlenbrook, The WWDR and SDG 6 Synthesis Report

Afternoon session
Aldo Fiori, Groundwater hydrology and hydrological process mechanics
Marco Marani, Beyond traditional extreme value theory: lessons learned from rainfall and hurricane intensity
Maria Cristina Rulli, The water-food-energy nexus

Tuesday, January 29
Morning session
Rafael L. Bras, The Era of Data Rich Hydrology
Stefan Uhlenbrook, The WWDR and SDG 6 Synthesis Report
Fabio Castelli, Remote sensing and data assimilation in hydrology

Afternoon session
Roberto Deidda, Modelling scaling properties of precipitation fields
Salvatore Grimaldi, Hydrologic measurements and novel observation technologies
— Dinner & Social event —

Wednesday, January 30
Morning session
Salvatore Manfreda, Drones in Hydrology (lecture & hands on)
Elena Volpi, Hydrological risk assessment: Return period and probability of failure

Afternoon session
Andrea Libertino, Advances in the space-time analysis of rainfall extremes
Riccardo Rigon, Hydrologic modelling in a data rich world

Thursday, January 31
Morning session
Daniele Ganora, Data poor vs. data rich cases for flood hazard (lecture & hands on)
Gabriele Freni, Distributed Data quality and urban flood modelling uncertainty

Afternoon session
Fernando Nardi, Citizen science and big data in hydrology

Friday, February 1
Morning session
Tommaso Moramarco, Stream flow measurements: ground and satellite observations
Alessio Domeneghetti, Remote sensing data and tools to foster inland water monitoring and flood modeling

Tuesday, February 19, 2019

Ph.D. Miscellanea - Jupyter Notebook with R or Python on Statistics and Hydrology

This blog post shares some of the notebooks prepared by my Ph.D. students on the topics they follow in their Ph.D. classes. Please note that some of the material consists of lecture notes by my colleagues. You can use them, but you should cite the source when you do.

Sunday, February 10, 2019

My Hydraulic Construction Class 2019

The new class is obviously based on the previous classes.
There are some changes, however, partly due to the different calendar. The course is essentially divided into three parts:
  • Rainfall analysis and statistics
  • How to design a storm water management system (SWMS)
  • How to design an aqueduct


Rainfall analysis and statistics are essential to the design of the storm water system and require some use of Python in JupyterLab. Design of the SWMS requires some Python, QGIS 2.18, GISWATER and SWMM. The aqueducts require QGIS and its plugin QEPAnet, which implements EPANET, the tool for modelling pressurised water networks. All these tools are open source.

More specifically:
  • Python - Python is a modern programming language. It will be used for data treatment, estimation of the IDF (intensity-duration-frequency) curves of precipitation, some hydraulic calculations and data visualisation. I will use Python mostly as a scripting language to bind and use existing tools.
  • QGIS - QGIS is a Geographic Information System. GIS are essential tools for anyone who works on the landscape or plans infrastructure spread over the territory.
  • SWMM - An acronym for Storm Water Management Model. Essentially it is a model for the estimation of runoff adapted to the urban environment. I do not endorse its hydrology very much. However, it is the tool most used by colleagues who care about storm water management, and I adopt it. It is not a tool for designing storm water networks and, therefore, some more work has to be done with Python to fill the gaps.
  • EPANET - The tool developed by the EPA to model water distribution networks.
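
To give a flavour of the Python scripting mentioned in the list above, here is a minimal sketch of an IDF-type fit. It assumes the common power-law form h = a t^n for the rainfall depth h over a duration t (an assumption of mine, not necessarily the form used in class), and the data are synthetic, generated from known coefficients only to check that the fit recovers them. The function name is hypothetical.

```python
import math


def fit_idf_power_law(durations_h, depths_mm):
    """Least-squares fit of h = a * t**n, done as a straight line in log-log space."""
    xs = [math.log(t) for t in durations_h]
    ys = [math.log(h) for h in depths_mm]
    n_pts = len(xs)
    mean_x = sum(xs) / n_pts
    mean_y = sum(ys) / n_pts
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope of the log-log line = exponent n
    a = math.exp(mean_y - n * mean_x)  # intercept of the log-log line = log(a)
    return a, n


# Synthetic (not measured) depths, generated with a = 30 mm/h^n and n = 0.4
durations_h = [1, 3, 6, 12, 24]
depths_mm = [30.0 * t ** 0.4 for t in durations_h]

a, n = fit_idf_power_law(durations_h, depths_mm)
print(f"a = {a:.2f}, n = {n:.3f}")
```

With real annual maxima the points will not lie exactly on a line, and the fit is usually repeated for each return period of interest.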
Installation Instructions (for Windows) by Daniele Della Torre:

SWMM: http://growworkinghard.altervista.org/epa-swmm-how-to-install-step-by-step/

GISWATER: http://growworkinghard.altervista.org/giswater-11-install-windows/

QGIS: http://growworkinghard.altervista.org/qgis-2-18-how-to-install-step-by-step-on-windows/

As you can infer from the previous lines, the class needs to learn some hydrology, some hydraulics and the use of various software tools. As I try to explain in the Syllabus lesson on the first day, there is no room to exploit all the possibilities offered by the software, nor to go very deep into the theory of hydrological processes or into the design of the systems. Students have to become comfortable with the idea that they are going to get an introduction to all these topics, and that they will need further study to use what they have learned professionally.

Foreseen Schedule

T -  stands for a mostly theoretical class
L -  stands for a class in the lab

Precipitation analysis and statistics (and an intro to Python scripting)

2019-02-25 -


Storm Water Management System Design

2019-03-25 - T - Introduction to Storm Water Management System Design
2019-03-25 - L - Working with GISWATER
2019-04-01 - T - Elements for the design of storm water management systems - I
2019-04-04 - L - Using QGIS and GISWATER for designing the Storm Water Management System

2019-04-08 - T - Elements for the design of storm water management systems - II
2019-04-15 (Room 2F - 9 a.m.) Intermediate test (Questions pool).

2019-04-30 Grades of Intermediate test

2019-05-02

2019-05-02 - OSF for Italians


As a general, simple and descriptive reference, the first six chapters of Maurizio Leopardi's book can be useful:
The state of the water supply in Italy is summarised here (Corriere della Sera, 2018-05-16)

Some Examples of presentations on the projects of this class:
Other interesting examples of presentations (from the course "Progettazione di acquedotti e fognature"):

My Hydrology Class 2019

To get an idea of this class, please look at the Syllabus slides below. This year the class will be a little different from last year: there will be more hydrology and less statistics. Laboratory work will be (mostly) concentrated in May and June. March and April will be mostly spent developing the theoretical parts.
Lectures and lab classes will be recorded and uploaded to my YouTube channel.
The intermediate exam will be written, with 3 questions about the topics treated, which the student will be asked to answer with text, figures and formulas. The final exam will be a discussion of the exercises provided by the students in the form of Jupyter notebooks, plus a short Python exercise. Each of the exercises will be discussed separately, by booking an appointment with the professor before the formal date of the exam or on the day of the final exam. The Pico touch-screen on the first floor of the Mesiano building will be used for the presentations.

Used Software

There is no engineering without using models. During the class, various open source software tools and resources will be used.
All these resources are free, besides being open. For installation requirements, please see the GEOframe winter school material here. To understand a little more about this material, please look at the "Getting started with Docker OMS and Jupyterlab" post.

Lab material can be found here

Foreseen Schedule

Uploaded material is subject to modification before the scheduled date


2019-03-06 - Ground-based precipitation and its statistics. Separation of snow and rainfall. Measurement of precipitation
2019-03-08 - Extreme precipitation. Determination of Gumbel's parameters
T - Water in soil. Darcy-Buckingham. Hydraulic conductivity. Soil water retention curves. Richards equation and its extensions
2019-03-22
2019-04-05 - Evaporation generalities
2019-04-17 - Intermediate test (Room 2F, 9 a.m.)  Questions

2019-05-02 - Grades of Intermediate Test

2019-05-03  - Evaporation from soils and Transpiration
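
For the "Determination of Gumbel's parameters" entry in the schedule above, here is a minimal sketch using the method of moments. This is only one of several possible estimators and not necessarily the one used in class. The annual maxima are made-up numbers and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_moments(sample):
    """Method-of-moments estimate of the Gumbel location and scale parameters."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)            # sample standard deviation
    scale = math.sqrt(6.0) * std / math.pi    # from Var = (pi * scale)**2 / 6
    location = mean - EULER_GAMMA * scale     # from Mean = location + gamma * scale
    return location, scale


def gumbel_quantile(location, scale, return_period):
    """Value exceeded on average once every `return_period` years."""
    p = 1.0 - 1.0 / return_period             # non-exceedance probability
    return location - scale * math.log(-math.log(p))


# Hypothetical annual maxima of 1-hour precipitation (mm), for illustration only
annual_maxima_mm = [18.2, 25.4, 21.0, 30.1, 16.8, 27.3, 22.9, 35.6, 19.5, 24.7]

location, scale = gumbel_moments(annual_maxima_mm)
for T in (10, 50, 100):
    print(f"T = {T:3d} years: {gumbel_quantile(location, scale, T):.1f} mm")
```

The quantile function simply inverts the Gumbel distribution function F(x) = exp(-exp(-(x - location)/scale)) at the probability 1 - 1/T.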

Lab material can be found here

Wednesday, February 6, 2019

On dimensional analysis (from Terry Tao's blog)

Terence Tao is a Fields medalist. "Timothy Gowers remarked on Tao's accomplishments:

Tao's mathematical knowledge has an extraordinary combination of breadth and depth: he can write confidently and authoritatively on topics as diverse as partial differential equations, analytic number theory, the geometry of 3-manifolds, nonstandard analysis, group theory, model theory, quantum mechanics, probability, ergodic theory, combinatorics, harmonic analysis, image processing, functional analysis, and many others. Some of these are areas to which he has made fundamental contributions. Others are areas that he appears to understand at the deep intuitive level of an expert despite officially not working in those areas. How he does all this, as well as writing papers and books at a prodigious rate, is a complete mystery. It has been said that Hilbert was the last person to know all of mathematics, but it is not easy to find gaps in Tao's knowledge, and if you do then you may well find that the gaps have been filled a year later."  [Source Wikipedia]

Therefore I was happy to find that he wrote on his blog about dimensional analysis (and, implicitly, about the Pi theorem).

Below I reproduce verbatim the introduction of his blog post:

"Mathematicians study a variety of different mathematical structures, but perhaps the structures that are most commonly associated with mathematics are the number systems, such as the integers $\mathbf{Z}$ or the real numbers $\mathbf{R}$. Indeed, the use of number systems is so closely identified with the practice of mathematics that one sometimes forgets that it is possible to do mathematics without explicit reference to any concept of number. For instance, the ancient Greeks were able to prove many theorems in Euclidean geometry, well before the development of Cartesian coordinates and analytic geometry in the seventeenth century, or the formal constructions or axiomatisations of the real number system that emerged in the nineteenth century (not to mention precursor concepts such as zero or negative numbers, whose very existence was highly controversial, if entertained at all, to the ancient Greeks). To do this, the Greeks used geometric operations as substitutes for the arithmetic operations that would be more familiar to modern mathematicians. For instance, concatenation of line segments or planar regions serves as a substitute for addition; the operation of forming a rectangle out of two line segments would serve as a substitute for multiplication; the concept of similarity can be used as a substitute for ratios or division; and so forth.

A similar situation exists in modern physics. Physical quantities such as length, mass, momentum, charge, and so forth are routinely measured and manipulated using the real number system $\mathbf{R}$ (or related systems, such as $\mathbf{R}^3$ if one wishes to measure a vector-valued physical quantity such as velocity). Much as analytic geometry allows one to use the laws of algebra and trigonometry to calculate and prove theorems in geometry, the identification of physical quantities with numbers allows one to express physical laws and relationships (such as Einstein’s famous mass-energy equivalence $E = mc^2$) as algebraic (or differential) equations, which can then be solved and otherwise manipulated through the extensive mathematical toolbox that has been developed over the centuries to deal with such equations.

However, as any student of physics is aware, most physical quantities are not represented purely by one or more numbers, but instead by a combination of a number and some sort of unit. For instance, it would be a category error to assert that the length of some object was a number such as $10$; instead, one has to say something like “the length of this object is $10$ yards”, combining both a number and a unit (in this case, the yard). Changing the unit leads to a change in the numerical value assigned to this physical quantity, even though no physical change to the object being measured has occurred. For instance, if one decides to use feet as the unit of length instead of yards, then the length of the object is now $30$ feet; if one instead uses metres, the length is now $9.144$ metres; and so forth. But nothing physical has changed when performing this change of units, and these lengths are considered all equal to each other:

$$10 \text{ yards} = 30 \text{ feet} = 9.144 \text{ metres}.$$

It is then common to declare that while physical quantities and units are not, strictly speaking, numbers, they should be manipulated using the laws of algebra as if they were numerical quantities. For instance, if an object travels $100$ metres in $10$ seconds, then its speed should be

$$(100\,\mathrm{m}) / (10\,\mathrm{s}) = 10\,\mathrm{m/s}$$

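where we use the usual abbreviations of $\mathrm{m}$ and $\mathrm{s}$ for metres and seconds respectively. Similarly, if the speed of light is $c = 299\,792\,458\,\mathrm{m/s}$ and an object has mass $10\,\mathrm{kg}$, then Einstein’s mass-energy equivalence then tells us that the energy-content of this object is

$$(10\,\mathrm{kg})\,(299\,792\,458\,\mathrm{m/s})^2 \approx 8.99 \times 10^{17}\,\mathrm{kg\,m^2\,s^{-2}}.$$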

Note that the symbols $\mathrm{kg}, \mathrm{m}, \mathrm{s}$ are being manipulated algebraically as if they were mathematical variables such as $x$ and $y$. By collecting all these units together, we see that every physical quantity gets assigned a unit of a certain dimension: for instance, we see here that the energy of an object can be given the unit of $\mathrm{kg\,m^2\,s^{-2}}$ (more commonly known as a Joule), which has the dimension of $M L^2 T^{-2}$ where $M, L, T$ are the dimensions of mass, length, and time respectively.

There is however one important limitation to the ability to manipulate “dimensionful” quantities as if they were numbers: one is not supposed to add, subtract, or compare two physical quantities if they have different dimensions, although it is acceptable to multiply or divide two such quantities. For instance, if $m$ is a mass (having the units $M$) and $v$ is a speed (having the units $LT^{-1}$), then it is physically “legitimate” to form an expression such as $\frac{1}{2} m v^2$, but not an expression such as $m + v$ or $m - v$; in a similar spirit, statements such as $m = v$ or $m \geq v$ are physically meaningless. This combines well with the mathematical distinction between vector, scalar, and matrix quantities, which among other things prohibits one from adding together two such quantities if their vector or matrix type are different (e.g. one cannot add a scalar to a vector, or a vector to a matrix), and also places limitations on when two such quantities can be multiplied together. A related limitation, which is not always made explicit in physics texts, is that transcendental mathematical functions such as $\sin$ or $\exp$ should only be applied to arguments that are dimensionless; thus, for instance, if $v$ is a speed, then $\operatorname{arctanh}(v)$ is not physically meaningful, but $\operatorname{arctanh}(v/c)$ is (this particular quantity is known as the rapidity associated to this speed).

These limitations may seem like a weakness in the mathematical modeling of physical quantities; one may think that one could get a more “powerful” mathematical framework if one were allowed to perform dimensionally inconsistent operations, such as add together a mass and a velocity, add together a vector and a scalar, exponentiate a length, etc. Certainly there is some precedent for this in mathematics; for instance, the formalism of Clifford algebras does in fact allow one to (among other things) add vectors with scalars, and in differential geometry it is quite common to formally apply transcendental functions (such as the exponential function) to a differential form (for instance, the Liouville measure $\frac{1}{n!} \omega^n$ of a symplectic manifold can be usefully thought of as a component of the exponential $\exp(\omega)$ of the symplectic form $\omega$).

However, there are several reasons why it is advantageous to retain the limitation to only perform dimensionally consistent operations. One is that of error correction: one can often catch (and correct for) errors in one’s calculations by discovering a dimensional inconsistency, and tracing it back to the first step where it occurs. Also, by performing dimensional analysis, one can often identify the form of a physical law before one has fully derived it. For instance, if one postulates the existence of a mass-energy relationship involving only the mass of an object $m$, the energy content $E$, and the speed of light $c$, dimensional analysis is already sufficient to deduce that the relationship must be of the form $E = \alpha m c^2$ for some dimensionless absolute constant $\alpha$; the only remaining task is then to work out the constant of proportionality $\alpha$, which requires physical arguments beyond that provided by dimensional analysis. (This is a simple instance of a more general application of dimensional analysis known as the Buckingham $\pi$ theorem.) [Continue reading on Tao's blog]
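
The "error correction" role of dimensions described in the quote can be tried out in a few lines of Python. What follows is my own minimal sketch, not something from Tao's post: a hypothetical `Quantity` class stores a value together with the exponents of mass, length and time, allows multiplication and powers, and refuses to add quantities of different dimension. The numerical values are chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number with a dimension, stored as exponents of (mass, length, time)."""
    value: float
    dim: tuple

    def __add__(self, other):
        # Addition is only meaningful between quantities of the same dimension
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)

    def __mul__(self, other):
        # Multiplying quantities adds the exponents of their dimensions
        new_dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, new_dim)

    def __pow__(self, k):
        # Raising to an integer power multiplies every exponent by k
        return Quantity(self.value ** k, tuple(a * k for a in self.dim))


MASS = (1, 0, 0)     # M
SPEED = (0, 1, -1)   # L T^-1
ENERGY = (1, 2, -2)  # M L^2 T^-2

m = Quantity(10.0, MASS)            # illustrative mass in kg
c = Quantity(299_792_458.0, SPEED)  # speed of light in m/s

energy = m * c ** 2
print(energy.dim == ENERGY)  # the product m c^2 has the dimension of an energy

try:
    m + c                    # dimensionally inconsistent
except TypeError as err:
    print("rejected:", err)
```

Libraries for unit handling exist and do this far more completely. The sketch is only meant to show that the rules quoted above are simple bookkeeping on exponents.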