Recently I posted the history of JGrass-NewAGE, and I was wondering where the old material of JGrass-NewAGE version 0 had gone. It was all more than a prototype, quite operational (with maybe some flaws in the science), and worth keeping alive. Wasn't the use of OMS etc. meant to ensure some reproducibility and maintainability?
Finally I recovered from my hard disk part of the documentation of that version, which can be downloaded from here (unfortunately it is in Italian).
The PostgreSQL/PostGIS database referred to in the "manual" was found stored on a computer accessible through the password "idrologia", and we are going to see whether it is still usable and extendable. Actually, as is apparent from my previous post, most of the components of the project can be considered obsolete, superseded by those available with JGrass-NewAGE version 1. However, at least one component is still missing, and I hope to be able to get it back online: the one integrating the 1-D de Saint-Venant equation. It was actually saved in a public repository (or so it should have been), but the repository was moved and is not maintained anymore. So we have to look into old backups to get it back.
Lessons learned:
1 - There should be an explicit will to maintain old stuff (quite an obvious conclusion) and not to let it vanish like vapour in summer.
2 - Even if a project was produced as Open Source, if it is not maintained, or at least left unaltered on a public site which is itself maintained, the effort is meaningless. Useful stuff can easily be lost.
3 - Data can be lost even more easily than software. A GitHub-like repository for data should be mandatory. We'll see what we can do.
My reflections and notes about hydrology and being a hydrologist in academia. The daily evolution of my work. Especially for my students, but also for anyone with the patience to read them.
Monday, December 29, 2014
MeteoIO
As is known, we use MeteoIO in our GEOtop 2.0, despite the fact that we implemented most of the same capabilities directly inside GEOtop, and despite having also reimplemented the same (and in some cases more articulated) capabilities inside JGrass-NewAGE. The reasons are many: having alternatives to compare is good; it is not possible to keep pace with every subject necessary to build a modelling system, and having someone do things for you is the essence of the success of an open source project; the JGrass-NewAGE system is not yet at the stage of being interoperable with GEOtop tools. In any case, Mathias Bavay and Thomas Egger did excellent work in documenting MeteoIO with this paper, which appeared in GMD, one of our journals of choice.
The project is open source, well designed, constantly maintained and evolving, and written in C++. Enjoy the reading.
Monday, December 15, 2014
Pfafstetter Numbering and the organisation of river networks
We have just had a paper accepted on the topological ordering we use inside our model JGrass-NewAGE. This ordering is a generalisation of the Pfafstetter algorithm.
Once understood, this Pfafstetter ordering can be used to drive, for instance, the parallel execution of operations. But this is another story. We actually use it to process the basin partition in an orderly way in the JGrass-NewAGE modules. The paper, entitled "Digital watershed partition within the NewAGE-Jgrass system" by Formetta et al., can be downloaded from here.
Related topics are covered in the posts on the Horton Machine, and particularly in the book chapter written early this year for the British Geomorphological Society. Abera's blog also contains a related post that can clarify some other issues.
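To make the remark about ordering and parallel execution a bit more concrete, here is a minimal Python sketch. It is not the generalised Pfafstetter numbering of the paper: it uses a plain topological sort (the standard library `graphlib`, Python 3.9 or later) on a hypothetical toy network, just to show how an upstream-to-downstream ordering yields groups of sub-basins that could be processed independently. The network and the sub-basin names are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical toy network: each key is a sub-basin, each value lists the
# sub-basins draining directly into it (its upstream neighbours).
upstream_of = {
    "outlet": ["B", "C"],
    "B": ["D", "E"],
    "C": [],
    "D": [],
    "E": [],
}


def processing_batches(upstream_of):
    """Return lists of sub-basins that can be processed together.

    Every sub-basin appears in a later batch than all of its upstream
    neighbours, so the members of one batch do not depend on each other.
    """
    sorter = TopologicalSorter(upstream_of)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = processing_batches(upstream_of)
for step, batch in enumerate(batches):
    print(step, batch)
```

Each batch could be handed to parallel workers, while the batches themselves must run in sequence. A Pfafstetter-like code encodes the same upstream/downstream relation directly in the numbers assigned to the sub-basins.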
Friday, December 12, 2014
Using geostatistics to integrate satellite information and modelling on soil moisture
This paper has a long history and explores the idea that geostatistics can be used to fill in satellite information where it is missing. At the same time, the whole information is assimilated to better drive the Community Land Model. Thank you to Han Xujun for pursuing the publication when I had abandoned any hope, even though the paper is a good one.
The paper is entitled: Soil Moisture Estimation by Assimilating L-Band Microwave Brightness Temperature with Geostatistics and Observation Localization, and my co-authors are (in order):
Han Xujun, Xin Li, Rui Jin, and Stefano Endrizzi. The paper has been accepted by PLOS ONE, and you can find the pre-print here.
Other papers by Xujun are available from his Research Gate profile.
Some further bibliography:
Han, X. J., et al. (2014). "Soil moisture and soil properties estimation in the Community Land Model with synthetic brightness temperature observations." Water Resources Research 50(7): 6081-6105.
Han, X. J., et al. (2013). "Joint Assimilation of Surface Temperature and L-Band Microwave Brightness Temperature in Land Data Assimilation." Vadose Zone Journal 12(3).
Han, X., et al. (2012). "Spatial horizontal correlation characteristics in the land data assimilation of soil moisture." Hydrology and Earth System Sciences 16(5): 1349-1363.
Saturday, December 6, 2014
Imagine to be a hydrologist and you want to learn Java
Dear * to learn Java:
you can use my lectures (still being produced): http://abouthydrology.blogspot.it/2013/07/java-for-hydrologists-101.html
Read the books I list in my blog: http://abouthydrology.blogspot.it/2012/12/a-little-java-library-for-beginners.html. Many of them are specifically dedicated to Numerics and Scientific Computation.
Learning Java is very much about programming, not just reading. So you have to choose a task and try to perform it. Good exercises would be creating reusable classes for:
- Reading and writing data to a file
- Representing a generic function
- Solving an ordinary differential equation (or a set of them: Lorenz's chaotic equations would be a good exercise indeed)
If you do math, sooner or later you will have to use matrices. You do not need to reinvent the wheel, even if a standard choice has not yet emerged. See here.
In general, all the Java resources I came across (including this one) are at: http://abouthydrology.blogspot.it/search/label/Java
Francesco Serafin's master thesis introduces various tools and methods that can be used to integrate partial differential equations: http://abouthydrology.blogspot.it/2014/07/patterns-for-application-of-modern.html
The subsequent step is to learn OMS: maybe this can be a little more complicated. Anyway, one way to do this is to start with the 2013 summer school:
or you can go and work out the examples you can find at the original OMS site: http://abouthydrology.blogspot.it/search/label/OMS3
or my own information on the topic: http://abouthydrology.blogspot.it/2013/01/object-modelling-system-resources.html
I will be improving the resources all the time. So check them once in a while !
Tuesday, December 2, 2014
Evapotranspiration parameters in coarse grained modelling
For a little refresher on evapotranspiration, look first at my post on Potential Evapotranspiration, where its estimation with the Dalton equation and with simplified models, like Penman-Monteith (PM) or Priestley-Taylor (PT), is covered. We concentrate here on the simplest of the formulas, the PT one.
Once you get the PT coefficient alpha_p, you can estimate potential evapotranspiration (pET), but you still have to introduce a further reduction to get the actual evapotranspiration (aET). The method popularized by the ecohydrology literature (e.g. read Amilcare Porporato here) is to introduce a linear decrease of pET with the water storage in the root zone "reservoir".
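The two steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PT formula is written in its usual form, pET = alpha_p · Δ/(Δ+γ) · (Rn − G)/λ, the slope Δ of the saturation vapour pressure curve uses the FAO-56 expression, and the values alpha_p = 1.26, γ = 0.066 kPa/°C and λ = 2.45 MJ/kg are common textbook choices, not values prescribed in this post. The function and variable names are hypothetical.

```python
import math

LAMBDA = 2.45  # latent heat of vaporisation [MJ/kg], assumed constant
GAMMA = 0.066  # psychrometric constant [kPa/°C], assumed (near sea level)


def slope_sat_vapour_pressure(t_air):
    """Slope of the saturation vapour pressure curve [kPa/°C] (FAO-56 form)."""
    e_sat = 0.6108 * math.exp(17.27 * t_air / (t_air + 237.3))
    return 4098.0 * e_sat / (t_air + 237.3) ** 2


def priestley_taylor(net_radiation, t_air, alpha_p=1.26, ground_flux=0.0):
    """Potential ET [mm/day] from net radiation and ground flux [MJ m-2 day-1]."""
    delta = slope_sat_vapour_pressure(t_air)
    return alpha_p * delta / (delta + GAMMA) * (net_radiation - ground_flux) / LAMBDA


def actual_et_linear(pet, storage, storage_max):
    """Reduce pET linearly with the relative storage of the root zone reservoir."""
    fraction = min(max(storage / storage_max, 0.0), 1.0)
    return pet * fraction


pet = priestley_taylor(net_radiation=15.0, t_air=20.0)
aet = actual_et_linear(pet, storage=50.0, storage_max=100.0)
print(f"pET = {pet:.2f} mm/day, aET = {aet:.2f} mm/day")
```

Note that the result scales linearly with alpha_p, so any uncertainty in that coefficient passes straight into pET and aET.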
Both steps, the determination of pET in the framework of PT and the linear reduction with storage, have, in my view, strong drawbacks from the quantitative point of view.
One can get alpha_p from the literature, but the literature shows a huge variability in its values. So the literature is of little use for obtaining quantitative results with decent certainty.
The (linear) decrease of ET with soil moisture requires the determination of at least one additional coefficient. In fact, it is well known that ET has two stages. In stage one, ET is "at the potential rate", independently of the water content, down to a critical soil moisture well below saturation. Below it, ET is depressed, not by increasing suction (the so-called Kelvin effect, which is a second-order effect) but by the fact that the pores at the soil or leaf surface to which water is supplied are farther and farther apart (see the recent literature by Dani Or and co-workers). This critical soil moisture, at which second-stage ET starts, is a further coefficient, and identifying it with saturation implies a clear underestimation of ET. It is usually taken for granted by my ecohydrologist friends, and in my master IRI's literature, that it can be determined. But I do not have to remind you how elusive the definition of the "root zone soil moisture" is, just to cite one practical aspect of it.
Even if fellow scientists working in the field claim to have measured it, I know that those who tried in a lab of a few square metres really struggled to close the water budget under very controlled conditions (let's say: personal communications). In nature, as my hero Pete Eagleson teaches, the interaction among plant distribution, atmosphere, and rugged terrain makes any of the above coefficients heterogeneous, and the attempts to find a rationale for all of it are kind of frustrating to my eyes.
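The remark on the critical soil moisture can be checked with a small numerical comparison. The sketch below is a deliberately crude illustration: the reduction factor is 1 above a threshold and decreases linearly to zero below it (residual water content is ignored), and the numbers (pET of 5 mm/day, saturation at 0.45, critical value at 0.25) are hypothetical, not measured.

```python
def reduction_factor(theta, theta_threshold):
    """Stage one: factor 1 at or above the threshold. Stage two: linear decrease."""
    if theta >= theta_threshold:
        return 1.0
    return max(theta, 0.0) / theta_threshold


PET = 5.0              # potential ET [mm/day], hypothetical
THETA_SAT = 0.45       # volumetric water content at saturation, hypothetical
THETA_CRITICAL = 0.25  # critical water content, hypothetical

rows = []
for theta in (0.10, 0.20, 0.30, 0.40):
    two_stage = PET * reduction_factor(theta, THETA_CRITICAL)
    from_saturation = PET * reduction_factor(theta, THETA_SAT)
    rows.append((theta, two_stage, from_saturation))
    print(f"theta={theta:.2f}  two-stage={two_stage:.2f}  threshold at saturation={from_saturation:.2f}")
```

For every water content below saturation, the curve that starts decreasing at saturation gives a lower ET than the two-stage one, which is the underestimation mentioned above. How large it is depends entirely on where the critical value sits.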
Having said all this, let's go back to PT: you can have a look at the presentation below to see what happens when you coarse-grain your model resolution in time and space.
References
[1] Priestley, C.H.B. and Taylor R. J., On the assessment of surface heat flux and evaporation using large scale parameters, Monthly Weather Review, Vol. 100, No 2, 81-92,1972
[2] - Rigon, R., Evapotranspiration Slides
[3] - Rigon, R., Solar Radiation Slides
[4] - Rodriguez-Iturbe, I., Porporato, A., Ridolfi, L., Isham, V., & Cox, D. (1999). Probabilistic modelling of water balance at a point: the role of climate, soil and vegetation. Proceedings of the Royal Society, 455, 3789–3805
[5] - Rigon R., Bertoldi G. and T. M. Over, GEOtop: A distributed hydrological model with coupled water and energy budgets, Journal of Hydrometeorology, Vol. 7, No. 3, pages 371-388, 2006
[6] - Bertoldi G., R. Rigon and T. M. Over, Impact of watershed geomorphic characteristics on the energy and water budgets, Journal of Hydrometeorology, Vol. 7, No. 3, pages 389-394, 2006
Monday, December 1, 2014
Luca Brocca interview on Research Gate
Luca Brocca has recently been interviewed a lot about one of his achievements in the use of remote sensing in hydrology. He had the smart idea of obtaining rainfall from soil moisture data. His SM2RAIN is a simple algorithm for estimating rainfall from soil moisture data, which you can find on his web page together with other interesting stuff:
- MISDc Rainfall-Runoff Model (and MILc)
- Soil Moisture Model
- Neyman-Scott Rectangular Pulse Model
- FARIMA
- Satellite vs in situ soil moisture observations
- Long-term (1989-2012) soil moisture data set
The paper that generated a big wave was:
Soil as a natural rain gauge: Estimating global rainfall from satellite soil moisture data, available here. He also had the honour of a Nature Research Highlight mention. All of this deserves a mention by itself. However, he was also so kind as to mention me in this recent Research Gate interview. Thank you, Luca!
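The idea of obtaining rainfall from soil moisture can be illustrated by inverting a simple soil water balance. The sketch below is not Luca's code. It assumes a balance of the form p = Z* · ds/dt + a · s^b, where s is the relative saturation, Z* a soil water capacity and a · s^b a drainage term, with evapotranspiration and runoff neglected. The parameter values are hypothetical, and the synthetic test simply checks that a known rainfall series is recovered from the soil moisture it generates.

```python
def simulate_saturation(rain, s0, dt, z_star, a, b):
    """Forward bucket model: relative saturation driven by rain and drainage."""
    saturation = [s0]
    for p in rain:
        drainage = a * saturation[-1] ** b
        saturation.append(saturation[-1] + dt * (p - drainage) / z_star)
    return saturation


def rainfall_from_saturation(saturation, dt, z_star, a, b):
    """Invert the same balance: p = Z* ds/dt + a s^b, negative values set to 0."""
    rain = []
    for s_prev, s_next in zip(saturation, saturation[1:]):
        p = z_star * (s_next - s_prev) / dt + a * s_prev ** b
        rain.append(max(p, 0.0))
    return rain


# Hypothetical parameters: Z* [mm], a [mm/day], b [-], time step [day]
params = dict(dt=1.0, z_star=80.0, a=10.0, b=3.0)
true_rain = [0.0, 12.0, 0.0, 0.0, 5.0]  # mm/day, invented
saturation = simulate_saturation(true_rain, s0=0.5, **params)
estimated_rain = rainfall_from_saturation(saturation, **params)
print([round(p, 3) for p in estimated_rain])
```

With real satellite data the parameters have to be calibrated and the soil moisture signal is noisy, so the inversion is far less clean than in this synthetic round trip.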
Sunday, November 30, 2014
H2O - The Java way to Machine Learning
I discovered H2O thanks to the R community, where I found a post about the connection of this tool with R. Besides my interest in Machine Learning and Statistics (Data Science), which has been increasing during this year (yes, I have not done anything with it yet: but I need to learn first, and I take years to do it ;-( ), I was intrigued by the fact that it is implemented in Java, is said to be fast, has connections to R, Scala and Python, and has Joshua Bloch among its advisors, so, I guess, it should be REALLY good Java to learn from. Obviously I added it to my previous overviews of Java and R tools for hydrologists.
If they are really as good as they promise, there are several things that we can copy from them: the connection to R, their classes for reading data, their strategies for doing math.
The tool is open source; information and downloads can be found at the H2O site. During the H2O conference, the famous statisticians Trevor Hastie and John Chambers gave interesting talks that can be found here.
Thursday, November 27, 2014
Ning Lu lectures on hillslope processes and (especially) stability, at the Summer School on Landslides
In 2013 the University of Calabria organised a very interesting School on Landslide triggering (many thanks to Lino Versace, Giovanna Capparelli and Giuseppe Formetta). I actually gave a hand in organising it, and I also gave a lecture on the Richards equation. While waiting for the official posting of the lectures on the school site (after which I will remove my videos), I could not wait any longer to have the lectures by Ning Lu online. He gave four talks taken from his beautiful book, Hillslope Hydrology and Stability, written with Jonathan Godt, new coordinator of the USGS landslide hazards program and former co-advisor of my Ph.D. student Silvia Simoni (her thesis here). A must-watch for anyone in the field!
First talk: A brief conceptual history of soil hydrology and soil mechanics (from Chapter 6 of his book)
Second talk: About Steady infiltration (Chapter 3 of his book)
Third talk, part I: Strength of hillslope materials (Chapter 7 of his book)
Third talk, part II: Hydro-mechanical properties of hillslopes (Chapter 8 of his book)
Fourth talk, part I: Failure surfaces (Chapter 9 of his book)
Fourth talk, part II: Field based stability analysis (Chapter 10 of his book)
Saturday, November 22, 2014
JGrass-NewAGE history - Version zero and version one
JGrass-NewAGE (from now on, simply NewAGE) was conceived when the Adige River Authority requested a model of the river Adige to help with the management of droughts. We decided to name the project Nuovo Adige: “Nuovo” since we had already implemented a model for the same river almost twenty years before, and that was the Adige model. Because the English translation of “Nuovo Adige” sounds very much like “new age”, this soon became the name of the model. The first model was implemented mostly by the group headed by Alberto Bellin for the hydrology and by Aronne Armanini for the hydraulic part (I had some part in it, designing the file organisation required by the model and making some suggestions about the geomorphological unit hydrograph approach). The new model, however, had another ancestor: the real-time model operational at the Civil Protection of the Province of Trento, also implemented by Alberto Bellin and collaborators, and including a snow model by some of my former students (Hydrologis, and Stefano Endrizzi).
When I got the project, my first thought was that I had to abandon my beloved GIUH models for a more general one. We needed to estimate the discharge at multiple points of the river network, while a GIUH gives the discharge just at the outlet of a basin. This called, at least, for a generalisation of the usual GIUH to a system in which multiple GIUHs were used, one for each subbasin. (Well, the GIUH also has some limitations, but I will come back to this in a future post.)
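The idea of using one unit hydrograph per subbasin, so that discharge can be estimated at any node of the network, can be sketched as follows. This is a toy illustration, not the NewAGE implementation: each subbasin gets a discretised exponential unit hydrograph (a stand-in for a true geomorphological one), channel routing is reduced to a pure delay in time steps, and the network, the parameters and the rainfall volumes are all hypothetical.

```python
import math


def exponential_iuh(mean_residence, n_steps):
    """Discretised exponential unit hydrograph, normalised to sum to 1."""
    weights = [math.exp(-k / mean_residence) for k in range(n_steps)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve(rain, iuh):
    """Discrete convolution of effective rainfall with a unit hydrograph."""
    out = [0.0] * (len(rain) + len(iuh) - 1)
    for i, r in enumerate(rain):
        for j, u in enumerate(iuh):
            out[i + j] += r * u
    return out


def path_lag(name, target, subbasins):
    """Channel delay (in steps) from `name` to `target`, None if not upstream."""
    lag = 0
    while name != target:
        lag += subbasins[name]["lag"]
        name = subbasins[name]["downstream"]
        if name is None:
            return None
    return lag


def discharge_at(target, subbasins, rain, n_iuh=50):
    """Sum the delayed responses of `target` and of every subbasin upstream."""
    total = []
    for name, props in subbasins.items():
        lag = path_lag(name, target, subbasins)
        if lag is None:
            continue
        iuh = exponential_iuh(props["residence"], n_iuh)
        response = [0.0] * lag + convolve(rain[name], iuh)
        total.extend([0.0] * (len(response) - len(total)))
        for k, q in enumerate(response):
            total[k] += q
    return total


# Hypothetical network: D drains to B, B and C drain to the outlet.
subbasins = {
    "outlet": {"downstream": None, "lag": 0, "residence": 3.0},
    "B": {"downstream": "outlet", "lag": 2, "residence": 4.0},
    "C": {"downstream": "outlet", "lag": 1, "residence": 2.0},
    "D": {"downstream": "B", "lag": 3, "residence": 5.0},
}
rain = {  # effective rainfall volumes per time step, invented
    "outlet": [1.0, 0.0, 0.0],
    "B": [0.0, 2.0, 0.0],
    "C": [0.5, 0.5, 0.0],
    "D": [0.0, 0.0, 3.0],
}
q_outlet = discharge_at("outlet", subbasins, rain)
q_b = discharge_at("B", subbasins, rain)
print(round(sum(q_outlet), 6), round(sum(q_b), 6))
```

Because each unit hydrograph sums to one, the volume passing through a node equals the effective rainfall fallen upstream of it, which is a quick sanity check of the bookkeeping.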
If one browses the slides presented at the beginning of the project (in Italian), one can also notice that the emphasis of the project was not only on the models of the physical processes, either conceptual or physically based, but on the whole modelling infrastructure, including a database for storing the data (even the geographic data, both input and output) and a visualisation system based on a GIS (uDig), aimed at growing into a Decision Support System (DSS).
Since the early nineties, I had in fact been intrigued by the image of a DSS, found in a lost book of proceedings, which included a database, a visualisation system, and models, and in which all of the needed concepts were already developed. Twenty years or so later I am still asking myself why the people who envisioned the picture never actually realised it in practice: but probably it was because a lot of tools (which I spent years creating consistently) were missing.
Another key point in the presentation was the inclusion in the modelling of infrastructures like water intakes, withdrawals, dams and any other devices. An explicit treatment of flow in channels, through a solid algorithm [Casulli, 1990] solving the 1D de Saint-Venant equation, was also implemented.
These features were known to be necessary to account for human action which, potentially, during low flows, could withdraw all the river water for irrigation and other uses [a clarification of the concept of Anthropocene at this local scale].
After the experience made with GEOtop, it was clear that the modelling of such systems could no longer be implemented in the classical way, as a monolithic program. We therefore looked at OpenMI as a system for pursuing a strategy of modelling by components. The presentation given at the CUAHSI 2008 biennial meeting (a must-read, which also includes some considerations about GEOtop) clarifies these and other design issues.
From the point of view of the mathematical description of the processes, the plan was: to “recycle” the snow model of the real-time Adige model; to absorb the GEOtransf model [e.g. Majone et al., 2010] into the picture; to implement the Penman-Monteith scheme for the estimation of evapotranspiration (see also here the lectures by Dara Entekhabi); and to recover all the tools of the Horton Machine for the geomorphometric analyses required for basin delineation. An ambitious idea was to include directly in the modelling, through the use of GEOtools, the formats suitable for a direct GIS representation.
Unfortunately, the GEOtransf model was not available, mainly due to licensing issues, so we had to change the direction of our efforts, and we adopted Chris Duffy's model [Duffy, 1996] as implemented in Mantilla’s CUENCAS model. A couple of spatial interpolators for data measured at hydro-meteorological stations were implemented, i.e. an ordinary Kriging and Just Another Model Interpolator (JAMI), a simple, bare-bones, robust method (one which never breaks) for making measurements available at any point of the catchment. Finally, an original description of the river network, describing its topology and governing the order of execution of the various modules, was implemented (here an early view of the watershed description, given as a poster at EGU Wien 2008, and here a more mature paper, just on the generalised Pfafstetter numbering in NewAGE).
We can call the above version zero of NewAGE, which actually never became operational. It needed a testing and calibration phase that, for various reasons, could not be pursued. The financial support from the River Basin Authority ended, and the database, the model components, and whatever else had been developed were shut in a drawer. A big missed opportunity to have a new type of model working on the river Adige! What actually remains of version zero has to be archaeologically retrieved. But we are doing it.
However, I (we) did not give up. Eventually we abandoned OpenMI in favour of OMS, for the reasons explained in a previous post. The porting to OMS was also done by Hydrologis. We also reduced our scope and concentrated just on the model components, to improve them and verify their correspondence to reality.
To reach this goal, Giuseppe Formetta, at the start of his Ph.D., implemented goodness-of-fit (GoF) methods and the calibration methods DREAM and Particle Swarm, which became two new OMS components. With them, a systematic analysis of the other components started.
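As a reminder of what a goodness-of-fit index does in this context, here is a minimal Python sketch of the Nash-Sutcliffe efficiency, one commonly used index. Which indices the NewAGE GoF component actually provides is not stated here, so this is only an assumption chosen for illustration, and the discharge values are invented.

```python
def nash_sutcliffe(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than
    using the mean of the observations as the prediction."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


observed = [1.0, 3.0, 7.0, 4.0, 2.0]   # invented discharges
simulated = [1.2, 2.5, 6.0, 4.5, 2.1]  # invented model output
nse = nash_sutcliffe(simulated, observed)
print(round(nse, 3))
```

A calibration method such as Particle Swarm then searches the parameter space for the set that maximises an index like this one.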
We soon realised that Duffy’s model was not easy to calibrate (or maybe, at that time, we were not experienced in using the tools), and Giuseppe decided to implement an entirely new module, based on Hymod. This was successfully accomplished, and the results are reported in Formetta et al., 2011.
Functional to this work was the incremental improvement of the Kriging components, which now include, besides Ordinary Kriging, Detrended Kriging, their local versions (which use not all the measurement points but just the nearest neighbours), and five variogram models that can be fitted automatically at any time step for which data are available (also the work of Giuseppe).
The refocused development of NewAGE was described in a concept paper actually published this year, 2014.
NewAGE zero had a radiation simulation; however, the component for short wave radiation was entirely rewritten on the basis of Javier Corripio’s work. The implementation was initiated by Daniele Andreis, and enhanced, cleaned, and completed by Giuseppe Formetta with various contributions on how to estimate the attenuation of radiation by atmosphere and clouds. All of this work is summarised in this GMD paper (for a refresher on solar radiation, you can look at the slides referred to here).
On the basis of the work on radiation, components for snow accumulation and melting were built and documented in another paper, whose case study is the Cache La Poudre basin close to Fort Collins, CO: another piece of Giuseppe Formetta's Ph.D. thesis.
The current state of the art of NewAGE also includes the implementation of a new version of the Penman-Monteith formula, its FAO counterpart, and the Priestley-Taylor formula for the estimation of evapotranspiration (also implemented by Giuseppe F.). The potential of these latter components has not yet been exploited as it could be. But we will do it.
At present, therefore, NewAGE consists of a set of components that can be used to simulate the whole hydrological cycle for any basin, from a few square kilometres to continental-scale rivers. We can call the current version JGrass-NewAGE version 1, but it is actually a set of components that can be arranged in various modelling solutions, for analyses that could not easily be obtained with more traditional models.
The story continues, and soon other components will be made available to the public; those available now are listed in a previous post, here. A different view on the same concepts presented here can be found here (seen from the visualisation and informatics perspective) and here (the idea of building a physico-statistical model). If you hold on and look at the recent presentation given at Fort Collins, you can see a rework of the ideas developed at CUAHSI in 2008 and complete your overview.
The source code of the system can be found on Github.
References
V Casulli, Semi-implicit finite difference methods for the two-dimensional shallow water equations, Journal of Computational Physics 86 (1), 56-74, 1990
CJ Duffy, A Two‐State Integral‐Balance Model for Soil Moisture and Groundwater Dynamics in Complex Terrain, Water resources research 32 (8), 2421-2434, 1996
Formetta, G.; Mantilla, R.; Franceschi, S., Antonello A., Rigon R., The JGrass-NewAge system for forecasting and managing the hydrological budgets at the basin scale: models of flow generation and propagation/routing, Geoscientific Model Development Volume: 4 Issue: 4 Pages: 943-955, DOI: 10.5194/gmd-4-943-2011, 2011
Formetta G., Antonello A., Franceschi S., David O. and Rigon R., The informatics of the hydrological modelling system JGrass-NewAge, 2012 International Congress on Environmental Modelling and Software Managing Resources of a Limited Planet, Sixth Biennial Meeting, Leipzig, Germany R. Seppelt, A.A. Voinov, S. Lange, D. Bankamp (Eds.) http://www.iemss.org/society/index.php/iemss-2012-proceedings, 2012
Formetta G., Rigon R., Chavez J.L., David O., The short wave radiation model in JGrass-NewAge System, Geosci. Model Dev., 6, 915-928, 2013, www.geosci-model-dev.net/6/915/2013/, doi:10.5194/gmd-6-915-2013
Formetta G., Antonello A., Franceschi S., David O., and Rigon R., Hydrological modelling with components: A GIS-based open-source framework, Environmental Modelling & Software, 55 (2014), 190-200
B Majone, A Bertagnoli, A Bellin, A non-linear runoff generation model in small Alpine catchments, Journal of hydrology 385 (1), 300-312, 2010
Thursday, November 20, 2014
At the Disasters' School of Dr. Unavoidable (after the recent flooding in Italy) - by Stefano Benni
Tagged with #La Repubblica 19 novembre 2014#Stefano Benni
( I am sorry for the bad English, but here it is the satyric article by Stefano Benni)
- Our office is looking for new forms of communication to help the Italians to accept that climate change with patience. For example, we coined the term "water bomb". It is obvious that against the old showers once you could do something, but against a water bomb there is nothing to do. The fault lies with clouds militarized and bellicose. They used to say: here comes the bad weather. Now they say: it comes the cyclone Charon, the anticyclone Polyphemus, hurricane Cynthia. All feel inside an epic event, or like waiting for a nosy friend. Close the door firmly, Cynthia arrives. And. please, stop with ideological speculations, be quantitative! When I say that in a place they fell 200 mm of water, that is, as it usually rains in a month in Caracas, I explain mathematically the misfortune of what happened. It is not true that we do not bustle, we have detection systems at state-of-art and tiring ... you know how much time is lost to collect two hundred millimeters of water with a spoon?
- Climate change is known. Against flooding, illegal buildings, landslides, can't you do prevention?
- Unfortunately we do not have the money for prevention. We need to spend them to repair the damage of what we do not have prevented. If we spent the money for prevention, then we would not have the money to repair the damage -.
- But perhaps preventing there would be not damages ...
- This is a bizarre aspect of the matter, we are studying it. But we do a lot of prevention. For example, in fifty years television weather forecasts increased from three to three thousand a day, and the graphics are much improved. Another example, if homes are built in a geologically hazardous us ...
- Do not allow to build, and move people away -.
- No, we can not intervene, it would take the army. But we immediately say that they are abusive. Then we condone them. Indeed, from now on we think of condone them even before they build illegally. It's not a great idea?
- You think in a strange way. And the flooding?
- We were not ready. Once the rivers "came out of the bed," "inundated", "overflowed", "overrun". But now do a new thing "esondano". We did not expect -.
- C'mon it's the same thing. The Po river "esonda" and floods, but it has already done so many times -.
- Of course, the Po can do it, it is a great river. But now any creek or channel feels entitled to overflows. We can not check them all, it seems that they do it on purpose -.
- And the banks? The work of containment? Reforestation?
- You see, if I have to build Milan Expo or ghost palaces at La Maddalena, I have no obstacles, large contracts must be shipped quickly and, with a little of bribes, everything speeds up. But every time there is a contract for a levee, for a consolidation effort to dredge a river, firms in competition sue, they appeal to the TAR, everything is delayed. It is not our fault. We should entrust the hydrogeological hazard prevention to a pool (in English in the original), or to the mafia or to FIAT, and then things would go quick. But they do not let you do it-.
- So in the future will be even worse?
- It depends on how we see the situation. We are preparing a new approach to scientific and media. First we created the event ω, omega event -.
- What is it?
- The event ω - omega is a type of occurrence very rare and unpredictable. For example, the rain of Genoa, a clash between comets, an internet connection to work properly, a soccer arbitrage without controversy. These exceptional events we can deal with only one way -.
- What's that?
- You see the shape of the omega, what would you remember? We hope in our ass, and above all, we politicians must have a face like an ass -.
- I do not see much of a prevention in it-.
Stefano Benni
( I am sorry for the bad English, but here it is the satyric article by Stefano Benni)
About the latest climate catastrophes of our country, we interviewed an expert little known, but which plays an important role. Is Dr. Unavoidable, responsible of UTMA, Ufficio Tutela Mutamento Ambientale (Protection Bureau against Environmental Change).
- Dr. Unavoidable, first let us define the role of your office. That is clearly to protect the soil and Italian citizens against climate disasters -.
- No, I correct. Our office is responsible for protecting and maintaining the situation of environmental destruction, preventing solutions that would create costly and utopian expectations -.
- Excuse me, but why?
- For many reasons. First, because the environmental change requires adaptation, and as long as the Italian people do not get used to collapses, flooding and landslides they will always be scared and insecure. And because the climate disaster is unavoidable, it becomes necessary a new culture, which is that the inevitability -.
-Do some examples ...
- Our office is looking for new forms of communication to help the Italians to accept that climate change with patience. For example, we coined the term "water bomb". It is obvious that against the old showers once you could do something, but against a water bomb there is nothing to do. The fault lies with clouds militarized and bellicose. They used to say: here comes the bad weather. Now they say: it comes the cyclone Charon, the anticyclone Polyphemus, hurricane Cynthia. All feel inside an epic event, or like waiting for a nosy friend. Close the door firmly, Cynthia arrives. And. please, stop with ideological speculations, be quantitative! When I say that in a place they fell 200 mm of water, that is, as it usually rains in a month in Caracas, I explain mathematically the misfortune of what happened. It is not true that we do not bustle, we have detection systems at state-of-art and tiring ... you know how much time is lost to collect two hundred millimeters of water with a spoon?
- Climate change is known. Against flooding, illegal buildings, landslides, can't you do prevention?
- Unfortunately we do not have the money for prevention. We need to spend it to repair the damage from what we did not prevent. If we spent the money on prevention, then we would not have the money to repair the damage -.
- But perhaps with prevention there would be no damage ...
- This is a bizarre aspect of the matter, we are studying it. But we do a lot of prevention. For example, in fifty years television weather forecasts have increased from three to three thousand a day, and the graphics are much improved. Another example: if homes are built in a geologically hazardous area ...
- Do not allow to build, and move people away -.
- No, we cannot intervene, it would take the army. But we immediately declare them illegal. Then we grant them an amnesty. Indeed, from now on we are thinking of granting the amnesty even before they build illegally. Isn't it a great idea?
- You think in a strange way. And the flooding?
- We were not ready. Once the rivers "came out of the bed," "inundated", "overflowed", "overran". But now they do a new thing: they "esondano". We did not expect it -.
- C'mon it's the same thing. The Po river "esonda" and floods, but it has already done so many times -.
- Of course, the Po can do it, it is a great river. But now any creek or channel feels entitled to overflow. We cannot check them all, it seems that they do it on purpose -.
- And the banks? The work of containment? Reforestation?
- You see, if I have to build the Milan Expo or ghost palaces at La Maddalena, I have no obstacles: large contracts must be dispatched quickly and, with a few bribes, everything speeds up. But every time there is a contract for a levee, for consolidation work, or to dredge a river, competing firms sue, they appeal to the TAR, and everything is delayed. It is not our fault. We should entrust hydrogeological hazard prevention to a pool (in English in the original), or to the mafia or to FIAT, and then things would go quickly. But they do not let you do it -.
- So in the future it will be even worse?
- It depends on how we see the situation. We are preparing a new scientific and media approach. First we created the event ω, the omega event -.
- What is it?
- The event ω - omega is a very rare and unpredictable type of occurrence. For example, the rain of Genoa, a clash between comets, an internet connection that works properly, a soccer refereeing decision without controversy. We can deal with these exceptional events in only one way -.
- What's that?
- You see the shape of the omega, what does it remind you of? We trust in our ass, and above all, we politicians must have a face like an ass -.
- I do not see much of a prevention in it-.
- The government does not have to do prevention. It already has too much to worry about with European banks and shopping sprees in dildos. It is the people who must assume their responsibility with regard to climate change. We have forgotten that Homo sapiens comes from water, that we are born amphibians. We must be ready to return to our natural element. In every Italian house there must be at least a raft or a boat, life jackets for everyone and a wetsuit ("muta" in Italian, a word derived from mutare, "to change"), and also a snorkel and fins. Stop complaining that the subway is flooded! Dive! This is what it means to be good citizens ...
- But we have been waiting for years for a new Hydro-geological Plan -.
- And we have many new ideas. Against the otters that eat away the banks, we will introduce dozens of crocodiles into the rivers. Bed and breakfasts in the craters of volcanoes will be prohibited. The committee for earthquake risk will be replaced by a fortune-teller. To prevent flooding, homes will be built with only the fifth floor. To avoid complaints about train delays, the schedules in the stations will be written in Chinese. But above all, from now on, the fuchsia code applies all over the country, which means that we are always in an emergency. If you go by car, on foot, by bike, fuck you. You were warned.
- So you think Italians will have to get used to the disasters?
- Yes, they will have to live them happily, because they are the unavoidable future. Farewell Mediterranean climate, we have entered the Omega climate. Excuse me, but I am being called on the phone -.
- Dr. Unavoidable, this is your secretary. They say that the road is flooded and your car was swept away ...
- How? But it's a scandal! What has happened?
- Excuse me, but 132 millimeters of rain have fallen, the garage is flooded as usual and the culverts are clogged -.
- That's enough with this bullshit of millimeters of rain! Where are the firemen? The culverts clogged, what a scandal! My new Mercedes. What is the government doing?
- Excuse me doctor, but the government is you, and you just told us that we need to adapt to the omega climate -.
- Who gives a damn, the car is mine. Where are my fishing boots and the duck life jacket? What kind of shitty country do we live in? And as for the omega event, you know what?
- I can imagine ... thanks for the interview, Dr. Unavoidable.
Stefano Benni
Sunday, November 9, 2014
Design Patterns
Programming the object-oriented (OO) way is not simply writing down algorithms that do "for" loops. The core concept to understand, for an OO programmer, is how many classes have to be implemented on the basis of the analysis of the problem under scrutiny, to be eventually managed by a client (the "main" method in the C family of languages) to solve a task. Therefore, a series of questions arises:
- Are there strategies for producing a minimum number of classes without losing functionality in the work done, while promoting its extensibility?
- What goes into a class, and what into other OO features (like the methods of a class)?
- How to create classes with a minimum of generality that can be reused in other problems?
- How to build classes that can be easily maintained, modified and evolved without disrupting other parts of the code?
An entire discipline, software engineering (SE), was established to find the answers, and various behaviors were codified to improve software writing practice and management (but software engineering also covers the organisation of software production, and the methods to give clear specifications to pariah-programmers for practical coding).
The key actions implied by the answers, however, are not the ones a scientific programmer is used to facing: s/he would expect to have a formalised mathematical problem, and the scope of her/his work to consist in finding the best (shortest, fastest, cleanest) algorithms to solve it (see the classic books by Knuth, or the popularisation in Numerical Recipes - I hate their licensing scheme), not in answering questions about the organisation of the code.
Maintainability and efficiency come at a cost, and OO adds a further level of complexity to programming practice that scientists are not always ready to face. It is quite paradoxical that these aspects of SE are not widely recognised as fundamental in our times, when computer programs have entered the daily practice of many scientists (see also the introduction to this paper of ours). As a matter of fact, bad coding practices easily develop into bad science.
Back to the general questions posed at the beginning of the post: proficient programmers observed that certain problems were recurrent and that some solutions were better than others in terms of maintainability and generality. These were called "design patterns":
"The elements of this language are entities called patterns. Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice." (from A Pattern Language by Christopher Alexander, which is said to have inspired the Gang of Four book - see below).
Other definitions of patterns can be found in the Portland Patterns Repository and in The Hillside group on patterns.
The patterns idea gained popularity after the book by Gamma, Helm, Johnson & Vlissides, Design Patterns, which presents 23 patterns, grouped in three categories, called creational, structural and behavioural patterns.
Beyond these initial patterns, others were identified, for instance for parallel computing (see Concurrency patterns), and in other fields.
The style, use and concept of patterns are not immediate to grasp. They derive from long practice with specific software issues, followed by conceptualisation and abstraction. As often happens, their abstraction makes them general but quite difficult to assimilate without going back and working through many examples.
A few guiding ideas are behind the selection of patterns: code should be easy to modify without large refactoring efforts, encapsulation of code parts should be maintained as much as possible, and subclassing should be kept to a minimum. An explicit slogan was "program to an interface, not an implementation". Roughly speaking this means: first write abstract classes or interfaces (in the C family of languages), then, as far as possible, delay the choice of concrete classes to run time. Instead of creating subclasses, create other classes to which to "delegate responsibility", in order to reduce coupling between classes. The reader is invited to browse the web to understand for her/himself what this means, with the warning that explanations are usually full of computer science slang (I actually think that the language is part of the success of the book).
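To make the slogan a little less abstract, here is a minimal sketch in Python of "programming to an interface" and of delegation. Everything in it is hypothetical and invented for illustration: the class names (EvapotranspirationModel, ConstantET, LinearRadiationET, Bucket) and the toy formulas do not come from any of our models or from the book.

```python
from abc import ABC, abstractmethod


class EvapotranspirationModel(ABC):
    """Interface (hypothetical): what any ET formula must offer."""

    @abstractmethod
    def rate(self, temperature_c: float, radiation_mj: float) -> float:
        """Return potential evapotranspiration in mm per time step."""


class ConstantET(EvapotranspirationModel):
    """Simplest concrete class: always the same value."""

    def __init__(self, value_mm: float) -> None:
        self.value_mm = value_mm

    def rate(self, temperature_c: float, radiation_mj: float) -> float:
        return self.value_mm


class LinearRadiationET(EvapotranspirationModel):
    """Toy formula, proportional to radiation (not a published method)."""

    def __init__(self, coefficient: float) -> None:
        self.coefficient = coefficient

    def rate(self, temperature_c: float, radiation_mj: float) -> float:
        return max(0.0, self.coefficient * radiation_mj)


class Bucket:
    """Client: knows only the interface and delegates the ET computation."""

    def __init__(self, et_model: EvapotranspirationModel,
                 storage_mm: float = 0.0) -> None:
        self.et_model = et_model      # concrete class chosen at run time
        self.storage_mm = storage_mm

    def step(self, rain_mm: float, temperature_c: float,
             radiation_mj: float) -> float:
        potential = self.et_model.rate(temperature_c, radiation_mj)
        # Actual ET cannot exceed the water available in the bucket.
        actual = min(potential, self.storage_mm + rain_mm)
        self.storage_mm += rain_mm - actual
        return self.storage_mm


if __name__ == "__main__":
    bucket = Bucket(ConstantET(2.0), storage_mm=10.0)
    print(bucket.step(rain_mm=5.0, temperature_c=20.0, radiation_mj=15.0))
```

The point of the design is that Bucket is never subclassed to change the evapotranspiration formula: a different object implementing the same interface is passed in, so the two parts can evolve separately.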
In any case, the right way to get used to Design Patterns is, as I said, to use and practice them a lot by trial and error. Java aficionados can find several popularisations, many of them on the web [Ava Java].
However, the relevant questions here, in this blog, are:
- Which of the original (as well as other) patterns are useful in scientific programming?
- Are there any patterns that are characteristic of hydrological problems?
Certainly there are hydrologists who apply some patterns to the tasks the patterns were created for. I do not know whether experienced hydrology programmers apply patterns to problems specific to hydrology. Please let me know if you are one of them: I would really like to exchange ideas and (my very few) experiences.
In the general science framework, instead, I find a few more references. The first books I can reference are:
- Design Patterns for e-Science, by Henry Gardner and Gabriele Manduchi, Springer, 2007 (it covers the building of a system for visualising experiments from plasma physics)
- Scientific Software Design: The Object-Oriented Way, by Damian Rouson, Jim Xia and Xiaofeng Xu, 2011 (it covers topics closer to the implementation of algorithms; unfortunately, despite the code in C++, it is pretty oriented to FORTRAN)
Other resources come from the paper by the Izaguirre group, and are referenced below.
Further Readings
- Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides: Design Patterns: Elements of Reusable Object-Oriented Software
- Thinking in Patterns, notes by Bruce Eckel
- [Portland Patterns Repository] - Patterns are the recurring solutions to the problems of design. People learn patterns by seeing them and recall them when need be without a lot of effort. Patterns link together in the mind so that one pattern leads to another and another until familiar problems are solved. That is, patterns form languages, not unlike natural languages, within which the human mind can assemble correct and infinitely varied statements from a small number of elements.
- [The Hillside group on patterns] - Fundamental to any science or engineering discipline is a common vocabulary for expressing its concepts, and a language for relating them together. The goal of patterns within the software community is to create a body of literature to help software developers resolve recurring problems encountered throughout all of software development. Patterns help create a shared language for communicating insight and experience about these problems and their solutions. Formally codifying these solutions and their relationships lets us successfully capture the body of knowledge which defines our understanding of good architectures that meet the needs of their users. Forming a common pattern language for conveying the structures and mechanisms of our architectures allows us to intelligibly reason about them. The primary focus is not so much on technology as it is on creating a culture to document and support sound engineering architecture and design
- [Design Pattern for scientific software] by Dan Gezelter. He pointed to some papers on Design Patterns in scientific computing by the Izaguirre group, of which I found [1] and [2]
- Design Patterns in Wikipedia
Friday, October 31, 2014
Research Reproducibility on Nature
Luca Brocca brings to my attention two papers on Research Reproducibility and the policy followed by Nature journal.
They are the editorial "Code Share" and the paper "Open Code for Open Science" by Steve M. Easterbrook
Thursday, October 30, 2014
Fifth Water Conference Selected Presentations on Impacts of Climate Change on Alpine Regions
As it does every two years, the Alpine Convention organised a "Water Conference" to assess the results of the Water Platform, which I had the honour to head in 2013-2014.
The fifth Water Conference was entitled: "Water in the Alps - and beyond; Adapting alpine and mountain river basins to climate change".
The event is going to be jointly promoted and developed by the Alpine Convention and the UNECE Water Convention, in order to favour the creation of synergies and the exchange of experiences (i.e. among the Alpine territory, the Carpathians, the Caucasus, Central Asia,...).
The 5th Water Conference is intended to provide to a wide audience of experts, administrators, practitioners and stakeholders the state of the art, the best practices and the main findings about adaptation to climate change in the mountain trans-boundary river basins.
Different panels of experts will illustrate the main results of the last years of activities in the respective conventions.
Furthermore, updated high-level information on climate change and adaptation strategies will be provided, together with the results of some relevant projects of European Territorial Cooperation on the issues.
Finally, a special focus will be devoted to the implementation of the measures of flood management, in the EU flood directive.
The whole set of presentations given during the conference can be found here. However, I would like to draw attention to a few outstanding ones, related to the impacts of climate change on the water cycle:
- M. Beniston, Changes in Alpine water resources: examples from the EU «ACQWA» Project^1
- A. Bellin, The impacts of climate change on water cycle
- S. Gruber, The changing mountain cryosphere: impacts on sediment and solute transport
- U. Morra di Cella, Permafrost in the Alps: the experience of PermaNET project “Long-term permafrost monitoring network”
- G. Rosatti, Modelling sediment transport in Alpine rivers subject to climate changes
^1 ACQWA Project
Wednesday, October 22, 2014
Breakthrough lectures at the University of Saskatchewan
A remarkable initiative has been started at the University of Saskatchewan, under the impulse of Jeff McDonnell. He invited many top hydrological scientists to express their opinions and ideas about various topics in modern hydrologic research. Fortunately, the lectures are subsequently posted on a dedicated YouTube channel.
I have to say that I do not share all the ideas presented. However, they constitute a corpus that is useful to know.
The video lectures can be found here.
Thursday, October 9, 2014
A couple of new things from Hydrologis
My former students at Hydrologis, with whom I collaborated in building the Horton Machine contained in the uDig Spatial Toolbox, recently came out with a couple of pieces of good news.
The first is Stage, an application that makes the Spatial Toolbox available on its own while the migration of uDig to Location Tech is ongoing. Besides, it offers a way to save and store Geopaparazzi projects on your personal computer. In the words of Hydrologis:
"The new Spatial Toolbox And Geoscripting Environment is a web application based on the RAP.
The RAP ecosystem exploits the Server-Side Equinox project, which integrates an OSGI engine with classic Servlet Techniques. The RAP framework allows for the development of web applications by means of the java language and supplying a subset of the Eclipse RCP libraries and plugins.
Basing on this technology it is possible to run S.T.A.G.E. both locally or remote and execute modules from the JGrasstools library as well as OMS3 annotated java classes. The modules are executed on the serverside and provide progress feedback to the user as the processing proceeds.
The user interface for the modules is generated on the fly from the code annotations.
S.T.A.G.E. opens possibilities for the execution of remote processes. Servlets can be added to execute modules or synchronize data with a central database instance. Modules can be executed via simple http POST requests, data can be analyzed and filtered from any connected device or platform.
The modular nature of the application makes it possible to simply enable functionalities by adding plugins to the installation. This allows for a great deal of customization of the application for the exact purpose of the project involved."
A more extensive description of what Stage does can be found here.
The second is Lesto, a set of tools for extracting features from LIDAR data, which is part of the work of Silvia Franceschi for her Ph.D. at the University of Bolzano. Information about Lesto can be found here.
For those interested in Geopaparazzi, the latest tutorial is here.
For getting more information, please contact info <at> hydrologis.com
Wednesday, October 8, 2014
CISLAM
CISLAM is the simplified hydrological model produced by Cristiano Lanni during his Ph.D., which was devoted to the study of landslide triggering. The theory behind the code is discussed in a Hydrological Processes paper. The original code was written in R, but Marco Foi ported it to JGrasstools during a fortunate Google Summer of Code.
You can find a jar file ready to be used within a hacked version of JGrasstools 0.7.7: in fact, Marco had to modify JGrasstools a little to get it working. These changes have so far never been introduced in version 0.8, and therefore, to use CISLAM, it is necessary to use this version of JGrasstools:
The tool has been tested on the data set that can be found here, the same one described in the manual. Further tests would indeed be necessary.
Friday, October 3, 2014
Naming things in hydrological models
Yesterday I met with Olaf David and Scott Peckham (update: Scott gave a presentation on BMI at the OMS3 summer school in 2016). Scott is a well-known scientist among both hydrological modellers and geomorphologists: among the former because of his recent work on the CSDMS project (and his own model, Topoflow), in which he was one of the leading scientists; among the latter thanks to his work on river network topology and the construction of Rivertools, one of the best suites of tools for watershed delineation and analysis.
The reason for the meeting was friendship, and just talking about what we are doing; the meeting, which ended in one (actually two) of the small breweries of Fort Collins, was really successful.
One of the things Scott has recently been pursuing is understanding what models have inside. The approach he took was to categorise all the variables they contain and to define them as clearly as possible. His efforts are well described here.
“CSDMS asks that contributed models should be provided with a Basic Model Interface (BMI) which includes mapping input and output variable names to CSDMS Standard Names and providing model metadata. … A good introduction to the CSDMS Standard Names is provided by Peckham (2014). A somewhat outdated, high-level overview of the CSDMS Standard Names is also available as a PowerPoint presentation.”
Scott and coworkers did not forget the parallel netCDF effort with its CF convention, but he realised that its coverage of hydrology was poor, and he wanted to build the vocabulary from scratch. The effort is useful not only to his project, but also to other models and infrastructures. With our model GEOtop we started a parallel, and much more limited, work in identifying keywords related to hydrological quantities and to the control of the model’s workflow (see GEOtop’s manual), and I plan to provide soon a matching between CSDMS names and GEOtop names (and I will repeat the operation in my lectures, modifying my slides).
Having a common vocabulary for identifying things in models would certainly make it easier to choose names for quantities (even if, clearly, internal variable names should be shorter for practical purposes) and to identify code chunks that treat the same phenomena. Searching for models on the web would also be easier with standard names to search for.
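A small sketch in Python may help to see how short internal names and long standard names can coexist. It is only an illustration under explicit assumptions: the internal names ("P", "Ta", "Q"), the model names in the catalog and the long names themselves are invented here in the spirit of an object + quantity vocabulary, and have not been checked against the official CSDMS list or against GEOtop keywords.

```python
from typing import Dict, List, Set

# Hypothetical short internal names -> illustrative long names.
# The long names are NOT verified CSDMS Standard Names.
INTERNAL_TO_STANDARD: Dict[str, str] = {
    "P": "atmosphere_water__precipitation_rate",
    "Ta": "atmosphere_air__temperature",
    "Q": "channel_water__volume_flow_rate",
}


def to_standard(outputs: Dict[str, float]) -> Dict[str, float]:
    """Relabel model outputs from internal names to long standard names."""
    missing = sorted(set(outputs) - set(INTERNAL_TO_STANDARD))
    if missing:
        raise KeyError(f"no standard name for: {missing}")
    return {INTERNAL_TO_STANDARD[name]: value
            for name, value in outputs.items()}


def find_models(catalog: Dict[str, Set[str]], standard_name: str) -> List[str]:
    """Return the models in a catalog that expose a given standard name."""
    return sorted(model for model, names in catalog.items()
                  if standard_name in names)


if __name__ == "__main__":
    print(to_standard({"P": 1.2, "Q": 35.0}))
    # Hypothetical catalog of two models and the names they expose.
    catalog = {
        "model_a": {"atmosphere_air__temperature"},
        "model_b": {"atmosphere_air__temperature",
                    "channel_water__volume_flow_rate"},
    }
    print(find_models(catalog, "channel_water__volume_flow_rate"))
```

The code keeps the short names inside the model and translates only at the boundary, which is where two models (or a search engine) need to agree on what a quantity is.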
Below is a brief description of Scott’s whole effort.
While it is always a good idea to use existing standards whenever possible, CSDMS discovered that other naming conventions, such as the CF Convention Standard Names were not well-suited to the needs of component-based modeling. This section explains our motivation for developing a new standard.
This section provides some background and basic information about the CSDMS Standard Names.
This section provides numerous examples of CSDMS Standard Names, organized by the main object under consideration and its parts or "subobjects".
The CSDMS Standard Names follow an object + quantity pattern with an optional operation prefix applied to the quantity part. This section provides the basic rules for constructing CSDMS Standard Names.
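The object + quantity rule, with its optional operation prefix, can be sketched in a few lines of Python. Two details below are my assumptions and are not stated in the text above: that the object and quantity parts are joined by a double underscore, and that an operation is attached to the quantity with "_of_". The function names and the example names are hypothetical as well, so the official rules should be checked before using anything like this.

```python
from typing import Optional, Tuple

SEPARATOR = "__"  # assumption: double underscore between object and quantity


def standard_name(obj: str, quantity: str,
                  operation: Optional[str] = None) -> str:
    """Compose a name as object + [operation prefix +] quantity."""
    for part in (obj, quantity):
        if not part or " " in part:
            raise ValueError(f"invalid name part: {part!r}")
    if operation:
        # assumption: the operation is applied to the quantity via "_of_"
        quantity = f"{operation}_of_{quantity}"
    return f"{obj}{SEPARATOR}{quantity}"


def split_standard_name(name: str) -> Tuple[str, str]:
    """Split a composed name back into its object and quantity parts."""
    obj, sep, quantity = name.partition(SEPARATOR)
    if not sep or not obj or not quantity:
        raise ValueError(f"not an object{SEPARATOR}quantity name: {name!r}")
    return obj, quantity


if __name__ == "__main__":
    print(standard_name("atmosphere_water", "precipitation_rate"))
    print(standard_name("land_surface", "elevation", "time_derivative"))
    print(split_standard_name("land_surface__time_derivative_of_elevation"))
```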
This section provides a set of templates and rules for constructing the object name part of a CSDMS Standard Name.
This section provides a set of templates and rules for constructing the quantity name part of a CSDMS Standard Name. Many quantity names include the name of a physical process, and information about constructing process names, along with numerous examples, is given on the CSDMS Process Names page.
This section provides a set of templates and rules for constructing the optional operation part of a CSDMS Standard Name.
This section provides information on CSDMS Model Coupling Metadata (MCM) files and provides standardized model/variable metadata names for units, ellipsoids, datums, projections, "how modeled" and assumptions. It links to an extensive set of CSDMS Assumption Names and includes An Example Model Coupling Metadata file.
Thursday, October 2, 2014
Machine Learning
I found this informative blog post on Element of Statistic Learning, a fundamental book by Trevor Hastie and Rob Tibshirani.
I reproduce the blog post verbatim:
"In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as "machine learning"), largely due to the high quality of both the textbook and the video lectures. And as an R user, it was extremely helpful that they included R code to demonstrate most of the techniques described in the book.
If you are new to machine learning (and even if you are not an R user), I highly recommend reading ISLR from cover-to-cover to gain both a theoretical and practical understanding of many important methods for regression and classification. It is available as a free PDF download from the authors' website.
If you decide to attempt the exercises at the end of each chapter, there is a GitHub repository of solutions provided by students you can use to check your work.
As a supplement to the textbook, you may also want to watch the excellent course lecture videos (linked below), in which Dr. Hastie and Dr. Tibshirani discuss much of the material. In case you want to browse the lecture content, I've also linked to the PDF slides used in the videos.
Chapter 1: Introduction (slides, playlist)
- Opening Remarks and Examples (18:18)
- Supervised and Unsupervised Learning (12:12)
Chapter 2: Statistical Learning (slides, playlist)
- Statistical Learning and Regression (11:41)
- Curse of Dimensionality and Parametric Models (11:40)
- Assessing Model Accuracy and Bias-Variance Trade-off (10:04)
- Classification Problems and K-Nearest Neighbors (15:37)
- Lab: Introduction to R (14:12)
Chapter 3: Linear Regression (slides, playlist)
- Simple Linear Regression and Confidence Intervals (13:01)
- Hypothesis Testing (8:24)
- Multiple Linear Regression and Interpreting Regression Coefficients (15:38)
- Model Selection and Qualitative Predictors (14:51)
- Interactions and Nonlinearity (14:16)
- Lab: Linear Regression (22:10)
Chapter 4: Classification (slides, playlist)
- Introduction to Classification (10:25)
- Logistic Regression and Maximum Likelihood (9:07)
- Multivariate Logistic Regression and Confounding (9:53)
- Case-Control Sampling and Multiclass Logistic Regression (7:28)
- Linear Discriminant Analysis and Bayes Theorem (7:12)
- Univariate Linear Discriminant Analysis (7:37)
- Multivariate Linear Discriminant Analysis and ROC Curves (17:42)
- Quadratic Discriminant Analysis and Naive Bayes (10:07)
- Lab: Logistic Regression (10:14)
- Lab: Linear Discriminant Analysis (8:22)
- Lab: K-Nearest Neighbors (5:01)
Chapter 5: Resampling Methods (slides, playlist)
- Estimating Prediction Error and Validation Set Approach (14:01)
- K-fold Cross-Validation (13:33)
- Cross-Validation: The Right and Wrong Ways (10:07)
- The Bootstrap (11:29)
- More on the Bootstrap (14:35)
- Lab: Cross-Validation (11:21)
- Lab: The Bootstrap (7:40)
Chapter 6: Linear Model Selection and Regularization (slides, playlist)
- Linear Model Selection and Best Subset Selection (13:44)
- Forward Stepwise Selection (12:26)
- Backward Stepwise Selection (5:26)
- Estimating Test Error Using Mallows' Cp, AIC, BIC, Adjusted R-squared (14:06)
- Estimating Test Error Using Cross-Validation (8:43)
- Shrinkage Methods and Ridge Regression (12:37)
- The Lasso (15:21)
- Tuning Parameter Selection for Ridge Regression and Lasso (5:27)
- Dimension Reduction (4:45)
- Principal Components Regression and Partial Least Squares (15:48)
- Lab: Best Subset Selection (10:36)
- Lab: Forward Stepwise Selection and Model Selection Using Validation Set (10:32)
- Lab: Model Selection Using Cross-Validation (5:32)
- Lab: Ridge Regression and Lasso (16:34)
Chapter 7: Moving Beyond Linearity (slides, playlist)
- Polynomial Regression and Step Functions (14:59)
- Piecewise Polynomials and Splines (13:13)
- Smoothing Splines (10:10)
- Local Regression and Generalized Additive Models (10:45)
- Lab: Polynomials (21:11)
- Lab: Splines and Generalized Additive Models (12:15)
Chapter 8: Tree-Based Methods (slides, playlist)
- Decision Trees (14:37)
- Pruning a Decision Tree (11:45)
- Classification Trees and Comparison with Linear Models (11:00)
- Bootstrap Aggregation (Bagging) and Random Forests (13:45)
- Boosting and Variable Importance (12:03)
- Lab: Decision Trees (10:13)
- Lab: Random Forests and Boosting (15:35)
Chapter 9: Support Vector Machines (slides, playlist)
- Maximal Margin Classifier (11:35)
- Support Vector Classifier (8:04)
- Kernels and Support Vector Machines (15:04)
- Example and Comparison with Logistic Regression (14:47)
- Lab: Support Vector Machine for Classification (10:13)
- Lab: Nonlinear Support Vector Machine (7:54)
Chapter 10: Unsupervised Learning (slides, playlist)
- Unsupervised Learning and Principal Components Analysis (12:37)
- Exploring Principal Components Analysis and Proportion of Variance Explained (17:39)
- K-means Clustering (17:17)
- Hierarchical Clustering (14:45)
- Breast Cancer Example of Hierarchical Clustering (9:24)
- Lab: Principal Components Analysis (6:28)
- Lab: K-means Clustering (6:31)
- Lab: Hierarchical Clustering (6:33)
Interviews (playlist)
- Interview with John Chambers (10:20)
- Interview with Bradley Efron (12:08)
- Interview with Jerome Friedman (10:29)
- Interviews with statistics graduate students (7:44)"