Luca Brocca brought to my attention two papers on research reproducibility and the policy followed by the journal Nature.
They are the editorial "Code Share" and the paper "Open Code for Open Science" by Steve M. Easterbrook.
My reflections and notes about hydrology and being a hydrologist in academia. The daily evolution of my work. Especially for my students, but also for anyone with the patience to read them.
Friday, October 31, 2014
Thursday, October 30, 2014
Fifth Water Conference Selected Presentations on Impacts of Climate Change on Alpine Regions
As it does every two years, the Alpine Convention organised a "Water Conference" to assess the results of the Water Platform, which I had the honour to head in 2013-2014.
The fifth Water Conference was entitled: "Water in the Alps - and beyond; Adapting alpine and mountain river basins to climate change".
The event is going to be jointly promoted and developed by the Alpine Convention and the UNECE Water Convention, in order to favour the creation of synergies and the exchange of experiences (e.g. among the Alpine territory, the Carpathians, the Caucasus, Central Asia, ...).
The 5th Water Conference is intended to present to a wide audience of experts, administrators, practitioners and stakeholders the state of the art, the best practices and the main findings about adaptation to climate change in mountain transboundary river basins.
Different panels of experts will illustrate the main results of the last years of activities in the respective conventions.
Furthermore, updated high-level information on climate change and adaptation strategies will be provided, together with the results of some relevant projects of European Territorial Cooperation on the issues.
Finally, a special focus will be devoted to the implementation of flood management measures under the EU flood directive.
The whole set of presentations given during the conference can be found here. However, I would like to draw attention to a few outstanding ones, related to the impacts of climate change on the water cycle:
- M. Beniston, Changes in Alpine water resources: examples from the EU «ACQWA» Project^1
- A. Bellin, The impacts of climate change on water cycle
- S. Gruber, The changing mountain cryosphere: impacts on sediment and solute transport
- U. Morra di Cella, Permafrost in the Alps: the experience of PermaNET project “Long-term permafrost monitoring network”
- G. Rosatti, Modelling sediment transport in Alpine rivers subject to climate changes
^1 ACQWA Project
Wednesday, October 22, 2014
Breakthrough lectures at the University of Saskatchewan
A remarkable initiative has been launched at the University of Saskatchewan, under the impulse of Jeff McDonnell. He invited many top hydrological scientists to express their opinions and ideas about various topics in modern hydrologic research. Fortunately, the lectures are subsequently posted on a dedicated YouTube channel.
I have to say that I do not share all the ideas presented. However, they constitute a corpus that is useful to know.
The video lectures can be found here.
Thursday, October 9, 2014
A couple of new things from Hydrologis
My former students at Hydrologis, with whom I collaborated on the Horton Machine contained in the uDig Spatial Toolbox, recently came out with some good news.
The first is Stage, an application that makes the Spatial Toolbox available on its own while the migration of uDig to LocationTech is ongoing. It also offers a way to save and store Geopaparazzi projects on your personal computer. In the words of Hydrologis:
"The new Spatial Toolbox And Geoscripting Environment is a web application based on the RAP.
The RAP ecosystem exploits the Server-Side Equinox project, which integrates an OSGI engine with classic Servlet Techniques. The RAP framework allows for the development of web applications by means of the java language and supplying a subset of the Eclipse RCP libraries and plugins.
Basing on this technology it is possible to run S.T.A.G.E. both locally or remote and execute modules from the JGrasstools library as well as OMS3 annotated java classes. The modules are executed on the serverside and provide progress feedback to the user as the processing proceeds.
The user interface for the modules is generated on the fly from the code annotations.
S.T.A.G.E. opens possibilities for the execution of remote processes. Servlets can be added to execute modules or synchronize data with a central database instance. Modules can be executed via simple http POST requests, data can be analyzed and filtered from any connected device or platform.
The modular nature of the application makes it possible to simply enable functionalities by adding plugins to the installation. This allows for a great deal of customization of the application for the exact purpose of the project involved."
A more extensive description of what Stage does can be found here.
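The quoted description says that modules can be executed via simple HTTP POST requests. A minimal sketch of what building such a request could look like is given here. The server address, the `/modules/<name>` path, the JSON payload layout, the module name `ExampleModule` and the parameter `inRaster` are all assumptions made for illustration; they are not the documented S.T.A.G.E. interface, which should be checked in the Hydrologis documentation.

```python
import json
import urllib.request


def build_module_request(base_url, module_name, parameters):
    """Build (but do not send) an HTTP POST request asking a server to run a module.

    The "/modules/<name>" path and the JSON payload layout are hypothetical:
    they only illustrate the idea of triggering a module with a POST request.
    """
    url = f"{base_url.rstrip('/')}/modules/{module_name}"
    body = json.dumps({"parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address, module name and parameter name.
request = build_module_request(
    "http://localhost:8080/stage",
    "ExampleModule",
    {"inRaster": "dtm.asc"},
)
print(request.get_method(), request.full_url)
```

The request is only constructed here, so the sketch runs without a server; sending it would be a matter of passing it to `urllib.request.urlopen`.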
The second is Lesto, part of the work of Silvia Franceschi for her Ph.D. at the University of Bolzano: a set of tools for extracting features from LIDAR data. Information about Lesto can be found here.
For those interested in Geopaparazzi, the latest tutorial is here.
For more information, please contact info <at> hydrologis.com
Wednesday, October 8, 2014
CISLAM
CISLAM is the simplified hydrological model produced by Cristiano Lanni during his Ph.D., which was devoted to the study of landslide triggering. The theory behind the code is discussed in a Hydrological Processes paper. The original code was written in R, but Marco Foi ported it to JGrasstools during a fortunate Google Summer of Code.
You can find a jar file ready to be used within a hacked version of JGrasstools 0.7.7: Marco had to modify JGrasstools slightly to get it working. These changes have so far never been introduced in version 0.8; therefore, to use CISLAM, it is necessary to use this version of JGrasstools:
The tool has been tested on the data set that can be found here, the same one described in the manual. Further tests would indeed be necessary.
Friday, October 3, 2014
Naming things in hydrological models
Yesterday I met with Olaf David and Scott Peckham (update: Scott gave a presentation on BMI at the OMS3 summer school in 2016). Scott is a well-known scientist among both hydrological modellers and geomorphologists: among the former because of his recent work on the CSDMS project, in which he was one of the leading scientists (and his own model, Topoflow); among the latter thanks to his work on river network topology and the construction of Rivertools, one of the best suites of tools for watershed delineation and analysis.
We met out of friendship, simply to talk and exchange news about what we are doing, and the meeting, which ended in one (actually two) of the small breweries of Fort Collins, was a real success.
One of the things Scott has recently been pursuing is to understand what models contain. His approach is to categorise all the variables they use and to define them as clearly as possible. His efforts are well described here.
“CSDMS asks that contributed models should be provided with a Basic Model Interface (BMI) which includes mapping input and output variable names to CSDMS Standard Names and providing model metadata. … A good introduction to the CSDMS Standard Names is provided by Peckham (2014). A somewhat outdated, high-level overview of the CSDMS Standard Names is also available as a Powerpoint presentation.”
Scott and coworkers did not forget the parallel effort of netCDF with its CF convention, but he realised that its coverage of hydrology was poor, and he wanted to build the vocabulary from scratch. The effort is useful not only to his project, but also to other models and infrastructures. With our model GEOtop we started a parallel, and much more limited, work of identifying keywords related to hydrological quantities and to the control of the model’s workflow (see GEOtop’s manual), and I plan to provide soon a mapping between CSDMS names and GEOtop names (and I will repeat the operation in my lectures, modifying my slides).
Having a common vocabulary for identifying things in models would certainly make it easier to choose names for quantities (even if, clearly, internal variable names should be shorter for practical purposes) and to identify code chunks that treat the same phenomena. Searching for models on the web would also be easier with standard names as search terms.
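To make the idea concrete, here is a small sketch of how a shared vocabulary lets one match quantities between two models that use different short internal names. Both models and all their internal names are hypothetical, and the long names are only written in the style of the CSDMS Standard Names; they should be checked against the official list before being relied upon.

```python
# Two hypothetical models with their own short internal names, each mapped to
# a long standard-style name. All names here are illustrative assumptions.
MODEL_A = {
    "P": "atmosphere_water__precipitation_leq-volume_flux",
    "Ta": "atmosphere_bottom_air__temperature",
    "hs": "snowpack__depth",
}
MODEL_B = {
    "prec": "atmosphere_water__precipitation_leq-volume_flux",
    "snow_d": "snowpack__depth",
    "q_out": "channel_water_x-section__volume_flow_rate",
}


def match_by_standard_name(model_a, model_b):
    """Pair the internal names of two models that point to the same standard name."""
    inverse_b = {standard: internal for internal, standard in model_b.items()}
    return {
        standard: (internal_a, inverse_b[standard])
        for internal_a, standard in model_a.items()
        if standard in inverse_b
    }


matches = match_by_standard_name(MODEL_A, MODEL_B)
for standard, (name_a, name_b) in sorted(matches.items()):
    print(f"{standard}: {name_a} <-> {name_b}")
```

The internal names stay short, as practical code requires; only the mapping table carries the long, searchable names.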
Below is a brief description of Scott’s whole effort.
While it is always a good idea to use existing standards whenever possible, CSDMS discovered that other naming conventions, such as the CF Convention Standard Names were not well-suited to the needs of component-based modeling. This section explains our motivation for developing a new standard.
This section provides some background and basic information about the CSDMS Standard Names.
This section provides numerous examples of CSDMS Standard Names, organized by the main object under consideration and its parts or "subobjects".
The CSDMS Standard Names follow an object + quantity pattern with an optional operation prefix applied to the quantity part. This section provides the basic rules for constructing CSDMS Standard Names.
This section provides a set of templates and rules for constructing the object name part of a CSDMS Standard Name.
This section provides a set of templates and rules for constructing the quantity name part of a CSDMS Standard Name. Many quantity names include the name of a physical process, and information about constructing process names, along with numerous examples, is given on the CSDMS Process Names page.
This section provides a set of templates and rules for constructing the optional operation part of a CSDMS Standard Name.
This section provides information on CSDMS Model Coupling Metadata (MCM) files and provides standardized model/variable metadata names for units, ellipsoids, datums, projections, "how modeled" and assumptions. It links to an extensive set of CSDMS Assumption Names and includes An Example Model Coupling Metadata file.
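The object + quantity pattern with an optional operation prefix, described above, can be sketched as a small parser. The sketch assumes that the object and quantity parts are joined by a double underscore and that an operation is attached to the quantity with `_of_`, as I understand the CSDMS pages to describe; the function name and the example names are mine, and the rule "`_of_` always marks an operation" is a simplifying assumption.

```python
def split_standard_name(name):
    """Split 'object__quantity' and pull out an optional 'operation_of_' prefix.

    Assumptions: a double underscore separates object and quantity, and any
    '_of_' inside the quantity part marks an operation applied to the quantity.
    """
    object_part, separator, quantity_part = name.partition("__")
    if not separator or not object_part or not quantity_part:
        raise ValueError(f"expected 'object__quantity', got {name!r}")
    operation = None
    if "_of_" in quantity_part:
        # Chained operations, if any, are kept together in the operation part.
        operation, _, quantity_part = quantity_part.rpartition("_of_")
    return {"object": object_part, "operation": operation, "quantity": quantity_part}


# Illustrative names written in the CSDMS style.
for example in ("snowpack__depth", "land_surface__time_derivative_of_elevation"):
    print(split_standard_name(example))
```

A parser of this kind could be a first step towards checking that the keywords of a model follow the templates before mapping them to another vocabulary.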
Thursday, October 2, 2014
Machine Learning
I found this informative blog post on Elements of Statistical Learning, a fundamental book by Trevor Hastie and Rob Tibshirani.
I reproduce the blog post verbatim:
"In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as "machine learning"), largely due to the high quality of both the textbook and the video lectures. And as an R user, it was extremely helpful that they included R code to demonstrate most of the techniques described in the book.
If you are new to machine learning (and even if you are not an R user), I highly recommend reading ISLR from cover-to-cover to gain both a theoretical and practical understanding of many important methods for regression and classification. It is available as a free PDF download from the authors' website.
If you decide to attempt the exercises at the end of each chapter, there is a GitHub repository of solutions provided by students you can use to check your work.
As a supplement to the textbook, you may also want to watch the excellent course lecture videos (linked below), in which Dr. Hastie and Dr. Tibshirani discuss much of the material. In case you want to browse the lecture content, I've also linked to the PDF slides used in the videos.
Chapter 1: Introduction (slides, playlist)
- Opening Remarks and Examples (18:18)
- Supervised and Unsupervised Learning (12:12)
Chapter 2: Statistical Learning (slides, playlist)
- Statistical Learning and Regression (11:41)
- Curse of Dimensionality and Parametric Models (11:40)
- Assessing Model Accuracy and Bias-Variance Trade-off (10:04)
- Classification Problems and K-Nearest Neighbors (15:37)
- Lab: Introduction to R (14:12)
Chapter 3: Linear Regression (slides, playlist)
- Simple Linear Regression and Confidence Intervals (13:01)
- Hypothesis Testing (8:24)
- Multiple Linear Regression and Interpreting Regression Coefficients (15:38)
- Model Selection and Qualitative Predictors (14:51)
- Interactions and Nonlinearity (14:16)
- Lab: Linear Regression (22:10)
Chapter 4: Classification (slides, playlist)
- Introduction to Classification (10:25)
- Logistic Regression and Maximum Likelihood (9:07)
- Multivariate Logistic Regression and Confounding (9:53)
- Case-Control Sampling and Multiclass Logistic Regression (7:28)
- Linear Discriminant Analysis and Bayes Theorem (7:12)
- Univariate Linear Discriminant Analysis (7:37)
- Multivariate Linear Discriminant Analysis and ROC Curves (17:42)
- Quadratic Discriminant Analysis and Naive Bayes (10:07)
- Lab: Logistic Regression (10:14)
- Lab: Linear Discriminant Analysis (8:22)
- Lab: K-Nearest Neighbors (5:01)
Chapter 5: Resampling Methods (slides, playlist)
- Estimating Prediction Error and Validation Set Approach (14:01)
- K-fold Cross-Validation (13:33)
- Cross-Validation: The Right and Wrong Ways (10:07)
- The Bootstrap (11:29)
- More on the Bootstrap (14:35)
- Lab: Cross-Validation (11:21)
- Lab: The Bootstrap (7:40)
Chapter 6: Linear Model Selection and Regularization (slides, playlist)
- Linear Model Selection and Best Subset Selection (13:44)
- Forward Stepwise Selection (12:26)
- Backward Stepwise Selection (5:26)
- Estimating Test Error Using Mallow's Cp, AIC, BIC, Adjusted R-squared (14:06)
- Estimating Test Error Using Cross-Validation (8:43)
- Shrinkage Methods and Ridge Regression (12:37)
- The Lasso (15:21)
- Tuning Parameter Selection for Ridge Regression and Lasso (5:27)
- Dimension Reduction (4:45)
- Principal Components Regression and Partial Least Squares (15:48)
- Lab: Best Subset Selection (10:36)
- Lab: Forward Stepwise Selection and Model Selection Using Validation Set (10:32)
- Lab: Model Selection Using Cross-Validation (5:32)
- Lab: Ridge Regression and Lasso (16:34)
Chapter 7: Moving Beyond Linearity (slides, playlist)
- Polynomial Regression and Step Functions (14:59)
- Piecewise Polynomials and Splines (13:13)
- Smoothing Splines (10:10)
- Local Regression and Generalized Additive Models (10:45)
- Lab: Polynomials (21:11)
- Lab: Splines and Generalized Additive Models (12:15)
Chapter 8: Tree-Based Methods (slides, playlist)
- Decision Trees (14:37)
- Pruning a Decision Tree (11:45)
- Classification Trees and Comparison with Linear Models (11:00)
- Bootstrap Aggregation (Bagging) and Random Forests (13:45)
- Boosting and Variable Importance (12:03)
- Lab: Decision Trees (10:13)
- Lab: Random Forests and Boosting (15:35)
Chapter 9: Support Vector Machines (slides, playlist)
- Maximal Margin Classifier (11:35)
- Support Vector Classifier (8:04)
- Kernels and Support Vector Machines (15:04)
- Example and Comparison with Logistic Regression (14:47)
- Lab: Support Vector Machine for Classification (10:13)
- Lab: Nonlinear Support Vector Machine (7:54)
Chapter 10: Unsupervised Learning (slides, playlist)
- Unsupervised Learning and Principal Components Analysis (12:37)
- Exploring Principal Components Analysis and Proportion of Variance Explained (17:39)
- K-means Clustering (17:17)
- Hierarchical Clustering (14:45)
- Breast Cancer Example of Hierarchical Clustering (9:24)
- Lab: Principal Components Analysis (6:28)
- Lab: K-means Clustering (6:31)
- Lab: Hierarchical Clustering (6:33)
Interviews (playlist)
- Interview with John Chambers (10:20)
- Interview with Bradley Efron (12:08)
- Interview with Jerome Friedman (10:29)
- Interviews with statistics graduate students (7:44)"