Thursday, February 7, 2013

About doing a Ph.D.

Doing a Ph.D. is a totally absorbing activity, but it can be one of the most exciting periods of your life. However, I realize that students usually do not have a clear idea of what it is about.
Some time ago I wrote a brief post about the core of Ph.D. studies, which is well summarised by the picture below:
The figure and the idea are by Matt Might, whose blog is a nice reading experience (here is the full story).
Recently, I discovered that my colleague Davide Geneletti usually provides his students with a set of links to useful readings. In my best tradition as a robber, I post them here below with some additions, since I found them quite informative. They come from authors of various disciplines and roles, but their contents are of quite general interest. So here is the list:



Last but not least, he suggests reading this book, which I found amusing:

that more or less completes the picture. I would also take a look at these "slides offering advice that is wickedly and memorably to the point"

The web is full of other good links, and you can certainly find your own favorite web site or blog. One suggestion: do not linger too much on this stuff. In the end, you have a Ph.D. to pursue ;-)

Friday, February 1, 2013

Paraglacial geomorphology

We often talk about landscape evolution (and I have a little history on the subject). However, we usually forget that between 65 ky and 12 ky ago a lot of our Earth was covered by glaciers. The image below, robbed from a presentation by Mr. Kurt Werth at the "At North of Trento and South of Bolzano" (1,2,3) Meeting, illustrates the situation in the place where I live, the river Adige basin.
I do not know how precise the map is, but it clearly illustrates the general situation. Because of this, many of the geomorphic features we see nowadays were created by the glacier retreat and by the subsequent land sculpting. Alluvial fans (some hundreds of them) were formed later. Big rock avalanches (according also to isotopic measurements) crumbled down between 10 ky and 3 ky ago.
Therefore it is time that alpine geomorphology (and modelling) brought paraglacial situations within its horizon. Present-day landslides and sediment production are strongly affected by what the glaciers left.

Reference
Ballantyne, C.K. - Paraglacial geomorphology, Quaternary Science Reviews 21 (2002) 1935–2017




Tuesday, January 29, 2013

The law of small numbers

I knew the law of large numbers (and its violations), but I had never reflected on the law of small numbers.

You can learn about it by following this link. It is mostly about the Poisson distribution, which is indeed ubiquitous in Hydrology too. So this R-related post is certainly interesting and useful reading for us as well.
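
To make the idea concrete, here is a minimal sketch in Python (not the R code of the linked post). It assumes that the "law of small numbers" refers to the Poisson limit for rare events: a binomial count with many trials and a small success probability is close to a Poisson distribution with the same mean. The numbers (3650 days, 2 expected threshold exceedances) are invented for illustration only.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson law with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


# Hypothetical setting: n days, each with a small probability p that the
# discharge exceeds some threshold, so that n * p = 2 events are expected.
n = 3650
p = 2 / n
lam = n * p

# Print the two distributions side by side for small counts.
for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))

# Largest pointwise difference between the two laws for k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
print("largest gap:", max_gap)
```

The two columns should agree closely, which is why counts of rare events are so often modelled with a Poisson distribution.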


No code No paper

This is entirely from Simply Statistics » R, and I completely agree with it. It applies in the very same way to the hydrological literature.

"I think it has been beat to death that the incentives in academia lean heavily toward producing papers and less toward producing/maintaining software. There are people that are way, way more knowledgeable than me about building and maintaining software. For example, Titus Brown hit a lot of the key issues in his interview. The open source community is also filled with advocates and researchers who know way more about this than I do.


This post is more about my views on changing the perspective of code/software in the data analysis community. I have been frustrated often with statisticians and computer scientists who write papers where they develop new methods and seem to demonstrate that those methods blow away all their competitors. But then no software is available to actually test and see if that is true. Even worse, sometimes I just want to use their method to solve a problem in our pipeline, but I have to code it from scratch!

I have also had several cases where I emailed the authors for their software and they said it “wasn’t fit for distribution” or they “don’t have code” or the “code can only be run on our machines”. I totally understand the first and last, my code isn’t always pretty (I have zero formal training in computer science so messy code is actually the most likely scenario) but I always say, “I’ll take whatever you got and I’m willing to hack it out to make it work”. I often still am turned down.

So I have a new policy when evaluating CV’s of candidates for jobs, or when I’m reading a paper as a referee. If the paper is about a new statistical method or machine learning algorithm and there is no software available for that method – I simply mentally cross it off the CV. If I’m reading a data analysis and there isn’t code that reproduces their analysis – I mentally cross it off. In my mind, new methods/analyses without software are just vapor ware. Now, you’d definitely have to cross a few papers off my CV, based on this principle. I do that. But I’m trying really hard going forward to make sure nothing gets crossed off.

In a future post I’ll talk about the new issue I’m struggling with – maintaining all that software I’m creating."

Wednesday, January 23, 2013

Object Modelling System Resources

As readers know from previous posts, we (my collaborators, my students and I) use OMS3, in collaboration with OMS3 developer-in-chief Olaf David and others, and we will use it even more in the future, embedding in it all of our modelling efforts. Any involvement with OMS must start with browsing the OMS3 web site and the information available there (for instance, but not only, this).
To use it, first download the console, then read the installation notes and the console FAQ, which remain the main sources of information about the tool (we will provide a brief description of its use soon).

However, during the BioMA summer school, Olaf, Jim Ascough, Jack Carlson and Giuseppe Formetta presented some further material, which you can finally find below.

Jgrasstools uses OMS3 (even if in a version older than 3.1), and one can find relevant information by browsing their site as well.

Other examples of using the OMS3 console and scripting will follow soon.

Monday, January 21, 2013

PostgreSQL your data

Science is a matter of hypotheses and data. Hypotheses become formal models, and then you have to acquire data to prove (a big word indeed) them in the feeble light of statistics. At the beginning you start collecting data all over your hard disk (I assume that the data are digitised). After a few months you are submerged by them. You throw them away and start again from the beginning.
Fortunately some institutions store the data in databases, and the reboot is relatively easy. However, this covers only the primary data sets, and does not cover the data that you yourself produce by running your models and doing your inferences.
So, sooner or later, you have to face the reality that you should store your stuff in a more ordered way, and build your own database. This raises various questions. Is it really necessary to use database software (c'mon, learning another tool!)? Obviously not: a database, in its general meaning, can be just an ordered set of data. So you can just use your filesystem for it (I say it, but I do not really believe it). However, you then have to remember where the data are, and use the search utility of your operating system to find what you are looking for (assuming that you documented every step you made in a searchable way).
Databases help with that, and they usually offer a query language (typically SQL) that helps to find and select the data you need again and again. So, at a certain point, one has to take seriously the hypothesis of using a database.
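To give a flavour of what "find and select the data you need" means in practice, here is a minimal sketch of an SQL query run from Python. It uses Python's built-in sqlite3 module as a stand-in, so that it runs without any server; it is not PostgreSQL, although a query this simple should look much the same there. The table name, the column names and all the values are hypothetical and invented for illustration.

```python
import sqlite3

# In-memory database: a stand-in so the sketch runs without a server.
conn = sqlite3.connect(":memory:")

# Hypothetical table mixing measured data and model output.
conn.execute(
    "CREATE TABLE discharge (station TEXT, day TEXT, q_m3s REAL, source TEXT)"
)

# Invented values, for illustration only.
rows = [
    ("Trento", "2012-11-10", 310.0, "measured"),
    ("Trento", "2012-11-11", 1250.0, "measured"),
    ("Trento", "2012-11-11", 1180.0, "model_run_07"),
    ("Bolzano", "2012-11-11", 540.0, "measured"),
]
conn.executemany("INSERT INTO discharge VALUES (?, ?, ?, ?)", rows)

# Select only measured discharges above a threshold, largest first.
query = """
    SELECT station, day, q_m3s
    FROM discharge
    WHERE source = ? AND q_m3s > ?
    ORDER BY q_m3s DESC
"""
peaks = conn.execute(query, ("measured", 500.0)).fetchall()

for station, day, q in peaks:
    print(station, day, q)

conn.close()
```

The point of the sketch is that the same query can be repeated at any time, on any amount of data, without remembering in which folder each file was saved.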
Nowadays there exist many free and open source database solutions (besides, obviously, the commercial ones from Oracle, IBM and others). Among the most widespread I cite MySQL, PostgreSQL and H2. Each one is a valid choice, with different characteristics.
In recent years we focused our attention on PostgreSQL for its completeness and for having been the first to include a way to manage geographic (geometric) data such as shapefiles^*. This is actually done by a plugin called PostGIS, developed by the same Refractions guys who also promote uDig.
Alban De Lavenne, a Ph.D. student from Rennes Agricampus who spent a few months among us, gave a talk about the use he makes of PostgreSQL to support his research. His presentation is, as usual, on slideshare (I am working to provide the data to run his examples).

The first step is certainly to install PostGIS. The first time, I (being a Mac guy) used the Kyngofchaos instructions for installing Postgres. However, I noticed that nowadays there are various other possibilities, listed on the main PostgreSQL page.

Alban's instructions and suggestions come after the installation and cover some typical hydrological problems. For a complete understanding, the Tutorial at the PostgreSQL site can certainly help. Around the web, one can also find video tutorials, such as this one provided by David Fetter, or this comprehensive set of screencasts on iTunes U by Selena Deckelmann and others.

Obviously I am open to any contribution to improve this post.

^* - Recently PostgreSQL/PostGIS acquired the capability to store and manage "raster data" and images, which makes it even more appealing.

Monday, January 14, 2013

Alpine Convention

The Alpine Convention is an international treaty between the Alpine countries (Austria, France, Germany, Italy, Liechtenstein, Monaco, Slovenia and Switzerland) as well as the EU, aimed at promoting sustainable development in the Alpine area and at protecting the interests of the people living within it. It embraces the environmental, social, economic and cultural dimensions.
The following slides (in Italian), from a couple of seminars that Andrea Bianchini gave here in Trento last week, clarify the structure of the Convention and try to outline the way the Water Platform works. Here is the Wikipedia page.

For 2013 and 2014, under the Italian Presidency of the Convention, I will be the president of the Water Platform. The official mandate of the platform for 2013-2014 has already been agreed among the participants (you have a synthesis on slideshare): you can have it in Italian (an official version) and in English (an unofficial version, since English is not among the languages of the Alpine Regions).
The official website of the platform is here.