Tuesday, May 29, 2018

Do not do statistics if you do not have causal effects in mind

Statistics was long believed, following the masters of the last century, to be the science of correlation, not of causation. However, it is clear to contemporary researchers, or at least to some of them, that interpreting data without any hypothesis about causation can lead to wrong conclusions. Below is an example from "The Book of Why: The New Science of Cause and Effect" by Judea Pearl and Dana Mackenzie.
You should first look at the right figure. The scatterplot shows a roughly linear relation between exercise and cholesterol in the blood. A first observation: set up this way, we would probably have to reverse the axes. Under a causal interpretation, exercise cannot cause cholesterol; rather, the presence of cholesterol would push the subjects to exercise more. Otherwise there is something strange in the data: by our normal beliefs, more exercise cannot cause more cholesterol.
However, this is not even the main point. What the right figure suggests is a positive correlation between the two variables: more cholesterol goes with more exercise. However, as the left figure reveals, that is not the real situation. Because age is a cause of cholesterol, it is reasonable to include this variable in the analysis too. When we separate the data into age groups, we see further structure in the data: within each age class, the correlation between exercise and cholesterol is in fact reversed. The less you exercise, the higher your cholesterol. At the same time, the younger you are, the less cholesterol you are expected to have in your blood. Now the picture is coherent with our causal expectations. I think there is something to learn here. More technical readers can have a look at: Causal Inference in Statistics.
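The reversal described above can be reproduced with a small synthetic example. What follows is only a sketch under stated assumptions: the numbers are invented for illustration (they are not the data from the book), and the age groups, coefficients, and noise levels are arbitrary choices. Older groups are assumed to exercise more and to have higher baseline cholesterol, while within each group more exercise lowers cholesterol.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: every number below is an assumption made for illustration.
rng = random.Random(0)
data = {}
for age in (20, 30, 40, 50):
    exercise, cholesterol = [], []
    for k in range(10):
        extra = 0.2 * k  # exercise above the typical level of the age group
        # Older groups exercise more on average (age / 10) ...
        exercise.append(age / 10 + extra + rng.gauss(0, 0.05))
        # ... and have higher baseline cholesterol (3 * age),
        # but within a group more exercise lowers cholesterol.
        cholesterol.append(100 + 3 * age - 10 * extra + rng.gauss(0, 1.0))
    data[age] = (exercise, cholesterol)

# Correlation when age is ignored (all groups pooled together).
all_exercise = [e for ex, _ in data.values() for e in ex]
all_cholesterol = [c for _, ch in data.values() for c in ch]
pooled_r = pearson(all_exercise, all_cholesterol)

# Correlation inside each age group.
within_r = {age: pearson(ex, ch) for age, (ex, ch) in data.items()}

print(f"pooled correlation: {pooled_r:+.2f}")
for age, r in within_r.items():
    print(f"age {age}: correlation {r:+.2f}")
```

With these assumptions the pooled correlation comes out positive while the correlation inside every age group is negative, which is the pattern the two figures illustrate.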

Wednesday, May 23, 2018

Sunday, May 20, 2018

Graphs, Correlation and Causality

A week ago I started to read "The Book of Why: The New Science of Cause and Effect" by Judea Pearl (see also his website). This is part of my search for new mathematics for describing entangled hydrological and ecological processes (see also this post and links therein). It is a popular science book and is not intended to get too technical. However, there comes a moment when understanding the technicalities becomes part of full understanding, if you do not want to remain a tourist of the new knowledge but want to become an active user of it. Pearl says he dedicated most of his research life to these problems and, therefore, pretending to fill the gap in a few weeks is nonsense.
I need more time to go deeper, but at present I have none. Therefore, let me take some notes here to make future efforts easier.
 
For the details, one can turn to the more technical book by Pearl himself, Causality. However, I happened to browse some chapters of Shalizi's very good book on statistics. The chapters from the 20th onward are a reasonable starting point. Shalizi's book itself does not explain everything in full; it is a compromise in which some theorems are not proved but assumed and stated explicitly. It is nice enough, but in any case it requires appropriate dedication. Shalizi seems to be a voracious reader, and in the bibliography of his chapter 21 he cites some fundamental works to line up for a full understanding of the topic. His subsequent chapters also broaden the view to information theory and its connections with the science of causal statistics. Cool. While postponing the study and trying to grasp the concepts, I report below the full bibliography I came across (mostly from Shalizi's).
All of these are also good reading for those who believe that data science is a practice that springs out of nowhere.


References


  • Chalak, K., & White, H. (2012). Causality, Conditional Independence, and Graphical Separation in Settable Systems. Neural Computation, 1–60.
  • Cover, Thomas M. and Joy A. Thomas (2006). Elements of Information Theory. New York: John Wiley, 2nd edn.
  • Dinno, A. (2017). An Introduction to the Loop Analysis of Qualitatively Specified Complex Causal Systems (pp. 1–23).
  • Guttorp, Peter (1995). Stochastic Modeling of Scientific Data. London: Chapman and Hall.
  • Jordan, Michael I. (ed.) (1998). Learning in Graphical Models. Dordrecht: Kluwer Academic.
  • Kindermann, Ross and J. Laurie Snell (1980). Markov Random Fields and their Applications. Providence, Rhode Island: American Mathematical Society. URL http://www.ams.org/online_bks/conm1/
  • Lauritzen, S.L., Dawid, A.P., Larsen, B.N., Leimer, H.G. (1990), Independence properties of directed Markov fields, Networks, 20, 491-505
  • Lauritzen, S.L. (1996) Graphical Models. New York: Oxford University Press.
  • Loehlin, John C. (1992). Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis. Hillsdale, New Jersey: Lawrence Erlbaum Associates, 2nd edn.
  • Moran, P. A. P. (1961). “Path Coefficients Reconsidered.” Australian Journal of Statistics, 3: 87–93. doi:10.1111/j.1467-842X.1961.tb00314.x.
  • Pearl, J. (2000). Causality: Models, Reasoning, and Inference (pp. 1–386). Cambridge University Press.
  • Wright, S. (1934). The Method of Path Coefficients. Annals of Mathematical Statistics, 5: 161–215.
  • Wysocki, W. (1992). “Mathematical Foundations of Multivariate Path Analysis.” Inventiones Mathematicae, 21: 387–397. URL https://eudml.org/doc/263277

Thursday, May 17, 2018

A little on soil physics

These are the lectures I gave this week on soil and soil water to my hydrology class.

Soils

Texture and structure of soils

Definitions

Darcy and Buckingham laws

Soil Water retention curves
Hydraulic conductivity
Hydraulic conductivity at saturation
Richards equation (first part)
Solving Richards equation
Richards equation 1d
Information about solving Richards equation with GEOframe tools



Saturday, May 12, 2018

Aqueducts YouTube Videos

Here are the videos of the aqueducts lectures.

Generalities

Distribution Network Topologies

Distribution Network Equations
Design requirements


Verification of the design
To sum up

Tuesday, May 8, 2018

Adige's Research and publications

This post is meant to collect research work and studies about the river Adige as they come to my attention. Please help me find them.


Papers

Master Thesis
Ph.D. Thesis
Books

Saturday, May 5, 2018

GEOframe documentation standards

We announce that, in parallel with this blog, an OSF project called GEOframe has just been opened; it contains the complete documentation of the various components developed. Information will, however, continue to appear here too.
GEOframe is already a big tree and it will grow more and more. Click on the figure to access the documentation site. Each OMS component will have its OMS subproject, and the subproject itself contains as standard:
  • The link to the component source code (the URL of the GitHub site where developers and programmers can download the code)
  • GitHub executable with examples (the GitHub - GeoframeComponents site from which to download a working example including the executables)
  • A link to the component's documentation
  • Jupyter Notebooks illustrating the examples' I/O
  • R: not available yet, but the same as above in R
  • GEOframe blog page: points to the geoframe.blogspot.com page where the component is further documented (essentially, this information should be the one summarised in the Wiki page of the OSF component's page)