Showing posts with label Time series. Show all posts

Wednesday, August 11, 2021

Causal Inference and time series

In the last post, moved by the need to compare time series, I browsed the literature and my library of papers to find solutions (essentially, I tried to understand whether two time series are related by a time lag). In the search I found other things, and my reading list grew beyond the original scope. One direction, which I had actually explored before, is that of distinguishing causal connections from mere correlations. In my previous searches, I had been fascinated by the work of Judea Pearl and by the findings derived from it. Pearl's theory has been presented in various places, including the 2000 work and some books, which you can find in the references. His teachings were directly absorbed by Hannart and Naveau, themselves good statisticians working in climatology, who produced some papers (2016, 2017) using Pearl's theory and notation.


The idea can be generalized from two to multiple time series, as Eichler (2013) shows. Eichler is known to have produced such analyses already in 2003. A line of more recent papers on the topic is represented by the work of Jakob Runge (GS), who also has the merit of having implemented and shared his TiGraMITe package. He has also obtained a prestigious ERC research grant on this topic, called Causal Earth. On the concepts he wants to develop in the ERC project, he has given talks and produced many interesting papers, among them one in Nature Communications and a second in Science Advances (see below).

Because we like to do calculations, not just read or write abstractly, we can find relief in the Causality handbook, which can be a viable (open source) way to put into practice some of the ideas seen in the previous papers. A final, honorable mention goes also to the San Liang (2014) paper.

References

Dahlhaus, Rainer, and Michael Eichler. 2003. “Causality and Graphical Models in Time Series Analysis.” Oxford Statistical Science Series, 115–37.

Eichler, Michael. 2013. “Causal Inference with Multiple Time Series: Principles and Problems.” Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 371 (1997): 20110613.

Hannart, A., J. Pearl, F. E. L. Otto, P. Naveau, and M. Ghil. 2016. “Causal Counterfactual Theory for the Attribution of Weather and Climate-Related Events.” Bulletin of the American Meteorological Society 97 (1): 99–110.

Hannart, A., and P. Naveau. 2017. “Probabilities of Causation of Climate Change.” arXiv, December, 1–54.

Pearl, Judea. 2000. Causality: Models, Reasoning, and Inference. Cambridge, UK: Cambridge University Press.

Runge, Jakob, Sebastian Bathiany, Erik Bollt, Gustau Camps-Valls, Dim Coumou, Ethan Deyle, Clark Glymour, et al. 2019. “Inferring Causation from Time Series in Earth System Sciences.” Nature Communications 10 (1): 2553.

Runge, Jakob, Peer Nowack, Marlene Kretschmer, Seth Flaxman, and Dino Sejdinovic. 2019. “Detecting and Quantifying Causal Associations in Large Nonlinear Time Series Datasets.” Science Advances 5 (11): eaau4996.

San Liang, X. 2014. “Causality between Time Series.” arXiv [stat.ME]. http://arxiv.org/abs/1403.6496.

Relations between 2 time series (thinking of rainfall-runoff)

In investigating hydrological quantities, one interesting issue is to understand whether two time series are correlated and, especially, whether the correlation comes with a time lag and, if so, what that lag is. This is no different from many other analyses and, in fact, the tools developed for it are ubiquitous in science. Looking for how to correlate rainfall and discharge, I stumbled upon this ready-made post, "Four ways to quantify synchrony between time series data" by Jin Hyun Cheong, PhD.

The added value of this post is that the tools described are also available as open source Python scripts embedded in a Jupyter Notebook, so anybody can re-execute them easily and learn how they work. I believe that applying the notebook to your own data set will not be hassle-free. You will certainly also have to dig a little into the literature to make sense of what you are doing. However, it is a great starting point for those who need to cope with this type of analysis. Jin Hyun's material is available on OSF. Please, if you use it, cite it.
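To give a feeling for the simplest of these tools, here is a minimal sketch of a lagged (cross-)correlation in standard Python. It is not taken from Jin Hyun Cheong's notebook: the function names (`pearson`, `lagged_correlation`, `best_lag`) and the synthetic "rainfall" and "discharge" series are my own assumptions, made up only for illustration. The idea is to shift one series against the other and keep the shift that gives the highest Pearson correlation.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(rain, flow, max_lag):
    """Correlation between rain[t] and flow[t + lag], for lag = 0..max_lag."""
    n = len(rain)
    return {lag: pearson(rain[:n - lag], flow[lag:]) for lag in range(max_lag + 1)}

def best_lag(rain, flow, max_lag):
    """Lag (in time steps) at which the correlation is highest."""
    corr = lagged_correlation(rain, flow, max_lag)
    return max(corr, key=corr.get)

# Synthetic data (an assumption, not real measurements): the "discharge"
# is the "rainfall" delayed by 3 time steps, plus a little noise.
random.seed(0)
rain = [random.random() for _ in range(500)]
true_lag = 3
flow = [0.0] * true_lag + rain[:-true_lag]
flow = [q + 0.05 * random.random() for q in flow]

print(best_lag(rain, flow, max_lag=10))
```

With real rainfall and discharge the peak is much less sharp, because a catchment smooths and spreads the signal, so it is worth looking at the whole dictionary returned by `lagged_correlation` rather than only at the maximum.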

A second way to look at their relation is to use mutual information, a concept derived from Information Theory and defined through the Kullback-Leibler divergence (see also here), which you can find briefly illustrated in the Veyrat-Charvillon and Standaert (2009) paper cited below. Here is a notebook that teaches how to estimate it in Python using PyTorch. Here is a bottom-up calculation with standard Python.
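As a rough illustration of what such a bottom-up calculation can look like, here is a sketch of the simplest (histogram, or "plug-in") estimator of mutual information, written with the standard library only. It is not the code of the linked notebooks: the function name `mutual_information`, the choice of 8 equal-width bins, and the synthetic data are all assumptions of mine. The estimate is the Kullback-Leibler divergence between the empirical joint distribution of the binned values and the product of their marginals.

```python
import math
import random
from collections import Counter

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of the mutual information I(X;Y), in nats."""
    def discretize(v):
        # Map each value to the index of an equal-width bin.
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((a - lo) / width), bins - 1) for a in v]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(dx, dy))  # counts of (bin_x, bin_y) pairs
    px, py = Counter(dx), Counter(dy)  # marginal counts

    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written with counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi

# Synthetic data (an assumption): y depends strongly on x, z does not.
random.seed(1)
x = [random.gauss(0, 1) for _ in range(2000)]
y = [a + 0.1 * random.gauss(0, 1) for a in x]
z = [random.gauss(0, 1) for _ in range(2000)]

mi_dep = mutual_information(x, y)
mi_indep = mutual_information(x, z)
print(mi_dep, mi_indep)
```

Keep in mind that this estimator is biased upwards for short series and depends on the number of bins, so the value for the independent pair is small but not exactly zero. To look for a lag, the same function can be applied to shifted copies of the two series, as one does with the correlation.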

The time series analyses above are quite interesting because they can also suggest new types of comparison between measured and simulated time series, if you start to get bored with the standard indicators of goodness of fit, like the Kling-Gupta efficiency and the Nash-Sutcliffe efficiency.
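For reference, the two standard indicators just mentioned can be sketched in a few lines of standard Python. I assume here the usual definitions: the Nash-Sutcliffe efficiency as one minus the ratio between the sum of squared errors and the variance of the observations, and the Kling-Gupta efficiency in its original 2009 form, built from the correlation r, the ratio of standard deviations alpha and the ratio of means beta. The function names `nse` and `kge` and the small data set are made up for illustration.

```python
import math
import statistics

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is as good as the mean of obs."""
    mo = statistics.fmean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den

def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form): 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mo, ms = statistics.fmean(obs), statistics.fmean(sim)
    so, ss = statistics.pstdev(obs), statistics.pstdev(sim)
    # Linear correlation between observed and simulated values
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (len(obs) * so * ss)
    alpha = ss / so  # variability ratio
    beta = ms / mo   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Invented numbers, only to show the calls
obs = [1.0, 3.0, 2.0, 5.0, 4.0]
sim = [1.2, 2.7, 2.1, 4.6, 4.4]
print(nse(obs, sim), kge(obs, sim))
```

Both indicators collapse the whole comparison into a single number, which is exactly why a lag estimate or a mutual information value can add something: they tell you whether the simulation is wrong in timing or in the dependence structure, not just by how much.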

If your main focus is the rainfall-runoff time series relationship, a recent paper to mention is the one by Giani et al. (2021), listed in the References below. The work of Serinaldi and Kilsby (2013), which seems quite complicated (boring or interesting? I have not read it yet), also contains relevant information.

References

Giani, G., M. A. Rico‐Ramirez, and R. A. Woods. 2021. “A Practical, Objective, and Robust Technique to Directly Estimate Catchment Response Time.” Water Resources Research 57 (2). https://doi.org/10.1029/2020wr028201.  

Veyrat-Charvillon, Nicolas, and François-Xavier Standaert. 2009. “Mutual Information Analysis: How, When and Why?” In Cryptographic Hardware and Embedded Systems - CHES 2009, 429–43. Springer Berlin Heidelberg.

Serinaldi, Francesco, and Chris G. Kilsby. 2013. “The Intrinsic Dependence Structure of Peak, Volume, Duration, and Average Intensity of Hyetographs and Hydrographs.” Water Resources Research 49 (6): 3423–42.