The zeroth lecture of my hydrology class. The whole picture of the class is here instead.
My reflections and notes about hydrology and being a hydrologist in academia. The daily evolution of my work. Especially for my students, but also for anyone with the patience to read them.
Tuesday, February 27, 2018
Monday, February 26, 2018
Python general resources (for Hydrologists) to start with
There are a lot of resources for starting with Python for hydrologists; here I tried to put together, at least as a beginning, a list of readings to become quickly operative.
- A nice and free general introduction is A Whirlwind Tour of Python, available either as a book or as a collection of Jupyter Notebooks. For further reading, and for going deeper into some of the main Python libraries, the same author published the Python Data Science Handbook. Unfortunately, this latter book is not free.
- Jupyter notebooks are a splendid way to organise calculations: you first have to learn how to use them (here is a manual: it contains installation instructions too).
- Lectures on scientific computing with Python by J.R. Johansson cover the main topics very nicely. The first four are of more general interest:
- Lecture-0 Scientific Computing with Python (This is a general lecture)
- Lecture-1 Introduction to Python Programming (Just basic notions on programming)
- Lecture-2 Numpy - multidimensional data arrays (Array and vectors)
- Lecture-3 Scipy - Library of scientific algorithms (Doing science with Python)
- Lecture-4 Matplotlib - 2D and 3D plotting (No plots, no science: quite a general introduction)
- For Italians, my own introductory lectures also have their place, I believe, because I used Jupyter notebooks (and Python 3) to convey previous work by Joseph Eschgfaeller (translated from Python 2.7).
- Una introduzione gentile al Python scripting (mostly a translation from JE lectures)
- Esperimenti nella lettura di un file
- Leggere un file con PANDAS (e plot dei dati con Matplotlib)
- Scipy Lecture Notes are a good (though not necessarily quick) starting point. The HTML version supports hyperlinks that the PDF one does not. You can download the chapters as PDFs and, at their end, the Jupyter notebooks with the exercises. You can also find part of the exercises here below.
- From Wes McKinney (creator of Pandas) Python for data analysis book:
- Chapter 2: Python Language Basics, IPython, and Jupyter Notebooks
- Chapter 3: Built-in Data Structures, Functions, and Files
- Chapter 4: NumPy Basics: Arrays and Vectorized Computation
- Chapter 5: Getting Started with pandas
- Chapter 6: Data Loading, Storage, and File Formats
- Chapter 7: Data Cleaning and Preparation
- Chapter 8: Data Wrangling: Join, Combine, and Reshape
- Chapter 9: Plotting and Visualization
- Chapter 10: Data Aggregation and Group Operations
- Chapter 11: Time Series
- Chapter 12: Advanced pandas
- Chapter 13: Introduction to Modeling Libraries in Python
- Chapter 14: Data Analysis Examples
- Appendix A: Advanced NumPy
- Kevin Sheppard's introduction to statistical analysis with Python is also a manuscript worth reading.
Other resources can be:
- The main NumPy and SciPy documentation.
- Python Scientific Lecture Notes: a comprehensive set of tutorials on the scientific Python ecosystem.
- Software Carpentry is an open source course on basic software development skills for people with backgrounds in science, engineering, and medicine.
- Introduction to Statistics: an introduction to the basic statistical concepts, combined with a complete set of application examples for statistical data analysis with Python (by T. Haslwanter).
- They suggest Dive into Python (EN, IT) as a good starting book, and I think it is.
- Python Crash Course is also a traditional type of book that covers all the traditional elements of programming. It is certainly a good book, but I would say that its approach does not quite match the modern "hip" approach a scientist takes when using Python with higher-level infrastructures such as Pandas (and, partially, NumPy and SciPy), which, for instance, often do not require explicit iteration with "for" loops.
- Think Python (EN, IT) is also an alternative to the previous books.
- A little different is Thinking in Python, because it is oriented towards introducing topics of object-oriented (OO) programming which are not usually covered in elementary programming books, even if OO, at its core, amounts to getting used to design patterns.
- Python in Hydrology
- Python programming guide for Earth Scientists
- A hands-on introduction to using Python in the Atmospheric and Oceanic sciences
with a preference for the first one.
- Soil Physics with Python: Transport in the Soil-Plant-Atmosphere System, by Bittelli et al., is a book on soil science which is quite appealing (as seen from the TOC): the Kindle version costs a reasonable amount, but it is in Python 2.7. Its Python programs are available here.
Tuesday, February 20, 2018
Costruzioni Idrauliche 2018
This is the material for the 2018 class in Hydraulic Constructions at the University of Trento. The material is being revised during the class and is similar to last year's. A first difference is that slides will be loaded into an Open Science Framework (OSF) repository; more information is in the introductory class below. The name of the class is Hydraulic Constructions; actually, it covers the hydraulic design of a storm water management system and the hydraulic design of an aqueduct. By hydraulic design, I mean that the class teaches how to calculate the hydraulics of the infrastructure. It will not teach anything else, and neither will it cover how to actually draw the infrastructure to produce a final executive project. That knowledge is conveyed in the class called "Progetto di Costruzioni Idrauliche".
T - is a classroom lecture
L - is a laboratory lecture
Storm water management systems
February
- 2018-02-26
- Introductory Class. (YouTube 2018. Unfortunately not very visible)
- New goals for Urban hydrology. (YouTube 2018. Unfortunately not very visible)
- The sewage systems devices (YouTube 2018. Unfortunately not very visible)
- Local control on the hydrological cycle (YouTube 2018. Unfortunately not very visible)
- The law of public works
- The past and the future
- Further readings (optional)
- Using OSF
March
- 2018-03-01 - L - Introduction to Python with Jupyter (we mostly use the three notebooks below):
- Una introduzione gentile al Python scripting (mostly a translation from JE lectures)
- Leggere file da un file Excel e fare qualche grafico
- Leggere un file con PANDAS (e plot dei dati con Matplotlib)
- Further resources are available in this other post.
- 2018-03-05 - T -
- A little of Statistics and Probability.
- Descriptive Statistics
- Statistics' statistics
- Further readings on Statistics (see also here)
- Probability's Axioms
- Distributions
- Further readings on Probability
- YouTube videos 2018
- 2018-03-08 - L - Explorative data analysis and simple statistics with Python's Pandas
- Gaussian Distribution
- Central Limit Theorem Illustrated (and some other distributions)
- Gamma Distribution (and some other further Python)
- 2018-03-12 - T - Statistical properties of Extreme precipitations and their interpolation.
- Intensity duration frequency curves
- Gumbel distribution functions
- Moments method
- Maximum likelihood method
- Minimum squares method
- The YouTube video of the class 2018.
- Further readings ( see the Precipitations' post)
- 2018-03-15 - L - Estimation of Extremes with Python
- Exercise: Using the examples of the last Lab, draw a Gumbel distribution.
- Exercise: Select randomly from a Gumbel distribution and empirically shows the effects of the central limit theorem.
- For estimating the Gumbel distribution parameters:
- The YouTube video of the class 2018.
- 2018-03-19 - T - Elements for the design of storm water management infrastructures - I.
- The storm water drainage network (SWDN)
- Urban cases
- The estimation of the flood wave (the IUH case) and the Hydraulic design of the SWDN
- YouTube Videos
- 2018-03-22 - L - Estimation of Extremes with Python - II
- 2018-03-26 - T - Elements for the design of storm water infrastructures
April
- 2018-04-05 - L - Short introduction to QGIS for representing urban infrastructures.
- Introduction to QGIS by Elisa Stella and Daniele Dalla Torre. For other information about tools installation, see the bottom of the page.
- Using QGIS and GISWater to feed SWMM
- Material/Data of the lab
- Further Video classes on QGIS/GISWATER
- Create subcatchments
- Homogenize the subcatchment layer and insert it in the database
- Geometric information subcatchments
- Average Slope
- Impervious
- Simp, Routeto
- https://www.youtube.com/watch?v=HJIGH-ndcJ4
- 2018-04-09 - T - Pumping storm waters.
- 2018-04-12 - L - Working with SWMM for implementing a storm water management system
- 2018-04-16 - T - Pipes (slides from Roberto Magini) and infrastructures
- 2018-04-19 - L - Simple estimation of the maximum discharge via Python
- Maximum Discharge from a linear reservoir model
- Internal diameter of the pipe
- Summing discharges from pipes (I)
- Summing discharges from pipes (II)
- 2018-04-24 - Midterm written exam
- Questions to be answered. Students will be required to answer two (2) randomly chosen questions and (3) to produce a code snippet in Python performing the required task.
Aqueducts
As a general, simple and descriptive reference, the first six chapters of Maurizio Leopardi's book can be useful:
- Utilizzo Idropotabile
- Il trasporto in pressione
- Dimensionamento idraulico delle condotte
- Acquedotto con sollevamento meccanico
- Serbatoi
- Reti di distribuzione
The state of the water supply in Italy is summarised here (Corriere della Sera, 2018-05-16)
May
- 2018-05-03 - L - SWMM and Python problem solving
- 2018-05-07 - T - Aqueducts
- Aqueducts in 2020 (YouTube)
- Equations for aqueducts networks (YouTube)
- Mass and energy Conservation (YouTube)
- 2018-05-10 - L - Working with SWMM and Python
- 2018-05-14 T - Design of aqueducts networks
- 2018-05-17 L - Introduction to EPANET
- 2018-05-21 T - External aqueducts & Reservoirs
- External aqueducts (YouTube)
- Reservoirs (YouTube)
- Contemporary problems in aqueduct's network management
- 2018-05-24 L - A simple aqueduct configuration with EPANET
- 2018-05-28 - T - Water Sources
- Springs (YouTube)
- Wells (YouTube2018)
- Excavations (optional)
- Surface Waters
- Generalities
- Small Intake (YouTube2018), Grid (Optional)
- Channel (Optional)
- Sand Remover (Optional)
- 2018-05-31 - L - Working with EPANET to design an aqueduct
June
- 2018-06-04 - T -
- 2018-06-11 - L - Problem solving with EPANET
During the class I will introduce several tools for calculations.
- Open Science Framework (OSF): a tool for sharing a workflow, especially designed for scientific purposes. It allows storage of files and documents and their selective publication on the web.
- Python - Python is a modern programming language. It will be used for data treatment, estimation of the IDF curves of precipitation, some hydraulic calculations, and data visualisation. I will use Python mostly as a scripting language to bind and use existing tools.
- SWMM - an acronym for Storm Water Management Model. Essentially it is a model for the estimation of runoff, adapted to the urban environment. I do not much endorse its hydrology; however, it is the tool most used by colleagues who care about storm water management, so I adopt it. It is not a tool for designing storm water networks, and therefore some more work has to be done with Python to fill the gaps.
- EPANET - the tool developed by the EPA to model water distribution networks.
- GISWATER: http://growworkinghard.altervista.org/giswater-11-install-windows/
- QGIS: http://growworkinghard.altervista.org/qgis-2-18-how-to-install-step-by-step-on-windows/
Some Examples of presentations on the projects of this class:
Questions for the midterm exam
Questions (domande) of the 2017 intermediate exam.
Grades (voti) of the intermediate exam.
Sunday, February 18, 2018
Conoscere, comunicare, gestire il rischio idrogeologico in ambiente montano: 3/3
On February 15, 2018, we had the second day of the workshop "Know, communicate and manage the hydrological and geological risk in mountain environment", originally intended for journalists, technicians, and politicians. It was held at the Department of Civil, Environmental and Mechanical Engineering of the University of Trento. This page reports on the second day.
The first part of the workshop can be found in two previous posts:
- Introduction
- About Italian legislation on hazards by Eugenio Caliceti
- Planning Emergencies after natural hazards by Marta Martinengo
- The hazard map of Trentino Province by Mauro Zambotto
- Discussion after dott. Zambotto presentation
- Comments on work groups simulations by Rocco Scolozzi
Conoscere, comunicare, gestire il rischio idrogeologico in Ambiente montano: 2/3
On February 7 and 15, 2018, a workshop entitled "Know, communicate and manage the hydrological and geological risk in mountain environment" was organised, originally intended for journalists, technicians, and politicians. It was held at the Department of Civil, Environmental and Mechanical Engineering of the University of Trento. The morning talks are in another post.
The afternoon was dedicated to a conference open to the wider public. It covered the topics of which you see the YouTube videos below.
Two other pages report on the conference:
- The Experience of major Ugo Grisenti after the Campolongo event (August 15, 2010)
- Some images from Campolongo
- Information (about hazards) from the institutional channel by Giampaolo Pedrotti
- Andrea Selva on the information on natural hazards (from the point of view of a local newspaper)
- The judge and hazards: the experience of Carlo Ancona
Friday, February 16, 2018
Conoscere, comunicare, gestire il rischio idrogeologico in Ambiente montano 1/3
On Wednesday 7 and Thursday 15 February 2018 we held at our Department (DICAM) a workshop entitled "Know, communicate and manage hydro-geological risks in mountain environments". This workshop was one of the events of the Life FRANCA project. Please find below, in Italian, the YouTube videos of the talks.
The days were split across two other pages besides this one, which covers the first morning:
First day (February 7, 2018)
- Introduction to the workshop by Luigi Fraccarollo
- Introduction to Life FRANCA by Rocco Scolozzi
- A review on hydrological hazards for non specialists by Riccardo Rigon
- A little of discussion
- What is "hazard" by Giorgio Rosatti
Monday, January 29, 2018
Grids - Notes for an implementation
This post talks about the same subject already analyzed in a previous post, but from a slightly different point of view, hoping to add clarity to the concepts. We assume we already have the grid delineated, for instance the one in the Figure: some other program, or someone else, provided it to us. All the information is written in a file, maybe in a redundant form, but it is there and we just have to read it.
Assume we are talking about a three-dimensional grid. Nodes, edges, faces, and volumes are identified by a number (key, label) specified in the grid's file.
Therefore the problem is to read this file and implement the right (Java) structures/objects to contain it, keeping in mind that our goal, besides loading the data into memory, is to estimate the time marching of a variable $A$ (and maybe some other variables) in a given volume. Its time variation depends on fluxes of the same quantity (mass, to simplify) that are localised at the faces that constitute the boundary of the volume.
Getting things done
The simplest thing to do, then, is to associate a vector whose entries are the values of $A$ for each of the volumes in the grid. Let us say that, forgetting any problem connected with execution speed, caching [1], or boxing/unboxing of variables, we use a HashMap to represent these values.
We will also use a HashMap to contain the fluxes at each face. This HashMap contains $F$ elements: as many as the number of faces. The file from which we started contains all of this information, and therefore we have no problem building and filling these "vectors".
Let's take a look at what the system we have to solve can look like. The problem changes from case to case but, schematically, it could be:
For any volume (we omit the index, $i$ of the volume for simplicity):
$$ A^t = A^{t-1} + \sum_l a_l^t \, i_l^t \, f_l^t / d_l $$
Where:
$t$ is the time (discretized in steps) and $t-1$ is the previous step;
$l$ is the index of the faces belonging to the volume;
$d_l$ is the distance between the centroids of the two volumes that share the same face;
$i_l$ is a sign, +1 or -1, which depends on the volume and the face we are considering (volume index omitted);
$a_l$ is the area of the face $l$, or some function of it.
For generality, the r.h.s. of the equation is evaluated at time $t$, i.e. the equation is assumed to be implicit, but at a certain point of the resolution algorithm the function will be expressed as depending on some previous time (even if only from the point of view of internal iterations). For a more detailed case than this simplified scheme see, for instance, [2].
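As a minimal sketch of how this equation could be marched with the HashMap containers discussed in this post, here is an explicit (forward-in-time) version of the update; all class, method, and variable names are illustrative, not an established API, and the implicit iteration is replaced by a plain forward step:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal, explicit sketch of the template equation:
// for each volume, A^t = A^{t-1} + sum_l a_l * i_l * f_l / d_l,
// where l runs over the faces of the volume. Names are illustrative.
public class ExplicitStep {

    public static Map<Integer, Double> step(
            Map<Integer, Double> A,                     // A^{t-1}, one entry per volume
            Map<Integer, int[]> volumeFaces,            // topology: volume -> keys of its faces
            Map<Integer, Double> area,                  // face areas a_l
            Map<Integer, Double> flux,                  // fluxes f_l at faces
            Map<Integer, Double> dist,                  // centroid distances d_l
            Map<Integer, Map<Integer, Integer>> sign) { // signs i_l, per volume and face
        Map<Integer, Double> Anew = new HashMap<>();
        for (Map.Entry<Integer, Double> e : A.entrySet()) {
            int v = e.getKey();
            double sum = 0.0;
            for (int f : volumeFaces.get(v)) {
                // one term of the sum over the faces l of volume v
                sum += area.get(f) * sign.get(v).get(f) * flux.get(f) / dist.get(f);
            }
            Anew.put(v, e.getValue() + sum);
        }
        return Anew;
    }
}
```

Because the sign $i_l$ is opposite for the two volumes sharing a face, whatever leaves one volume enters the other, and the sum of $A$ over the grid is conserved.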
The HashMap of $A$ contains the information about the number of volumes, i.e., $V$. What it does not contain is:
(I) an indication of the faces belonging to each volume, and
(II) the information about which volumes are adjacent.
To obtain this, we have to store information about the topology of our grid. In the previous posts we tried to investigate and answer the question: which is the most convenient way to store this information? (Mostly from a conceptual point of view rather than a practical one.)
From our previous analysis, we know that, for encoding the number of faces of each volume, we have to introduce a second (2) container with as many positions as the number of volumes and, for each volume, a variable number of slots, one for each face of that volume (if the grid is composed of volumes of the same shape, the latter number of slots is constant for the internal elements of the grid and variable just for the boundary volumes).
In this preliminary analysis, a HashMap seems appropriate to contain this information, leaving, for the moment, unspecified what types of objects this topology HashMap contains; eventually, they will contain a key or a number which identifies a given face in a unique way.
In this way the information about any internal face is present in two slots, belonging to the two volumes that share that face.
We then have various quantities to store for each face:
- $a_l$ (3)
- $f_l$ (4)
- $d_l$ (5)
Each of the above quantities requires a container with as many elements as the faces. We could then use three HashMaps whose indexes (keys) coincide with the numbers (keys) that, in the topology HashMap (2), relate faces to volumes.
To evaluate our equation we then need five containers, of which the topology one has a structure to be specified later. Well, actually, all the HashMap internals have to be specified.
The elements of $a$ and $d$ are geometrical quantities that can, and have to, be specified outside the temporal cycle, if the grid structure is not modified during the computation. However, to be estimated they require further topological information that we still do not have (but which can be in the grid file).
To estimate the faces' areas, we need to know the nodes of the grid [3], which can be a sixth (6) container, and the way they are arranged in the faces, which is a seventh (7) container. Given the choices we made, we still choose HashMaps to contain them. The HashMap of nodes just contains the number (or the key) of the nodes (and is, maybe, in most problems, pleonastic). The HashMap of faces needs to contain the arrangement of nodes, ordered in one of the two directions (left-hand rule or right-hand rule; clockwise and counterclockwise depend on the side from which you observe the face, so what is clockwise for one volume is counterclockwise for the other).
The (7) container has to have as many elements as the faces, and each element contains the ordered nodes (a link, or reference, to them). To actually estimate the area of the faces we need the geometry of the nodes, meaning their coordinates in some coordinate system. Usually, in most approaches, nodes are directly identified by their coordinates, which are therefore inserted directly (in the appropriate way) in container (7) instead of the link/reference to the nodes' number (key, label).
However, I think that keeping the geometry separated from the topology could be useful, because topology has its own scope, for instance in guiding the iteration in the summation that appears in our template equation.
Therefore we need a further container (the eighth, 8) for the geometry, containing the coordinates of the points. This container has $N$ elements, as many as the nodes.
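To make the interplay between containers (7) and (8) concrete, here is a sketch of how the ordered node keys of a face, combined with the node coordinates, could fill the areas container (3) for a planar polygonal face, using the cross-product (Newell) formula; the class and method names are illustrative assumptions:

```java
import java.util.Map;

// Sketch: the area of a planar polygonal face, given the ordered node keys
// (container 7) and the node coordinates (container 8), via Newell's formula:
// half the norm of the sum of consecutive cross products. Names are illustrative.
public class FaceGeometry {

    public static double area(int[] orderedNodes, Map<Integer, double[]> coords) {
        double nx = 0.0, ny = 0.0, nz = 0.0;
        int m = orderedNodes.length;
        for (int k = 0; k < m; k++) {
            double[] p = coords.get(orderedNodes[k]);
            double[] q = coords.get(orderedNodes[(k + 1) % m]); // next node, wrapping around
            nx += p[1] * q[2] - p[2] * q[1];
            ny += p[2] * q[0] - p[0] * q[2];
            nz += p[0] * q[1] - p[1] * q[0];
        }
        return 0.5 * Math.sqrt(nx * nx + ny * ny + nz * nz);
    }
}
```

Note that the area does not depend on the orientation chosen for the nodes, while the direction of the resulting normal vector does, consistently with the clockwise/counterclockwise remark above.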
The container of distances, $d$, needs, to be filled, to know between which volumes the distances have to be calculated. This information, about volume adjacency, needs another container (the ninth, 9) with as many elements as the faces, i.e. with $F$ elements. Every element, in turn, must contain the indexes of the two volumes between which the distance is estimated.
The information that goes into container (9) should already be in the file from which we are reading everything. Otherwise, we could recover it by scanning all the volumes and finding which have a face in common. The latter is a calculation that can be made off-line, and we can, in any case, consider it acquired.
At this stage, we do not have much information about $f_l$. Certainly it will need to know which volumes are adjacent, and therefore requires the knowledge in container (9). Because $f_l$ is time-varying, the information in (9) has to be maintained all along the simulation.
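Container (9) and the volume centroids are enough to fill the distances container (5). A sketch, with illustrative names, assuming every face in (9) is internal (boundary faces, which have a single adjacent volume, are left out of the sketch):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: filling the distances container (5) from container (9), which maps
// each face key to the pair of volume keys it separates, plus the volume
// centroid coordinates. All names are illustrative.
public class Distances {

    public static Map<Integer, Double> fill(
            Map<Integer, int[]> faceVolumes,    // container (9): face -> {volumeA, volumeB}
            Map<Integer, double[]> centroids) { // volume -> centroid coordinates {x, y, z}
        Map<Integer, Double> d = new HashMap<>();
        for (Map.Entry<Integer, int[]> e : faceVolumes.entrySet()) {
            double[] c1 = centroids.get(e.getValue()[0]);
            double[] c2 = centroids.get(e.getValue()[1]);
            // Euclidean distance between the two centroids
            double dx = c1[0] - c2[0], dy = c1[1] - c2[1], dz = c1[2] - c2[2];
            d.put(e.getKey(), Math.sqrt(dx * dx + dy * dy + dz * dz));
        }
        return d;
    }
}
```

Since the grid is not modified during the computation, this filling happens once, outside the temporal cycle, as noted above for $a$ and $d$.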
Any other piece of information will require a further container. To sum up, we have a container for:
- the quantity $A$;
- the topology of volumes;
- the areas of faces;
- the fluxes;
- the distances between volumes' centroids;
- the node numbers (labels, keys);
- the nodes that belong to each face;
- the coordinates of nodes;
- the topology of faces (referring to the volumes they separate).
Towards generalizations that look at information hiding and encapsulation
We can observe that we have three types of containers: those which contain topological information (2, 6, 7, 9), those which contain physical quantities (1, 4), and those which contain geometric quantities (3, 5, 8).
If, instead of a 3D problem, we had a 2D or 1D one, the number of containers would change, but not their types.
To go deeper, the first problem to deal with could be to understand how, in a topology container, for instance that of volumes (2), to make room for the slots indicating their faces, since these are of variable dimension. In traditional programming, a "brute force" approach would usually have been adopted: each slot would have been set to the dimension of the largest number of elements to be contained, with the empty elements replaced by a conventional number to be checked. Essentially, all of it would have resulted in a matrix whose rows (columns) would correspond to the number of elements (volumes, faces) and whose columns (rows) to the variable number of elements they contain (in the case of volumes, faces; in the case of faces, edges; and so on).
In an OO language like Java, the sub-containers of variable dimension can be appropriate objects, for instance generically called “cell”, each containing an int[ ] array. The global container of a topology could therefore be a hashmap of cells.
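A minimal sketch of this alternative, with illustrative names (the class "Cell" and the integer labels are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Variable-size alternative: a "cell" wraps an int[] of arbitrary
// length, and a topology is a HashMap of cells keyed by element label.
public class CellTopology {

    public static class Cell {
        private final int[] connections; // e.g. the faces of a volume

        public Cell(int... connections) {
            this.connections = connections;
        }

        public int size() {
            return connections.length;
        }

        public int get(int i) {
            return connections[i];
        }
    }

    public static void main(String[] args) {
        Map<Integer, Cell> volumeTopology = new HashMap<>();
        volumeTopology.put(1, new Cell(10, 11, 12, 13)); // 4 faces
        volumeTopology.put(2, new Cell(12, 14, 15));     // 3 faces: no padding needed
        System.out.println(volumeTopology.get(2).size()); // prints 3
    }
}
```

Each row now has exactly the size it needs, and no sentinel checks are required.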
In principle, we could use the containers defined above without any wrapper, directly defining them in terms of the standard objects in the Java 9 library.
However, we may want to use other types with respect to those we defined. For instance, in some cases, for speed reasons, we could substitute an ArrayList for a HashMap, or someone working on the complexity of caching could come up with some more exotic objects.
To respond to these cases, we would like to introduce some abstraction without penalizing performance (too much). We can certainly define wrapper classes, for instance:
- for topologies (essentially used to drive iterations)
- for geometries
- for physical quantities (used to contain data: immutable for parameters, time-varying for variables)
These three classes would fit all the cases in any dimension (1D, 2D, 3D): only the number of topology elements would vary.
However, this strategy might not be open enough to extensions which do not require breaking the code (i.e., being closed to modifications).
Using interfaces or abstract classes instead of classes could be the right solution.
Classes or, for that matter, interfaces could also have the added value of containing enough fields to specify the characteristics of the entities (e.g. whether they work in 2D or 3D, their “name”, their units: all the types of information requested by the UGRID convention). All this information is obviously also useful to make the programs or libraries we are going to implement more readable and easier to inspect for an external user.
While the topology class is self-explanatory, the geometry class (or interface) has a connection to its topology. Therefore the geometry class should contain a reference to its topology, to make its dependence explicit. A quantity object, for the same reason, should contain a reference to both its topology and its geometry.
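The three wrappers and their mutual references could be sketched, for instance, as interfaces. All names below are assumptions made for illustration, not an established API:

```java
// Three abstractions with explicit dependencies: a geometry refers to
// its topology, a quantity to both. Metadata methods (name, units,
// dimension) follow the spirit of the UGRID convention.
public class GridAbstractions {

    public interface Topology {
        int dimension();     // e.g. 2 for 2D, 3 for 3D
        String name();       // human-readable label of the entity
    }

    public interface Geometry {
        Topology topology(); // the topology this geometry is built on
    }

    public interface Quantity {
        Topology topology();
        Geometry geometry();
        String units();      // units of the stored values
    }

    // a minimal concrete topology, just to show the wiring
    public static class VolumeTopology implements Topology {
        public int dimension() { return 3; }
        public String name() { return "volumes"; }
    }

    public static void main(String[] args) {
        Topology t = new VolumeTopology();
        System.out.println(t.name() + " in " + t.dimension() + "D"); // prints "volumes in 3D"
    }
}
```

Concrete implementations backed by HashMap, ArrayList, or more exotic containers can then be swapped behind these interfaces without touching the algorithms that use them.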
The simplicity of using classes directly can be tantalizing; however, the generality gained by interposing interfaces or abstract classes is an investment for the future.
Berti [4] advises, in fact, separating the algorithm from the data structure, which allows a specific algorithm to be written once and for all while changing the data it uses as we like. This may be an ideal condition, impossible to attain fully, but working to confine possible changes to limited parts of the codebase is an added value to keep as a reference. That is why “encapsulation” is one of the paradigms of OO programming.
Some final notes
1 - In using CW complexes to manage the topology there could be a speed overhead. For instance, to access the values on a face of a volume, we have to:
- access the volume,
- access the address of the face,
- redirect to the appropriate quantity container to access the value.
It could then be useful to eliminate one phase: once the volume is accessed, it would have directly associated to it not the addresses of its faces but the values they contain.
If we have more than one value per face to access, related to different quantities and parameters, then maybe this added computational overhead could be considered negligible with respect to the simplicity of managing many quantities. In any case, it is an alternative to test.
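The three-step access of note 1 can be sketched as follows; the containers and names are hypothetical, chosen only to show the indirection:

```java
import java.util.HashMap;
import java.util.Map;

// The indirection of note 1: to read a quantity on a face of a volume
// we (1) access the volume, (2) read the face label, (3) look the
// value up in the quantity container.
public class FaceAccess {

    static Map<Integer, int[]> volumeFaces = new HashMap<>(); // container (2): topology of volumes
    static Map<Integer, Double> faceFlux = new HashMap<>();   // container (4): fluxes on faces

    public static double fluxOnFace(int volume, int localFace) {
        int[] faces = volumeFaces.get(volume); // step 1: access the volume
        int face = faces[localFace];           // step 2: address of the face
        return faceFlux.get(face);             // step 3: redirect to the value
    }

    public static void main(String[] args) {
        volumeFaces.put(1, new int[]{10, 11, 12});
        faceFlux.put(11, 0.5);
        System.out.println(fluxOnFace(1, 1)); // prints 0.5
    }
}
```

The alternative discussed above would merge steps 2 and 3, storing the values directly in the volume's slots at the cost of duplicating them for each quantity.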
2 - At any time step, not only the quantity at time $t$, $A^{t}$, is required, but also that at the previous time, $t-1$, $A^{t-1}$. The two data structures share the same topology (which could represent a memory saving). During time marching, an obvious care the programmer needs to take is not to allocate a new grid at each time step: we can limit ourselves to using only two grids across the whole simulation.
As an example, let us assume that time $t-1$ is contained in vector $A^1$ and time $t$ in $A^2$. Then the above requirement can be obtained by switching the two vectors, as schematized as follows:
- Create A1 and A2
- Set A1 to initial conditions
- For any t:
- A2 = f(A1)
- cwComplex.switch(A1, A2)
where switch(A1, A2) swaps the two references:
- B = A1
- A1 = A2
- A2 = B
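This two-grid scheme can be sketched in Java. Note that Java passes references by value, so the swap must act on fields of the object owning the grids rather than on method parameters; the class name CwComplex and the update rule are assumptions for illustration:

```java
// Two-grid time marching: the previous and current value arrays are
// reused for the whole simulation; each step only swaps references,
// never allocating a new grid.
public class CwComplex {

    private double[] previous; // A^{t-1}
    private double[] current;  // A^{t}

    public CwComplex(double[] initialConditions) {
        previous = initialConditions;
        current = new double[initialConditions.length];
    }

    // one time step: compute current from previous, then swap references
    public void step() {
        for (int i = 0; i < previous.length; i++) {
            current[i] = f(previous[i]);
        }
        double[] b = previous; // B  = A1
        previous = current;    // A1 = A2
        current = b;           // A2 = B: the old grid is recycled, not reallocated
    }

    private double f(double a) {
        return a + 1.0; // placeholder for the actual update rule
    }

    public double[] previous() {
        return previous;
    }

    public static void main(String[] args) {
        CwComplex grid = new CwComplex(new double[]{0.0, 1.0});
        grid.step();
        System.out.println(grid.previous()[0]); // prints 1.0
    }
}
```

After any even number of steps, previous() returns the very same array object passed as initial conditions, confirming that only two grids ever exist.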
3 - At the core of the method of solution of the equation under scrutiny there is usually a Newton method, e.g. [2], Appendix A, equation A8. Any efficiency improvement for the solver then reduces to improving the speed of this core, which can possibly be parallelised.
Bibliography
[1] - Lund, E. T. (2014). Implementing High-Performance Delaunay Triangulation in Java. Master Thesis (A. Maus, Ed.).
[2] - Cordano, E., and R. Rigon (2013). A mass-conservative method for the integration of the two-dimensional groundwater (Boussinesq) equation. Water Resour. Res., 49, doi:10.1002/wrcr.20072.
[3] - O'Rourke, J. (2007). Computational Geometry in C. Cambridge University Press.
[4] - Berti, G. (2000). Generic Software Components for Scientific Computing. Ph.D. Thesis.