Wednesday, February 16, 2022

DARTHs programming cheat sheet

We recently wrote about DARTHs. Here is a summary of those more extensive documents. DARTHs fall under the category of thematic Digital Twins.

DARTHs are composable infrastructures whose parts are loosely connected, communicate through standard protocols and can easily be substituted. There is no such thing as a single DARTH, only DARTH solutions. They constitute an ecosystem of tools whose parts can be separated and recomposed into new solutions, exactly as happens with Linux distributions (distros). Distros usually need an integrator of resources, which is a further component of DARTHs. Below we subdivide the main requirements necessary to make DARTHs work effectively.


Data
  • Must be Open
  • Provided on demand over the cloud
  • Discoverable on the web
  • Provided in standardized self-explanatory formats
  • Retrievable through open APIs available in the most common programming languages
DARTHs IT Architecture requirements

DARTHs serve different users and roles
Therefore they are agnostic:
  • with respect to the science
  • with respect to the programming language
  • with respect to the operating system
  • with respect to modelling styles and paradigms
  • They are lightweight with respect to programming habits, meaning they should add as few programming rules as possible and aim to keep the code short
  • Variable names should be mapped to standard names that provide a unique identification
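
As an illustration of mapping variable names to standard names, here is a minimal sketch. The local names ("P", "Tair") and the helper `to_standard_names` are hypothetical, and the standard names are written in the style of the CF Conventions as an assumption; a real DARTH may adopt a different vocabulary.

```python
# Hypothetical mapping from model-specific variable names to shared
# standard names. The left-hand names are invented for this example;
# the right-hand names are assumed, CF-style identifiers.
STANDARD_NAMES = {
    "P": "precipitation_flux",
    "Tair": "air_temperature",
}


def to_standard_names(variables: dict) -> dict:
    """Return a copy of `variables` keyed by standard names.

    Raises KeyError when a local name has no registered standard name,
    so unmapped variables are caught early instead of silently passed on.
    """
    renamed = {}
    for local_name, value in variables.items():
        if local_name not in STANDARD_NAMES:
            raise KeyError(f"no standard name registered for {local_name!r}")
        renamed[STANDARD_NAMES[local_name]] = value
    return renamed


model_output = {"P": [0.0, 1.2e-5], "Tair": [281.3, 280.9]}
standardized = to_standard_names(model_output)
```

Failing loudly on an unmapped name is a design choice of this sketch: it keeps the identification unique instead of letting ad hoc names leak between tools.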

In Open DARTHs,
  • Programs must be open source and developable with open tools
  • Simulations and operations must be traceable and replicable by construction
  • Chunks of data, code and modelling solutions should be organized in standard ways and be deployable over the web for third-party testing and reuse

Hydrological models for DARTHs

  • need to be able to ease any flavor of modelling effort, such as models based on ODEs (systems of ordinary differential equations), PDEs (systems of partial differential equations), MS (Statistical Modelling), ML (Machine Learning) and any other tool that science will invent.
  • should be deployed as reusable and interoperable components (a component is a model unit that encapsulates equations or specific tasks in a reusable form; components can be connected at runtime and communicate by exchanging data in RAM)
  • reusable components should obey standard rules for the appropriate identification of inputs and outputs
  • need to be deployable over the web on all the available platforms.
  • model-to-model (component-to-component) communication should be allowed through APIs
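
To make the idea of components more concrete, here is a minimal sketch of two components that declare their inputs and outputs by name, are connected at runtime and exchange data in memory. The `Component` class, the two toy equations and all variable names are assumptions made for illustration, not the API of any existing framework.

```python
from typing import Callable, Dict, Sequence


class Component:
    """Hypothetical minimal component: named inputs, named outputs, one task."""

    def __init__(self, name: str, inputs: Sequence[str],
                 outputs: Sequence[str], task: Callable[..., Dict[str, float]]):
        self.name = name
        self.inputs = tuple(inputs)
        self.outputs = tuple(outputs)
        self.task = task

    def run(self, store: Dict[str, float]) -> None:
        # Read the declared inputs from the shared in-memory store,
        # run the task, and write back only the declared outputs.
        args = {key: store[key] for key in self.inputs}
        results = self.task(**args)
        for key in self.outputs:
            store[key] = results[key]


def snowmelt(air_temperature, melt_factor):
    # Toy degree-day melt; temperature is assumed relative to the melt threshold.
    return {"snowmelt_flux": max(0.0, melt_factor * air_temperature)}


def runoff(precipitation_flux, snowmelt_flux, runoff_coefficient):
    # Toy runoff: a fixed fraction of the incoming water.
    return {"runoff_flux": runoff_coefficient * (precipitation_flux + snowmelt_flux)}


# Components are chained at runtime simply by listing them in order.
components = [
    Component("melt", ["air_temperature", "melt_factor"], ["snowmelt_flux"], snowmelt),
    Component("runoff",
              ["precipitation_flux", "snowmelt_flux", "runoff_coefficient"],
              ["runoff_flux"], runoff),
]

store = {"air_temperature": 2.0, "melt_factor": 3.0,
         "precipitation_flux": 4.0, "runoff_coefficient": 0.5}
for component in components:
    component.run(store)
```

Because each component only sees the names it declares, one of them can be replaced by an alternative implementation with the same inputs and outputs without touching the other.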

DARTHs programming practices

DARTHs must
  • Adopt software quality testing for each component separately
  • Adopt good programming practices
  • Have literate computing tools
  • Promote clean code and literate programming
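
Testing each component separately can be as simple as a few assertions per component. The sketch below assumes a hypothetical degree-day melt function as the component under test; the function, its parameters and the test names are invented for the example.

```python
import math


def degree_day_melt(air_temperature_c, melt_factor, threshold_c=0.0):
    """Hypothetical component under test: melt grows linearly above a threshold."""
    return max(0.0, melt_factor * (air_temperature_c - threshold_c))


def test_no_melt_below_threshold():
    assert degree_day_melt(-5.0, 3.0) == 0.0


def test_melt_scales_linearly():
    assert math.isclose(degree_day_melt(2.0, 3.0), 6.0)


def test_threshold_shifts_onset():
    assert degree_day_melt(1.0, 3.0, threshold_c=1.0) == 0.0


if __name__ == "__main__":
    # The functions follow the test_* naming used by common test runners,
    # but can also be run directly as a script.
    for test in (test_no_melt_below_threshold,
                 test_melt_scales_linearly,
                 test_threshold_shifts_onset):
        test()
    print("all component tests passed")
```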

DARTHs and Computing facilities
  • They use various forms of parallelism but, overall, parallelism should be provided as a service, without requiring model developers to change their programming habits.
  • Workflows of tasks that can be schematized as graphs should be internally parallelized using piping methods.
  • Models on grids should use middleware that separates the parallelization from the "physics" of the implementation and is as non-invasive as possible.
  • They can use multicore and multiprocessor machines, high-end computers and distributed computing, without necessarily requiring direct intervention by the programmer
  • They should use the natural partition of the Earth's surface into river catchments and subcatchments to split the computational work.
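
The last two ideas, parallelism as a service and splitting the work along the river network, can be sketched as follows. The network, the runoff values and the `simulate` placeholder are hypothetical; the scheduling relies on `graphlib.TopologicalSorter` and `concurrent.futures` from the Python standard library (Python 3.9 or later). The model developer only writes `simulate`; the ordering and the concurrency are handled by `run_network`.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical river network: each subcatchment maps to the upstream
# subcatchments that drain into it.
UPSTREAM = {"A": [], "B": [], "C": ["A", "B"], "D": [], "E": ["C", "D"]}

# Hypothetical locally generated runoff for each subcatchment.
LOCAL_RUNOFF = {"A": 1.0, "B": 2.0, "C": 0.5, "D": 1.5, "E": 0.25}


def simulate(name, inflows):
    """Placeholder for a real model: local runoff plus upstream inflows."""
    return LOCAL_RUNOFF[name] + sum(inflows)


def run_network(upstream, workers=4):
    """Run every subcatchment once all its upstream neighbours are done."""
    sorter = TopologicalSorter(upstream)
    sorter.prepare()
    discharge = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Subcatchments returned together have no mutual dependency,
            # so they are submitted to the pool as one concurrent batch.
            futures = {
                name: pool.submit(simulate, name,
                                  [discharge[u] for u in upstream[name]])
                for name in sorter.get_ready()
            }
            for name, future in futures.items():
                discharge[name] = future.result()
                sorter.done(name)
    return discharge


discharge = run_network(UPSTREAM)
```

This sketch waits for a whole batch before scheduling the next one, which is simpler than a fully asynchronous scheduler. Threads are used only for brevity; a CPU-bound model in Python would more likely be dispatched to processes or to remote workers.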

DARTHs and EO (Earth Observations)
  • The complex retrieval chain from observed quantities to hydrological quantities has to become explicit and integrated in modelling
  • Data Assimilation of any type must become part of the modelling chain
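
As one simple instance of data assimilation inside a modelling chain, the sketch below applies a scalar Kalman update, which blends a model forecast with an observation according to their error variances. The function name and the numbers are hypothetical, and real applications would typically use multivariate or ensemble methods.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Scalar Kalman update: blend a model forecast with one observation."""
    # The gain is the weight given to the observation (between 0 and 1).
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    # The analysis variance is never larger than the forecast variance.
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Hypothetical state in arbitrary units, with equal error variances,
# so the analysis lands halfway between forecast and observation.
analysis, analysis_var = assimilate(forecast=10.0, forecast_var=4.0,
                                    observation=12.0, observation_var=4.0)
```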

DARTHs and Science Reliability and practices

  • DARTHs must make hypothesis testing very smooth (in the sense of allowing alternative modelling structures)
  • The quantification of uncertainty must become an inherent part of an open DARTH.
  • DARTHs must support open science practices by construction
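
The first two points can be illustrated together with a small sketch: two alternative model structures are run through the same Monte Carlo ensemble, so that swapping the hypothesis requires no change to the uncertainty machinery. The reservoir equations, the parameter distribution and all names are assumptions chosen for the example.

```python
import random
import statistics


def linear_reservoir(storage, k):
    """Structure 1: outflow proportional to storage."""
    return k * storage


def nonlinear_reservoir(storage, k, exponent=1.5):
    """Structure 2: outflow proportional to a power of storage."""
    return k * storage ** exponent


def ensemble(model, storage, k_mean, k_sd, n=1000, seed=42):
    """Propagate uncertainty in k through `model` by random sampling."""
    rng = random.Random(seed)  # fixed seed keeps the run replicable
    outflows = []
    for _ in range(n):
        k = max(0.0, rng.gauss(k_mean, k_sd))  # k cannot be negative
        outflows.append(model(storage, k))
    return outflows


def summarize(outflows):
    """Return the 5%, 50% and 95% quantiles of the ensemble."""
    cuts = statistics.quantiles(outflows, n=20)  # 19 cut points, 5% apart
    return {"p05": cuts[0], "median": cuts[9], "p95": cuts[18]}


# The same uncertainty analysis is applied to both model structures.
results = {
    model.__name__: summarize(ensemble(model, storage=10.0,
                                       k_mean=0.1, k_sd=0.02))
    for model in (linear_reservoir, nonlinear_reservoir)
}
```

Each entry of `results` holds an uncertainty band for one structure, which gives a first basis for comparing the alternatives against observations.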
Overall
  • To accomplish all the above tasks, DARTHs require an appropriate, lightweight, non-invasive infrastructure (framework) that
    • supports the required features,
    • allows the development, use and continuous evolution of the DARTHs,
    • separates the programming and the use of the tools from the connection to data resources (EO, IoT, more traditional datasets),
    • connects to web services,
    • provides parallelism of computation,
    • gives access to High Performance Computing (HPC) or HPC cloud services,
    • allows communication between components,
    • dispatches components over the web,
    • manages the various security issues,
just to mention a few.
