Data-driven Cloud Cover Parameterizations for the ICON Earth System Model Using Deep Learning and Symbolic Regression PDF Download

Data-driven Cloud Cover Parameterizations for the ICON Earth System Model Using Deep Learning and Symbolic Regression

Author: Arthur Grundner
Publisher:
ISBN:
Category :
Languages : en
Pages : 0

Book Description
This thesis delves into the improvement of cloud parameterizations in climate models through machine learning trained on coarse-grained output from high-resolution simulations. Utilizing the ICOsahedral Non-hydrostatic (ICON) modeling framework, it specifically targets the enhancement of cloud cover parameterization within the ICON Earth System Model. Three types of neural networks (NNs) differing in vertical locality are developed to estimate cloud cover, with globally trained NNs even applicable to distinct regional simulations. Interpretability analysis exposes model-specific biases and local relationships with the thermodynamic environment. Despite achieving high predictive performance, NNs necessitate post-hoc interpretation tools. To tackle this issue, a combined hierarchical modeling framework incorporating symbolic regression, feature selection, and physical constraints is proposed. The resulting equations, characterized by simplicity and physical consistency, attain performance comparable to NNs while demonstrating superior transferability to other realistic datasets. Our best equation adeptly captures cloud cover distributions across various regimes, notably excelling in representing marine stratocumulus clouds by learning to utilize the vertical relative humidity gradient. This research underscores the potential of deep learning in achieving accurate cloud parameterizations and emphasizes the effective role of symbolic regression in deriving interpretable, consistent equations for cloud cover.

Data Assimilation for the Earth System

Author: Richard Swinbank
Publisher: Springer Science & Business Media
ISBN: 9781402015922
Category : Mathematics
Languages : en
Pages : 398

Book Description
Data assimilation is the combination of information from observations and models of a particular physical system in order to get the best possible estimate of the state of that system. The technique has wide applications across a range of earth sciences, a major application being the production of operational weather forecasts. Others include oceanography, atmospheric chemistry, climate studies, and hydrology. Data Assimilation for the Earth System is a comprehensive survey of both the theory of data assimilation and its application in a range of earth system sciences. Data assimilation is a key technique in the analysis of remote sensing observations and is thus particularly useful for those analysing the wealth of measurements from recent research satellites. This book is suitable for postgraduate students and those working on the application of data assimilation in meteorology, oceanography and other earth sciences.
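As a minimal sketch of the idea described above (not taken from the book), the simplest case combines one model value with one observation, weighting each by its error variance. The function name, argument names, and the temperature values are illustrative assumptions.

```python
def assimilate(background, background_var, observation, observation_var):
    """Combine a model background value with one observation (scalar case).

    Hypothetical helper for illustration: the minimum-variance linear
    estimate for two independent, unbiased sources of information.
    """
    # Weight given to the observation: larger when the background is less certain.
    gain = background_var / (background_var + observation_var)
    # The analysis moves the background toward the observation by that weight.
    analysis = background + gain * (observation - background)
    # The analysis is more certain than either input on its own.
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Assumed example values: a model temperature of 280 K (variance 4 K^2)
# and an observed temperature of 282 K (variance 1 K^2).
analysis, analysis_var = assimilate(280.0, 4.0, 282.0, 1.0)
print(analysis, analysis_var)
```

Because the observation is assumed to be more accurate than the model here, the estimate lands closer to the observation, and its variance is smaller than both input variances.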

Machine Learning and Data Mining Approaches to Climate Science

Author: Valliappa Lakshmanan
Publisher: Springer
ISBN: 3319172204
Category : Science
Languages : en
Pages : 243

Book Description
This book presents innovative work in Climate Informatics, a new field that reflects the application of data mining methods to climate science, and shows where this new and fast growing field is headed. Given its interdisciplinary nature, Climate Informatics offers insights, tools and methods that are increasingly needed in order to understand the climate system, an aspect which in turn has become crucial because of the threat of climate change. There has been a veritable explosion in the amount of data produced by satellites, environmental sensors and climate models that monitor, measure and forecast the earth system. In order to meaningfully pursue knowledge discovery on the basis of such voluminous and diverse datasets, it is necessary to apply machine learning methods, and Climate Informatics lies at the intersection of machine learning and climate science. This book grew out of the fourth workshop on Climate Informatics held in Boulder, Colorado in Sep. 2014.

Improved Earth System Prediction Using Large Ensembles and Machine Learning

Author: William Chapman
Publisher:
ISBN:
Category :
Languages : en
Pages : 272

Book Description
The purpose of this thesis is to examine and advance North American weather predictability from weather to subseasonal time-scales. Specifically, it focuses on 1) developing machine learning/deep learning methods and models to improve predictability through numerical weather prediction (NWP) post-processing on weather time-scales (0-7 days) and 2) examining the physical mechanisms which govern the evolution of the predictable components and noise components of teleconnection modes on subseasonal time-scales (7 days-1 month). NWP deficiencies (e.g., sub-grid parameterization approximations), nonlinear error growth associated with the chaotic nature of the atmosphere, and initial condition uncertainty cause initially small forecast errors to grow until weather predictions are only as skillful as random forecasts. A portion of these forecast errors, the systematic biases, is inherent to the NWP models alone. The first two chapters develop cutting-edge vision-based deep-learning algorithms to advance the current state-of-the-art NWP post-processing and correct these systematic biases. Using dynamic forecasts of North Pacific integrated vapor transport (IVT) as a test case, we develop post-processing systems which are spatially aware, readily encode non-linear predictor interaction, easily ingest ancillary weather variables, and have state-of-the-art training methods that systematically prevent model overfitting. Further, we outline a framework to quantify uncertainty in single-point (deterministic) forecasts using neural networks. The uncertainty is shown to be probabilistically rigorous, leading to calibrated probabilistic forecasts which outperform or compete with calibrated dynamic NWP ensemble systems for IVT under atmospheric river conditions. The second half of this thesis shifts focus to subseasonal time scales and examines predictability in the Pacific North American (PNA) sector in boreal winter.
Particularly, it investigates the physical mechanisms involved in the intraseasonal modulation of atmospheric Signal-to-Noise (SN), and how it is affected by slowly varying climate modes (ENSO and MJO). These mechanisms are further explored using a fully-coupled hindcast of the 20th century, showing that the increased SN leads to high model forecast skill at subseasonal timescales in particular forecast windows of opportunity. Additionally, we reveal the MJO as the largest growing mode of tropical forecast uncertainty which directly influences PNA forecast certainty.

Deep-learning-based Geological Parameterizations for History Matching Complex Geomodels

Author: Jimmy Liu
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Geological parameterization enables the representation of geomodels in terms of a relatively small set of uncorrelated variables. Parameterization is therefore very useful in the context of history matching (data assimilation) and uncertainty quantification. Traditional geological parameterizations, however, may not be effective at preserving complex geological structures, such as fluvial channels, levees and deltaic fans. To address this issue, in this work we develop a deep-learning-based geological parameterization, referred to as CNN-PCA. The CNN-PCA procedure combines principal component analysis (PCA), a traditional parameterization technique, with deep learning. The main idea is to train a deep convolutional neural network (CNN), referred to as the model transform net, as a post-processor for models parameterized with PCA, to recover geological realism. The training loss for the model transform net involves a set of geomodel features (specifically Gram matrices) extracted from another pretrained CNN, referred to as the loss net. For more challenging 3D systems, a supervised-learning-based loss term is introduced. Hard data loss is included in both the 2D and 3D CNN-PCA formulations. The CNN-PCA procedure is first developed and applied for 2D geological systems. These include a binary fluvial channel system and a bimodal deltaic fan system. In both cases, CNN-PCA is shown to provide realizations that honor the geological features present in reference models generated using geomodelling software. Quantitative assessments are conducted for the binary channel system. For this case, connectivity measures and two-phase flow statistics obtained with random (test-set) CNN-PCA geomodels closely match results for reference models. A strategy for the formal selection of the various training weighting factors is developed based on the connectivity measures. History matching results for the binary channel system are presented. 
In this assessment CNN-PCA is applied with derivative-free optimization, and a subspace randomized maximum likelihood (RML) method is used to provide multiple posterior models. Data assimilation and significant uncertainty reduction are achieved for existing wells, and physically reasonable predictions are also obtained for new wells. We next describe a multilevel CNN-PCA-based history matching procedure, again using RML to generate posterior models. Mesh adaptive direct search (MADS), a pattern-search method that parallelizes naturally, is applied for optimization. Although the use of CNN-PCA parameterization reduces the number of variables that must be determined during history matching, the minimization problem can still be computationally demanding. The multilevel strategy addresses this issue by reducing the number of simulations that must be performed at each MADS iteration. Specifically, the PCA coefficients (which are the optimization variables after CNN-PCA parameterization) are determined in groups, at multiple levels, rather than all at once. History matching results are presented for a 2D binary channelized system and a 2D bimodal deltaic fan system. These computations demonstrate that substantial uncertainty reduction is achieved in both cases, that multilevel results are in essential agreement with reference single-level results, and that the multilevel strategy acts to substantially reduce the total number of flow simulations required. Finally, we extend the CNN-PCA procedure to handle complex 3D geological systems. The training loss for the 3D model transform net involves features of geomodels extracted from a pretrained 3D CNN, a new supervised-learning-based reconstruction loss, and a hard data loss. 
The 3D CNN-PCA algorithm is applied for the generation of conditional 3D realizations, defined on grids containing $60\times60\times40$ cells, for three geological scenarios (binary and bimodal channelized systems, and a three-facies channel-levee-mud system). CNN-PCA realizations are shown to exhibit geological features that are visually consistent with reference models generated using object-based methods. Statistics of two-phase flow responses for test sets of 3D CNN-PCA models are shown to be in consistent agreement with those from reference geomodels. The 3D CNN-PCA parameterization is then applied for history matching using an ensemble smoother with multiple data assimilation. Results for the bimodal channelized system demonstrate that 3D CNN-PCA is very effective in this setting.
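The PCA step of the parameterization described above can be sketched as follows. This is only an illustration of plain PCA on a synthetic ensemble, assuming NumPy is available; it does not include the CNN post-processing, and all names, sizes, and data are assumptions rather than the author's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_cells, n_components = 200, 50, 5

# Synthetic stand-in for an ensemble of prior geomodels (one flattened model per row).
# It is built from 5 hidden patterns, so 5 principal components capture it fully.
patterns = rng.normal(size=(n_components, n_cells))
models = rng.normal(size=(n_models, n_components)) @ patterns

# PCA via SVD of the centered ensemble.
mean = models.mean(axis=0)
_, singular_values, vt = np.linalg.svd(models - mean, full_matrices=False)
basis = vt[:n_components]                                        # principal directions
scale = singular_values[:n_components] / np.sqrt(n_models - 1)   # std. dev. per direction


def to_model(xi):
    """Map a short vector of PCA coefficients to a full geomodel."""
    return mean + (xi * scale) @ basis


def to_coefficients(model):
    """Project a geomodel (or a stack of geomodels) onto the PCA coefficients."""
    return ((model - mean) @ basis.T) / scale


xi = to_coefficients(models)
print(xi.shape)  # one row of 5 coefficients per 50-cell model
```

Each 50-cell model is represented here by 5 coefficients that are uncorrelated with unit variance across the ensemble, which is what makes such a parameterization convenient as a set of optimization variables in history matching.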

Improved Representations of Cloud-Scale Processes in Meteorological Forecast Models

Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 12

Book Description
The functional relationship between cloud cover and relative humidity (Rh) averaged over areas comparable to grid dimensions of numerical weather models was quantified using RTNEPH and 3DNEPH observations. Cloud cover in any atmospheric level decreases exponentially as layer-averaged Rh falls below 100%, and no observations support critical Rhs below which cloud cover is zero. Small cloud amounts occur at all Rhs. Therefore, current weather models probably underestimate cloud coverage, especially at Rhs below the critical humidities used by most models. At the same Rh, convection enhances cloud coverage in the upper troposphere and decreases cloud coverage in the lower troposphere. A simplified and innovative mass flux convective parameterization was developed and evaluated using atmospheric radon profiles; it was also used to simulate the redistribution of heat and moisture by combining the approach of stochastic mixing with detraining plumes. A public domain cloud resolving model (ARPS) was used to further refine the 1-D parameterization. Both the cloud resolving models and the convective parameterization were evaluated using GATE observations. However, the ARPS model employs an advection algorithm that does not conserve water mass, making it unreliable for refining cloud parameterizations.
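The contrast drawn above, between an exponential dependence on relative humidity and a scheme with a critical humidity, can be illustrated with a short sketch. Both functional forms and both parameter values (`decay_scale`, `rh_crit`) are assumptions chosen for illustration; the report's fitted relationship is not given in this description.

```python
import math


def cloud_cover_exponential(rh, decay_scale=0.15):
    """Cloud fraction that decays exponentially as rh (0..1) drops below 1.

    decay_scale is a hypothetical value, not a fitted one.
    """
    rh = min(max(rh, 0.0), 1.0)
    return math.exp(-(1.0 - rh) / decay_scale)


def cloud_cover_threshold(rh, rh_crit=0.7):
    """Illustrative threshold scheme: no cloud at or below a critical humidity."""
    rh = min(max(rh, 0.0), 1.0)
    if rh <= rh_crit:
        return 0.0
    return 1.0 - math.sqrt((1.0 - rh) / (1.0 - rh_crit))


# Compare the two forms at a few relative humidities.
for rh in (0.5, 0.7, 0.9, 1.0):
    print(rh, round(cloud_cover_exponential(rh), 3), round(cloud_cover_threshold(rh), 3))
```

Below the assumed critical humidity the threshold form returns exactly zero, while the exponential form still gives a small, non-zero cloud amount, which is the behaviour the observations are reported to support.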

Deep-learning-based Surrogate Modeling of Flow and Coupled Flow-geomechanics for Data Assimilation in Subsurface Systems

Author: Meng Tang
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Data assimilation in subsurface systems is challenging due to the large number of flow simulations often required, and by the need to preserve geological realism in the calibrated (posterior) models. In this work we present a deep-learning-based surrogate-modeling framework, referred to as recurrent R-U-Net, for flow or coupled flow and geomechanics in subsurface formations. The recurrent R-U-Net consists of convolutional and recurrent (convLSTM) neural networks, designed to capture the spatial-temporal information associated with subsurface flow dynamics. The recurrent R-U-Net is trained on the simulated state fields for a set of O(1000) random geomodels. After training, the surrogate model provides very fast predictions of dynamic states (pressure, saturation and displacement when geomechanics is considered) and well responses for new geological realizations. The recurrent R-U-Net surrogate model is used for history matching in conjunction with CNN-PCA (convolutional neural network - principal component analysis), a recently developed deep-learning-based geological parameterization procedure. CNN-PCA preserves complex geological features by post-processing the PCA representation through use of a transform net. The recurrent R-U-Net surrogate model is first developed for 2D oil-water systems. Training samples comprise global pressure and saturation maps, at 10 time steps, generated by performing high-fidelity flow simulation for 1500 channelized geomodels defined on 80x80 grids. The training time for each of the (pressure and saturation) 2D recurrent R-U-Nets is 80 minutes on an NVIDIA Tesla V100 GPU. After training, the recurrent R-U-Net provides predictions in 0.01 seconds, which represents a speedup of a factor of 1000 relative to high-fidelity simulation.
The recurrent R-U-Net surrogate model is shown to be capable of predicting accurate dynamic 2D pressure and saturation fields and well rates for new geological realizations consistent with those used for training. Assessments demonstrating high surrogate-model accuracy are presented for an individual geological realization and for an ensemble of 500 test geomodels. The surrogate model is then used for history matching in a 2D channelized system. An optimization-based history matching procedure, randomized maximum likelihood with mesh adaptive direct search (MADS-RML), is applied. The overall approach provides substantial reduction in prediction uncertainty. High-fidelity numerical simulation results for the posterior geomodels (generated by the surrogate-based data assimilation procedure in combination with CNN-PCA) are shown to be in essential agreement with the recurrent R-U-Net predictions. Next, the 2D recurrent R-U-Net surrogate model is extended to handle 3D systems. This requires, in addition to the low-level implementation of some 3D network modules, the development of a 3D multiblock well model. The 3D recurrent R-U-Net is trained on 3D saturation and pressure fields for a set of 2500 channelized geomodels defined on 80x80x20 grids (a total of 128,000 cells). About 15 hours of training time are required for each of the 3D recurrent R-U-Nets on an NVIDIA Tesla V100 GPU. The trained surrogate provides a single prediction in less than 0.04 seconds, while the high-fidelity simulations require about 7 minutes. Detailed flow predictions demonstrate that this recurrent R-U-Net surrogate model again provides accurate results for dynamic states and well responses for new geological realizations. Three different history matching procedures are assessed, with the 3D recurrent R-U-Net used for flow prediction and CNN-PCA applied for geomodel parameterization.
The three methods are rejection sampling (RS), MADS-RML, and ensemble smoother with multiple data assimilation (ES-MDA). RS results provide the reference against which MADS-RML and ES-MDA posterior predictions are evaluated. We find that both MADS-RML and ES-MDA provide history matching results in general agreement with those from RS. MADS-RML is more accurate, however, and ES-MDA can display significant error in some quantities. Assessments of ES-MDA sensitivity to data error and the number of assimilation steps are also performed. Finally, we extend the 3D recurrent R-U-Net framework to treat coupled flow and geomechanics in CO2 storage settings. The problem domain for the high-fidelity solution includes the storage aquifer, a large surrounding region, overburden (which extends up to the Earth's surface), and bedrock. The full model is defined on a 60x60x37 grid, while the storage aquifer is represented by 40x40x12 blocks. The multi-Gaussian porosity and permeability fields in the storage aquifer are considered to be uncertain. Results from a set of 2000 high-fidelity full-order simulations provide the training data. An advantage of the 3D recurrent R-U-Net is that it can be trained to predict only the quantities of interest in particular domains. Here it is trained to provide pressure and saturation in the storage aquifer and vertical displacement at the Earth's surface. Five hours of training time are required for each of the three networks (corresponding to the three state variables) needed for this case. A single high-fidelity simulation takes about 0.8 hours of parallel computation (on 32 cores), while the trained surrogate provides predictions in less than 0.01 seconds. The storage aquifer states and the 2D surface displacement maps provided by the surrogate model display a high degree of accuracy, for both individual realizations and ensemble statistics. 
The 3D recurrent R-U-Net surrogate model is then applied with a rejection sampling procedure for history matching. The observations consist of a small number of surface displacement measurements. Significant uncertainty reduction in surface displacement and pressure buildup at the caprock is achieved. The history matching computations performed for this example would not be feasible using high-fidelity simulation.

Earth System Modelling: ESM data archives in the times of the grid

Author: Luca Bonaventura
Publisher:
ISBN:
Category : Climatic changes
Languages : en
Pages :

Book Description


Artificial Intelligence Methods in the Environmental Sciences

Author: Sue Ellen Haupt
Publisher: Springer Science & Business Media
ISBN: 1402091192
Category : Science
Languages : en
Pages : 418

Book Description
How can environmental scientists and engineers use the increasing amount of available data to enhance our understanding of planet Earth, its systems and processes? This book describes various potential approaches based on artificial intelligence (AI) techniques, including neural networks, decision trees, genetic algorithms and fuzzy logic. Part I contains a series of tutorials describing the methods and the important considerations in applying them. In Part II, many practical examples illustrate the power of these techniques on actual environmental problems. International experts bring to life ways to apply AI to problems in the environmental sciences. While one culture entwines ideas with a thread, another links them with a red line. Thus, a “red thread” ties the book together, weaving a tapestry that pictures the ‘natural’ data-driven AI methods in the light of the more traditional modeling techniques, and demonstrating the power of these data-based methods.

Data Science on the Google Cloud Platform

Author: Valliappa Lakshmanan
Publisher: O'Reilly Media, Inc.
ISBN: 1491974532
Category : Computers
Languages : en
Pages : 403

Book Description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Through the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches. Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science. You’ll learn how to:
- Automate and schedule data ingest, using an App Engine application
- Create and populate a dashboard in Google Data Studio
- Build a real-time analysis pipeline to carry out streaming analytics
- Conduct interactive data exploration with Google BigQuery
- Create a Bayesian model on a Cloud Dataproc cluster
- Build a logistic regression machine-learning model with Spark
- Compute time-aggregate features with a Cloud Dataflow pipeline
- Create a high-performing prediction model with TensorFlow
- Use your deployed model as a microservice you can access from both batch and real-time pipelines