Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access PDF Download

Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access

Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
One forthcoming challenge in high-performance computing is running large-scale problems while coping with less memory per compute node. In this work, the authors investigate a novel data decomposition method that would allow Monte Carlo transport calculations to be performed on systems with limited memory per compute node. In this method, each compute node remotely retrieves a small set of geometry and cross-section data as needed and remotely accumulates local tallies when crossing the boundary of the local spatial domain. Initial results demonstrate that while the method does allow large problems to be run in a memory-limited environment, achieving scalability may be difficult due to inefficiencies in the current implementation of RMA operations.
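
The access pattern in this description can be illustrated with a single-process sketch. It is not the authors' implementation and performs no real remote memory access: the `RemoteStore` and `Worker` classes, their method names, the modulo ownership rule, and the cache size are all hypothetical stand-ins for one-sided get and accumulate operations on data owned by other nodes.

```python
from collections import OrderedDict


class RemoteStore:
    """Hypothetical stand-in for data spread across the memory of several nodes."""

    def __init__(self, n_nodes, cross_sections):
        self.n_nodes = n_nodes
        # Assumed ownership rule: cell `c` lives on node `c % n_nodes`.
        self.data = [dict() for _ in range(n_nodes)]
        self.tallies = [dict() for _ in range(n_nodes)]
        for cell, xs in cross_sections.items():
            self.data[cell % n_nodes][cell] = xs
        self.remote_gets = 0

    def get(self, cell):
        """Mimic a one-sided get from the owning node."""
        self.remote_gets += 1
        return self.data[cell % self.n_nodes][cell]

    def accumulate(self, cell, score):
        """Mimic a one-sided accumulate into the owning node's tally."""
        owner = self.tallies[cell % self.n_nodes]
        owner[cell] = owner.get(cell, 0.0) + score


class Worker:
    """Holds only a small cache of cell data and buffers tallies locally."""

    def __init__(self, store, cache_size):
        self.store = store
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.local_tally = {}

    def cross_section(self, cell):
        if cell in self.cache:
            self.cache.move_to_end(cell)  # mark as most recently used
            return self.cache[cell]
        xs = self.store.get(cell)
        self.cache[cell] = xs
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used cell
        return xs

    def score(self, cell, value):
        self.local_tally[cell] = self.local_tally.get(cell, 0.0) + value

    def flush(self):
        """Push buffered tallies to their owners, e.g. on leaving the local domain."""
        for cell, value in self.local_tally.items():
            self.store.accumulate(cell, value)
        self.local_tally.clear()


store = RemoteStore(n_nodes=4, cross_sections={c: 0.1 * (c + 1) for c in range(16)})
worker = Worker(store, cache_size=2)
for cell in [0, 1, 0, 1, 2]:
    worker.score(cell, worker.cross_section(cell))
worker.flush()
```

In this toy run the worker never holds more than two cells at a time, and repeated visits to a cached cell cost no additional fetch. A real code would pay network latency on every miss, which is where the scalability concern raised in the description comes from.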

Solving Software Challenges for Exascale

Solving Software Challenges for Exascale PDF Author: Stefano Markidis
Publisher: Springer
ISBN: 3319159763
Category : Computers
Languages : en
Pages : 154

Book Description
This volume contains the thoroughly refereed post-conference proceedings of the Second International Conference on Exascale Applications and Software, EASC 2014, held in Stockholm, Sweden, in April 2014. The 6 full papers presented together with 6 short papers were carefully reviewed and selected from 17 submissions. They are organized in two topical sections named: toward exascale scientific applications and development environment for exascale applications.

Parallel Computing is Everywhere

Parallel Computing is Everywhere PDF Author: S. Bassini
Publisher: IOS Press
ISBN: 1614998434
Category : Computers
Languages : en
Pages : 852

Book Description
The most powerful computers work by harnessing the combined computational power of millions of processors, and exploiting the full potential of such large-scale systems is something which becomes more difficult with each succeeding generation of parallel computers. Alternative architectures and computer paradigms are increasingly being investigated in an attempt to address these difficulties. Added to this, the pervasive presence of heterogeneous and parallel devices in consumer products such as mobile phones, tablets, personal computers and servers also demands efficient programming environments and applications aimed at small-scale parallel systems as opposed to large-scale supercomputers. This book presents a selection of papers presented at the conference: Parallel Computing (ParCo2017), held in Bologna, Italy, on 12 to 15 September 2017. The conference included contributions about alternative approaches to achieving High Performance Computing (HPC) to potentially surpass exa- and zetascale performances, as well as papers on the application of quantum computers and FPGA processors. These developments are aimed at making available systems better capable of solving intensive computational scientific/engineering problems such as climate models, security applications and classic NP-problems, some of which cannot currently be managed by even the most powerful supercomputers available. New areas of application, such as robotics, AI and learning systems, data science, the Internet of Things (IoT), and in-car systems and autonomous vehicles were also covered. As always, ParCo2017 attracted a large number of notable contributions covering present and future developments in parallel computing, and the book will be of interest to all those working in the field.

Scalable Domain Decomposed Monte Carlo Particle Transport

Scalable Domain Decomposed Monte Carlo Particle Transport PDF Author: Matthew Joseph O'Brien
Publisher:
ISBN: 9781321212600
Category :
Languages : en
Pages :

Book Description
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
* Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
* Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
* Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
* Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.

High Performance Computing & Monte Carlo

High Performance Computing & Monte Carlo PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 2

Book Description
High performance computing (HPC), used for the most demanding computational problems, has evolved from single processor custom systems in the 1960s and 1970s, to vector processors in the 1980s, to parallel processors in the 1990s, to clusters of commodity processors in the 2000s. Performance/price has increased by a factor of more than 1 million over that time, so that today's desktop PC is more powerful than yesterday's supercomputer. With the introduction of inexpensive Linux clusters and the standardization of parallel software through MPI and OpenMP, parallel computing is now widespread and available to everyone. Monte Carlo codes for particle transport are especially well-positioned to take advantage of accessible parallel computing, due to the inherently parallel nature of the computational algorithm. We review Monte Carlo particle parallelism, including the basic algorithm, load-balancing, fault tolerance, and scaling, using MCNP5 as an example. Due to memory limitations, especially on single nodes of Linux clusters, domain decomposition has been tried, with partial success. We conclude with a new scheme, data decomposition, which holds promise for very large problems.
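
The "inherently parallel" structure mentioned in this description can be sketched as follows. This is a toy illustration, not MCNP5's algorithm: the slab problem, the function names, and the `base_seed + rank` seeding are assumptions made for the example, and the workers run one after another here rather than on separate processes.

```python
import math
import random


def run_histories(n_histories, seed, sigma_t=1.0, thickness=2.0):
    """Count particles that cross a purely absorbing slab without colliding."""
    rng = random.Random(seed)  # one stream per worker (simplified seeding)
    transmitted = 0
    for _ in range(n_histories):
        # Sample the distance to the first collision from an exponential law.
        distance = -math.log(1.0 - rng.random()) / sigma_t
        if distance > thickness:
            transmitted += 1
    return transmitted


def parallel_estimate(total_histories, n_workers, base_seed=12345):
    """Split independent histories across workers, then reduce their tallies."""
    share, extra = divmod(total_histories, n_workers)
    counts = [share + (1 if rank < extra else 0) for rank in range(n_workers)]
    # Each call below could run on its own process; no worker needs data
    # from any other until the final sum.
    partial = [run_histories(n, base_seed + rank) for rank, n in enumerate(counts)]
    return sum(partial) / total_histories


estimate = parallel_estimate(200_000, n_workers=8)
exact = math.exp(-2.0)  # analytic transmission probability for this slab
```

The only communication is the final sum, which is why this scheme parallelizes so easily. It also shows the memory limitation the description raises: every worker needs the full problem description, here just `sigma_t` and `thickness`, but in a real model the complete geometry and cross-section data.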

Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures

Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures PDF Author: Paul Kollath Romano
Publisher:
ISBN:
Category :
Languages : en
Pages : 199

Book Description
Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes - in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency. The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters.
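
One way to picture a nearest-neighbor fission bank exchange is the sketch below. It is an illustration of the general idea only, not the OpenMC implementation: the function name `redistribution_plan` and the example counts are hypothetical, and it assumes the global bank is kept in rank order so that each process should end up with one contiguous, equal-sized slice.

```python
def redistribution_plan(counts):
    """Return (source, destination, n_sites) transfers that even out `counts`.

    Sites keep their global order: rank r ends up owning the r-th contiguous
    slice of the global fission bank.
    """
    n_ranks = len(counts)
    total = sum(counts)
    # Target slice boundaries; the first `total % n_ranks` ranks get one extra site.
    base, extra = divmod(total, n_ranks)
    bounds = [0]
    for rank in range(n_ranks):
        bounds.append(bounds[-1] + base + (1 if rank < extra else 0))

    transfers = []
    start = 0  # global index of the first site currently held by `src`
    for src, count in enumerate(counts):
        end = start + count
        for dst in range(n_ranks):
            # Number of src's sites that fall inside dst's target slice.
            overlap = min(end, bounds[dst + 1]) - max(start, bounds[dst])
            if overlap > 0 and dst != src:
                transfers.append((src, dst, overlap))
        start = end
    return transfers


# Four ranks holding slightly uneven numbers of fission sites after a generation.
plan = redistribution_plan([12, 9, 8, 11])
```

When the per-rank counts fluctuate only slightly around their mean, as in this example, every transfer in the plan goes to an adjacent rank and moves only a few sites, in contrast to schemes that gather the whole bank on one process.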

Challenges of Monte Carlo Transport

Challenges of Monte Carlo Transport PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 49

Book Description
These are slides from a presentation for Parallel Summer School at Los Alamos National Laboratory. Solving discretized partial differential equations (PDEs) of interest can require a large number of computations. We can identify concurrency to allow parallel solution of discrete PDEs. Simulated particle histories can be used to solve the Boltzmann transport equation. Particle histories are independent in neutral particle transport, making them amenable to parallel computation. Physical parameters and method type determine the data dependencies of particle histories. Data requirements shape parallel algorithms for Monte Carlo. Then, Parallel Computational Physics and Parallel Monte Carlo are discussed and finally the results are given. The mesh passing method greatly simplifies the IMC implementation and allows simple load-balancing. Using MPI windows and passive, one-sided RMA further simplifies the implementation by removing target synchronization. The author is very interested in implementations of PGAS that may allow further optimization for one-sided, read-only memory access (e.g. OpenSHMEM). The MPICH_RMA_OVER_DMAPP option and library is required to make one-sided messaging scale on Trinitite; Moonlight scales poorly. Interconnect-specific libraries or functions are likely necessary to ensure performance. BRANSON has been used to directly compare the current standard method to a proposed method on idealized problems. The mesh passing algorithm performs well on problems that are designed to show the scalability of the particle passing method. BRANSON can now run load-imbalanced, dynamic problems. Potential avenues of improvement in the mesh passing algorithm will be implemented and explored. A suite of test problems that stress DD methods will elucidate a possible path forward for production codes.

Parallel Processing Monte Carlo Radiation Transport Codes

Parallel Processing Monte Carlo Radiation Transport Codes PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 7

Book Description
Issues related to distributed-memory multiprocessing as applied to Monte Carlo radiation transport are discussed. Measurements of communication overhead are presented for the radiation transport code MCNP which employs the communication software package PVM, and average efficiency curves are provided for a homogeneous virtual machine.

Design, Implementation and Optimization of a Parallel Monte Carlo Particle Transport Code

Design, Implementation and Optimization of a Parallel Monte Carlo Particle Transport Code PDF Author: J. M. Taylor
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description


Scalable Domain Decomposed Monte Carlo Particle Transport

Scalable Domain Decomposed Monte Carlo Particle Transport PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 190

Book Description
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.