Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access PDF Download

Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access

Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
One forthcoming challenge in high-performance computing is the ability to run large-scale problems while coping with less memory per compute node. In this work, the authors investigate a novel data decomposition method that would allow Monte Carlo transport calculations to be performed on systems with limited memory per compute node. In this method, each compute node remotely retrieves a small set of geometry and cross-section data as needed and remotely accumulates local tallies when crossing the boundary of the local spatial domain. Initial results demonstrate that while the method does allow large problems to be run in a memory-limited environment, achieving scalability may be difficult due to inefficiencies in the current implementation of remote memory access (RMA) operations.
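
The retrieve-on-demand and accumulate-on-crossing pattern described above can be illustrated with a small single-process sketch. This is not the authors' implementation and uses no MPI: `RemoteStore` and `ComputeNode` are hypothetical names, the "remote" memory is an ordinary dictionary, and the least-recently-used eviction policy is an assumption made only to keep the local cache small.

```python
from collections import OrderedDict


class RemoteStore:
    """Stand-in for data held in other nodes' memory (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block_id -> geometry/cross-section data
        self.tallies = {}           # block_id -> accumulated score
        self.n_gets = 0             # number of simulated remote reads
        self.n_accumulates = 0      # number of simulated remote updates

    def get(self, block_id):
        self.n_gets += 1
        return self.blocks[block_id]

    def accumulate(self, block_id, score):
        self.n_accumulates += 1
        self.tallies[block_id] = self.tallies.get(block_id, 0.0) + score


class ComputeNode:
    """Holds at most `capacity` data blocks plus one local tally buffer."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self.cache = OrderedDict()  # least recently used block comes first
        self.domain = None          # spatial domain currently being tallied
        self.local_tally = 0.0

    def data_for(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
            self.cache[block_id] = self.store.get(block_id)
        return self.cache[block_id]

    def score(self, block_id, value):
        # Crossing into a different domain flushes the buffered tally first.
        if self.domain is not None and block_id != self.domain:
            self.flush()
        self.domain = block_id
        self.local_tally += value

    def flush(self):
        if self.domain is not None and self.local_tally != 0.0:
            self.store.accumulate(self.domain, self.local_tally)
        self.local_tally = 0.0


# Example: one particle track visiting domains 0, 0, 1, 0, 2.
store = RemoteStore({0: "xs-A", 1: "xs-B", 2: "xs-C"})
node = ComputeNode(store, capacity=2)
for block, value in [(0, 1.0), (0, 0.5), (1, 2.0), (0, 1.0), (2, 3.0)]:
    node.data_for(block)
    node.score(block, value)
node.flush()
```

In this toy run the node never holds more than two data blocks, and tally updates reach the store only when the track changes domain or at the final flush. In a real code the `get` and `accumulate` calls would correspond to one-sided communication, whose cost is what the description above identifies as the obstacle to scalability.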

Scalable Domain Decomposed Monte Carlo Particle Transport

Scalable Domain Decomposed Monte Carlo Particle Transport PDF Author: Matthew Joseph O'Brien
Publisher:
ISBN: 9781321212600
Category :
Languages : en
Pages :

Book Description
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are:
* Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node.
* Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently.
* Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain.
* Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition.
These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.

Design, Implementation and Optimization of a Parallel Monte Carlo Particle Transport Code

Design, Implementation and Optimization of a Parallel Monte Carlo Particle Transport Code PDF Author: J. M. Taylor
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description


Parallel Processing Monte Carlo Radiation Transport Codes

Parallel Processing Monte Carlo Radiation Transport Codes PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 7

Book Description
Issues related to distributed-memory multiprocessing as applied to Monte Carlo radiation transport are discussed. Measurements of communication overhead are presented for the radiation transport code MCNP which employs the communication software package PVM, and average efficiency curves are provided for a homogeneous virtual machine.

Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures

Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures PDF Author: Paul Kollath Romano
Publisher:
ISBN:
Category :
Languages : en
Pages : 199

Book Description
Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes - in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency.
The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, and implemented/tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters.
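
The idea behind a nearest-neighbor fission bank exchange can be sketched as follows. This is a serial illustration under stated assumptions, not the OpenMC implementation: the function name `plan_fission_bank_transfers` is hypothetical, the global bank is assumed to be the rank-ordered concatenation of local banks, and the target is assumed to be an even split in the same order. Each rank then only ships the sites that fall outside its own target index range.

```python
from itertools import accumulate


def plan_fission_bank_transfers(bank_sizes):
    """Return (source_rank, dest_rank, n_sites) transfers.

    bank_sizes[i] is the number of fission sites rank i holds; the ordered
    global bank is the concatenation of the local banks in rank order.
    """
    n_ranks = len(bank_sizes)
    total = sum(bank_sizes)
    # Global index where each rank's current sites begin (prefix sum).
    have_start = [0] + list(accumulate(bank_sizes))
    # Target ownership: as even a split as possible, also in rank order.
    base, extra = divmod(total, n_ranks)
    want_sizes = [base + (1 if r < extra else 0) for r in range(n_ranks)]
    want_start = [0] + list(accumulate(want_sizes))

    transfers = []
    for src in range(n_ranks):
        for dst in range(n_ranks):
            if src == dst:
                continue
            # Overlap between what src holds and what dst should own.
            lo = max(have_start[src], want_start[dst])
            hi = min(have_start[src + 1], want_start[dst + 1])
            if hi > lo:
                transfers.append((src, dst, hi - lo))
    return transfers


# Four ranks whose banks fluctuate slightly around 10 sites each.
transfers = plan_fission_bank_transfers([12, 8, 9, 11])
# transfers == [(0, 1, 2), (3, 2, 1)]
```

Because per-rank bank sizes fluctuate only statistically around the mean, the overlaps are small and land on adjacent ranks, so only a few sites move and no rank has to gather the whole bank. The double loop is for readability; a distributed version would compute the prefix sum with a scan and inspect only nearby ranks.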

High Performance Computing & Monte Carlo

High Performance Computing & Monte Carlo PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 2

Book Description
High performance computing (HPC), used for the most demanding computational problems, has evolved from single processor custom systems in the 1960s and 1970s, to vector processors in the 1980s, to parallel processors in the 1990s, to clusters of commodity processors in the 2000s. Performance/price has increased by a factor of more than 1 million over that time, so that today's desktop PC is more powerful than yesterday's supercomputer. With the introduction of inexpensive Linux clusters and the standardization of parallel software through MPI and OpenMP, parallel computing is now widespread and available to everyone. Monte Carlo codes for particle transport are especially well-positioned to take advantage of accessible parallel computing, due to the inherently parallel nature of the computational algorithm. We review Monte Carlo particle parallelism, including the basic algorithm, load-balancing, fault tolerance, and scaling, using MCNP5 as an example. Due to memory limitations, especially on single nodes of Linux clusters, domain decomposition has been tried, with partial success. We conclude with a new scheme, data decomposition, which holds promise for very large problems.
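
The inherently parallel structure mentioned above can be shown with a toy problem. The sketch below is not taken from MCNP5; it assumes a purely absorbing slab, hypothetical function names, and per-worker seeds as a simple stand-in for independent random number streams. Histories are divided among workers, each worker tallies on its own, and the partial tallies are summed at the end.

```python
import math
import random


def run_histories(n_histories, seed, sigma_t=1.0, thickness=1.0):
    """Count particles that cross the slab without colliding."""
    rng = random.Random(seed)  # each worker gets its own generator
    transmitted = 0
    for _ in range(n_histories):
        # Sample the distance to first collision from an exponential law.
        distance = -math.log(1.0 - rng.random()) / sigma_t
        if distance > thickness:
            transmitted += 1
    return transmitted


def parallel_estimate(total_histories, n_workers, base_seed=12345):
    """Split histories across workers and combine their tallies."""
    base, extra = divmod(total_histories, n_workers)
    counts = [base + (1 if w < extra else 0) for w in range(n_workers)]
    # The workers run one after another here; since they share no state,
    # each call could be dispatched to a separate process or MPI rank.
    partial = [run_histories(n, base_seed + w) for w, n in enumerate(counts)]
    return sum(partial) / total_histories


estimate = parallel_estimate(40000, n_workers=4)
exact = math.exp(-1.0)  # analytic uncollided transmission for this slab
```

The only communication in this scheme is the final sum of partial tallies, which is why history-based parallelism scales well until per-node memory becomes the limiting factor.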

Monte Carlo Methods for Particle Transport

Monte Carlo Methods for Particle Transport PDF Author: Alireza Haghighat
Publisher:
ISBN: 9781523107568
Category : Monte Carlo method
Languages : en
Pages : 273

Book Description
The Monte Carlo method has become the de facto standard in radiation transport. Although powerful, if not understood and used appropriately, the method can give misleading results. Monte Carlo Methods for Particle Transport teaches appropriate use of the Monte Carlo method, explaining the method's fundamental concepts as well as its limitations. Concise yet comprehensive, this well-organized text: introduces the particle importance equation and its use for variance reduction; describes general and particle-transport-specific variance reduction techniques; presents particle transport eigenvalue issues and methodologies to address these issues; explores advanced formulations based on the author's research activities; discusses parallel processing concepts and factors affecting parallel performance.

Massively Parallel Algorithms for Method of Characteristics Neutral Particle Transport on Shared Memory Computer Architectures

Massively Parallel Algorithms for Method of Characteristics Neutral Particle Transport on Shared Memory Computer Architectures PDF Author: William Robert Dawson Boyd (III.)
Publisher:
ISBN:
Category :
Languages : en
Pages : 203

Book Description
Over the past 20 years, parallel computing has enabled computers to grow ever larger and more powerful while scientific applications have advanced in sophistication and resolution. This trend is being challenged, however, as the power consumption for conventional parallel computing architectures has risen to unsustainable levels and memory limitations have come to dominate compute performance. Multi-core processors and heterogeneous computing platforms, such as Graphics Processing Units (GPUs), are an increasingly popular paradigm for resolving these issues. This thesis explores the applicability of shared memory parallel platforms for solving deterministic neutron transport problems. A 2D method of characteristics code - OpenMOC - has been developed with solvers for shared memory multi-core platforms as well as GPUs. The multi-threading and memory locality methodologies for the multi-core CPU and GPU solvers are presented. Parallel scaling results using OpenMP demonstrate better than ideal weak scaling and nearly perfect strong scaling on both Intel Xeon and IBM Blue Gene/Q architectures. Performance results for the 2D C5G7 benchmark demonstrate up to 50x speedup for MOC on a GPU. The lessons learned from this thesis will provide the basis for further exploration of MOC on many-core platforms and GPUs as well as design decisions for hardware vendors exploring technologies for the next generation of machines for scientific computing.

Scalable Domain Decomposed Monte Carlo Particle Transport

Scalable Domain Decomposed Monte Carlo Particle Transport PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 190

Book Description
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.

Challenges of Monte Carlo Transport

Challenges of Monte Carlo Transport PDF Author:
Publisher:
ISBN:
Category :
Languages : en
Pages : 49

Book Description
These are slides from a presentation for the Parallel Summer School at Los Alamos National Laboratory. Solving discretized partial differential equations (PDEs) of interest can require a large number of computations. We can identify concurrency to allow parallel solution of discrete PDEs. Simulated particle histories can be used to solve the Boltzmann transport equation. Particle histories are independent in neutral particle transport, making them amenable to parallel computation. Physical parameters and method type determine the data dependencies of particle histories. Data requirements shape parallel algorithms for Monte Carlo. Then, Parallel Computational Physics and Parallel Monte Carlo are discussed and finally the results are given. The mesh passing method greatly simplifies the IMC implementation and allows simple load-balancing. Using MPI windows and passive, one-sided RMA further simplifies the implementation by removing target synchronization. The author is very interested in implementations of PGAS that may allow further optimization for one-sided, read-only memory access (e.g. OpenSHMEM). The MPICH_RMA_OVER_DMAPP option and library are required to make one-sided messaging scale on Trinitite; Moonlight scales poorly. Interconnect-specific libraries or functions are likely necessary to ensure performance. BRANSON has been used to directly compare the current standard method to a proposed method on idealized problems. The mesh passing algorithm performs well on problems that are designed to show the scalability of the particle passing method. BRANSON can now run load-imbalanced, dynamic problems. Potential avenues of improvement in the mesh passing algorithm will be implemented and explored. A suite of test problems that stress DD methods will elucidate a possible path forward for production codes.