Multi-Core Parallel Computing with ISIS
ISIS parallelizes computations by distributing them across multiple CPU cores in a way that requires no additional effort from the user, introduces no additional software dependencies, and remains compatible with legacy code.
Along with the built-in support for parallel fitting, a simple interface is provided so that users can write parallel scripts.
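As a rough illustration of the general idea behind such scripts, the sketch below splits a list of independent tasks among forked worker processes and collects the results in the master process. It is written in Python rather than S-Lang and does not use or reproduce the ISIS interface; the names run_in_parallel and toy_statistic are hypothetical, and the approach (os.fork plus one pipe per worker, Unix only) is an assumption chosen for brevity.

```python
import os
import pickle


def run_in_parallel(func, tasks, num_workers):
    """Apply func to every task using forked worker processes (Unix only).

    Hypothetical helper for illustration; not part of ISIS.
    """
    num_workers = max(1, min(num_workers, len(tasks)))
    workers = []
    for w in range(num_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Worker: handle every num_workers-th task, starting at index w.
            os.close(read_fd)
            status = 0
            try:
                share = [(i, func(tasks[i]))
                         for i in range(w, len(tasks), num_workers)]
                with os.fdopen(write_fd, "wb") as pipe:
                    pickle.dump(share, pipe)
            except BaseException:
                status = 1
            finally:
                os._exit(status)  # never fall through into the master's code
        # Master: keep only the read end of this worker's pipe.
        os.close(write_fd)
        workers.append((pid, read_fd))

    results = [None] * len(tasks)
    failed = []
    for pid, read_fd in workers:
        with os.fdopen(read_fd, "rb") as pipe:
            payload = pipe.read()  # returns once the worker closes its end
        _, status = os.waitpid(pid, 0)
        if status != 0:
            failed.append(pid)
            continue
        for i, value in pickle.loads(payload):
            results[i] = value
    if failed:
        raise RuntimeError(f"worker processes failed: {failed}")
    return results


def toy_statistic(point):
    """Stand-in for an expensive, independent calculation."""
    x, y = point
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


grid = [(0.5 * i, 0.5 * j) for i in range(4) for j in range(4)]
parallel = run_in_parallel(toy_statistic, grid, num_workers=4)
serial = [toy_statistic(p) for p in grid]
print(parallel == serial)
```

Because each task is independent, the workers never need to communicate with each other, only with the master. That is what makes this kind of calculation easy to spread across cores.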
For more realistic applications of multi-core parallelism, look in the ISIS source code distribution at the implementation of the parallel Levenberg-Marquardt optimizer, plm, in the file share/plm.sl, or at the implementation of conf_loop, in the file share/conf_loop.sl.
The degree of parallelism that can be achieved with this method is effectively limited by the number of compute cores available in a single machine. Larger scale parallelism can be achieved using the PVM module, but the effort required to set up and run such a calculation can be non-trivial.
ISIS automatically performs the following tasks in parallel, using all available cpus:
Here are a few usage examples:
    % Select a parallel fit method,
    % then perform a fit, using all available cores:
    isis> set_fit_method ("plm");
    isis> fit_counts;

    % Select a serial fit method, then compute single-parameter
    % confidence limits for parameters 2,4,7 and 10.
    % The necessary fits will be performed in parallel
    % on all available cores:
    isis> set_fit_method ("lmdif");
    isis> (pmin, pmax) = conf_loop ([2,4,7,10]);

    % Repeat the conf_loop calculation using only 2 slave
    % processes, and running the slaves at reduced priority:
    isis> (pmin, pmax) = conf_loop ([2,4,7,10]; num_slaves=2, nice=10);

To demonstrate the speed-up obtained by parallel processing, we computed 2D confidence contour maps of size NxN, with N=4,8,12,...64, for a 3 parameter fit to a single dataset with 8192 bins. These maps were first generated using a serial algorithm and then again, using a parallel algorithm, with 4 slave processes on a machine with 4 compute cores (two cpus, each a 2.6 GHz Dual-Core AMD Opteron(tm) Processor 2218).
The figure below shows the CPU time required to generate each NxN confidence contour map, with the serial times in black and the parallel times in red. For this calculation, on this machine, the parallel computation is roughly 3x faster than the corresponding serial computation:
Note that the speed-up obtained by parallel processing is somewhat problem-dependent and also depends on the available computing resources, both memory and CPU.
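To put the quoted figure in context, a speed-up of roughly 3x on 4 cores corresponds to a parallel efficiency of about 75%. The short Python sketch below shows the arithmetic. The function names and the timings are hypothetical, chosen only to reproduce the 3x ratio; they are not measured values from the benchmark above.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_cores):
    """Speed-up divided by the number of cores used (1.0 is ideal scaling)."""
    return speedup(t_serial, t_parallel) / num_cores


# Hypothetical timings in seconds, picked to give a 3x ratio.
t_serial, t_parallel = 120.0, 40.0
print(speedup(t_serial, t_parallel))
print(efficiency(t_serial, t_parallel, num_cores=4))
```

An efficiency below 1.0 is expected in practice, since process start-up, communication, and any serial portion of the calculation all take time that does not shrink as cores are added.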
This page is maintained by John C. Houck. Last updated: Apr 20, 2018