EGSnrc C++ class library
Report PIRS-898 (2021)
Iwan Kawrakow, Ernesto Mainegra-Hing, Frederic Tessier, Reid Townson and Blake Walters
A job control object for homogeneous computing environments (HCE).
#include <egs_run_control.h>
Public Member Functions | |
EGS_UniformRunControl (EGS_Application *app) | |
void | describeRCO () |
int | startSimulation () |
int | finishSimulation () |
Uses 'watcher' jobs to determine if the simulation has finished. | |
Public Member Functions inherited from EGS_RunControl | |
EGS_RunControl (EGS_Application *app) | |
Creates an RCO for the application app. | |
virtual | ~EGS_RunControl () |
Destructor. | |
void | setNcase (EGS_I64 n) |
Set the number of particles to be simulated to n. | |
void | setNbatch (int n) |
Set the number of batches to n. | |
void | setMaxTime (EGS_Float t) |
Set the maximum CPU time for the simulation to t. | |
void | setRequiredUncertainty (EGS_Float a) |
Set the required statistical uncertainty to a. | |
EGS_I64 | getNcase () const |
Returns the total number of particles to be simulated. | |
int | getNbatch () const |
Returns the number of batches per simulation chunk. | |
int | getNchunk () const |
Returns the number of simulation chunks. | |
virtual EGS_I64 | getNextChunk () |
Returns the number of histories to run in the next simulation chunk. | |
virtual bool | startBatch (int, EGS_I64) |
Start a new batch. | |
virtual bool | finishBatch () |
Finish a batch. | |
virtual bool | storeState (ostream &data) |
virtual bool | setState (istream &data) |
virtual bool | addState (istream &data) |
virtual void | resetCounter () |
virtual bool | getCombinedResult (double &, double &) const |
virtual EGS_I64 | getNdone () const |
virtual void | setNdone (EGS_I64 Ndone) |
virtual void | incrementNdone () |
virtual EGS_Float | getCPUTime () const |
Protected Attributes | |
int | milliseconds |
int | check_intervals |
int | njob |
int | npar |
int | ipar |
int | ifirst |
bool | check_egsdat |
bool | watcher_job |
Protected Attributes inherited from EGS_RunControl | |
EGS_Application * | app |
EGS_Input * | input |
EGS_I64 | ncase |
EGS_I64 | ndone |
EGS_Float | maxt |
EGS_Float | accu |
int | nbatch |
int | restart |
int | nchunk |
RCOType | rco_type |
RCO type to use. | |
EGS_Timer | timer |
EGS_Float | cpu_time |
EGS_Float | previous_cpu_time |
Additional Inherited Members | |
Public Types inherited from EGS_RunControl | |
enum | RCOType { simple, uniform, balanced } |
Define RCO types. | |
Static Public Member Functions inherited from EGS_RunControl | |
static EGS_RunControl * | getRunControlObject (EGS_Application *) |
Public Attributes inherited from EGS_RunControl | |
int | geomErrorCount |
int | geomErrorMax |
The uniform run control object (RCO) controls parallel job execution in computing environments (CE) with identical hardware, software and communication layer (a homogeneous CE).
It assumes that the last job finishes last. That job then waits through a number of checking intervals, 5 by default ('check_intervals' variable), each lasting 1 s by default ('milliseconds' variable). The defaults can be changed via the 'run control' input block using:
interval wait time = time in ms (default 1 s)
number of intervals = an_integer_value (default 5)
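The two settings together bound how long the waiting job holds on before combining results. The following is a minimal sketch of that arithmetic, not the library's code: `WaitSettings`, `defaultWaitSettings` and `maxWaitMs` are hypothetical names introduced here for illustration, and only the default values (1 s, 5 intervals) come from the text above.

```cpp
#include <cassert>

// Hypothetical stand-in for the two settings described above. The member
// names mirror the 'milliseconds' and 'check_intervals' variables, but this
// struct is not part of the EGSnrc library.
struct WaitSettings {
    int milliseconds;    // length of one waiting interval, in ms
    int check_intervals; // number of intervals to cycle through
};

// Defaults quoted in the text: 5 intervals of 1 s each.
inline WaitSettings defaultWaitSettings() {
    return WaitSettings{1000, 5};
}

// Upper bound on the total waiting time, in milliseconds.
inline long long maxWaitMs(const WaitSettings &s) {
    assert(s.milliseconds >= 0 && s.check_intervals >= 0);
    return static_cast<long long>(s.milliseconds) * s.check_intervals;
}
```

With the defaults the bound is 5 x 1000 ms = 5 s. Setting `interval wait time = 2000` and `number of intervals = 10` would raise it to 20 s.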
The last job combines the parallel runs by default. Since the last job could finish before some of the other jobs, users can set another job or several jobs to be 'watcher' jobs. In principle it is enough to define one 'watcher' job that waits long enough for all jobs to complete. To change the default, use the following key:
watcher jobs = job_i,..., job_j
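As an illustration of how such a comma-separated list could be turned into a per-job decision, here is a small sketch. It is not the library's parser: `parseWatcherJobs` and `isWatcherJob` are hypothetical helpers, and the rule that an explicit list replaces the default (rather than adding to it) is an assumption made for this example.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Parses a comma-separated list such as "2, 5,8" into a set of job indices.
// Tokens that do not start with an integer are skipped.
inline std::set<int> parseWatcherJobs(const std::string &value) {
    std::set<int> jobs;
    std::istringstream list(value);
    std::string token;
    while (std::getline(list, token, ',')) {
        std::istringstream field(token);
        int job;
        if (field >> job) {
            jobs.insert(job);
        }
    }
    return jobs;
}

// Decides whether thisJob acts as a watcher.
// ASSUMPTION: with no list given, only the last job watches; with a list
// given, only the listed jobs watch.
inline bool isWatcherJob(const std::set<int> &watchers, int lastJob, int thisJob) {
    if (watchers.empty()) {
        return thisJob == lastJob;
    }
    return watchers.count(thisJob) > 0;
}
```

For example, with `watcher jobs = 2, 5,8` and ten jobs, job 5 would be a watcher under these assumptions and job 10 would not.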
If requested, a run-completion check is made every cycle by testing whether the number of *.egsdat files equals the number of parallel jobs submitted. This can speed things up because the watcher does not have to wait for all checking cycles. If some jobs have failed, however, that number is never reached, and only the available *.egsdat files are combined after the checking cycles complete.
This option can be set via the 'run control' input block using:
check jobs completed = yes|no # default is 'no'
When this option is enabled, each job erases its corresponding *.egsdat file, if it exists, at the beginning of the run.
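The completion test amounts to counting files with the .egsdat extension and comparing the count with the number of jobs. The sketch below shows that comparison on a plain list of file names. The helper names are hypothetical, and how the library actually gathers the file list is not shown here. It also suggests why stale files need to be erased at startup: a leftover *.egsdat file from an earlier run would be counted as a finished job.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if name ends with the ".egsdat" extension.
inline bool hasEgsdatSuffix(const std::string &name) {
    const std::string suffix = ".egsdat";
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Counts the *.egsdat entries in a directory listing.
inline int countEgsdat(const std::vector<std::string> &names) {
    int n = 0;
    for (const std::string &name : names) {
        if (hasEgsdatSuffix(name)) {
            ++n;
        }
    }
    return n;
}

// The run counts as complete when the number of *.egsdat files equals the
// number of parallel jobs submitted.
inline bool allJobsDone(const std::vector<std::string> &names, int njob) {
    assert(njob >= 0);
    return countEgsdat(names) == njob;
}
```

The file names used with these helpers are arbitrary; only the extension matters for the count.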
Definition at line 365 of file egs_run_control.h.
virtual int EGS_UniformRunControl::finishSimulation ()
Uses 'watcher' jobs to determine if the simulation has finished.
If the current job is a 'watcher' job, it waits for some time before issuing the signal to recombine all available parallel jobs. These 'watcher' jobs can also produce intermediate results while waiting. If all jobs complete while waiting, the 'watcher' job combines all results and exits.
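The control flow described above can be sketched as a polling loop with an early exit. This is a simplified illustration under stated assumptions, not the code in egs_run_control.cpp: `watchAndCombine` and `WatcherOutcome` are hypothetical, the two callbacks stand in for whatever the application uses to count finished jobs and to combine results, and intermediate results are left out.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Summary of what the watcher did (hypothetical type for this sketch).
struct WatcherOutcome {
    int  intervals_waited; // intervals slept before combining
    bool all_done;         // true if every job had reported completion
};

// Polls jobsDone() up to check_intervals times, sleeping 'milliseconds'
// between polls, and calls combine() exactly once at the end.
inline WatcherOutcome watchAndCombine(int njob, int check_intervals, int milliseconds,
                                      const std::function<int()> &jobsDone,
                                      const std::function<void()> &combine) {
    WatcherOutcome out{0, false};
    for (int i = 0; i < check_intervals; ++i) {
        if (jobsDone() >= njob) {   // everything finished: stop waiting early
            out.all_done = true;
            break;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
        ++out.intervals_waited;
    }
    if (!out.all_done) {
        out.all_done = jobsDone() >= njob; // final check after the last sleep
    }
    combine(); // combine whatever results are available at this point
    return out;
}
```

If a job has failed, the loop runs through all intervals and `combine` still runs once on the results that exist, which matches the fallback behavior described for the class.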
Reimplemented from EGS_RunControl.
Definition at line 854 of file egs_run_control.cpp.
References EGS_Application::combinePartialResults(), egsInformation, EGS_RunControl::finishSimulation(), EGS_Application::howManyJobsDone(), and rco_sleep().