snowFT-cluster {snowFT} | R Documentation |
Functions that extend the collection of cluster-level functions of the snow package while providing fault tolerance, reproducibility and additional management features. The heart of the package is the function performParallel.
performParallel(count, x, fun, initfun = NULL, exitfun = NULL,
                printfun = NULL, printargs = NULL,
                printrepl = max(length(x)/10,1),
                cltype = getClusterOption("type"), cluster.args = NULL,
                gentype = "RNGstream", seed = sample(1:9999999,6),
                prngkind = "default", para = 0,
                mngtfiles = c(".clustersize",".proc",".proc_fail"),
                ft_verbose = FALSE, ...)

clusterApplyFT(cl, x, fun, initfun = NULL, exitfun = NULL,
               printfun = NULL, printargs = NULL,
               printrepl = max(length(x)/10,1),
               gentype = "None", seed = rep(123456,6),
               prngkind = "default", para = 0,
               mngtfiles = c(".clustersize",".proc",".proc_fail"),
               ft_verbose = FALSE, ...)

clusterCallpart(cl, nodes, fun, ...)

clusterEvalQpart(cl, nodes, expr)

printClusterInfo(cl)
count |
Number of cluster nodes. If count=0, fun is invoked sequentially. |
cl |
Cluster object. |
x |
Vector of values to be passed to function fun. |
fun |
Function or character string naming a function. |
initfun |
Function or character string naming a function with no arguments that is to be called on each node prior to the computation. It can be used for example for loading required libraries. |
exitfun |
Function or character string naming a function with no arguments that is to be called on each node after the computation is completed. |
printfun, printargs, printrepl |
|
cltype |
Character string that specifies cluster type (see
|
cluster.args |
List of arguments passed to the function |
gentype |
Character string that specifies the type of the random number generator (RNG).
Possible values: "RNGstream" (L'Ecuyer's RNG),
"SPRNG", or "None", see
|
seed, prngkind, para |
Seed, kind and parameters for the RNG (see
|
mngtfiles |
A character vector of length 3 containing names of
management files: |
ft_verbose |
If TRUE, debugging messages are sent to standard output. |
nodes |
Indices of cluster nodes. |
expr |
Expression to evaluate. |
... |
Additional arguments to pass to function fun. |
clusterApplyFT is a fault-tolerant version of clusterApplyLB of the snow package with additional features, such as reproducible results, computation transparency and dynamic cluster resizing. While it waits for results, the master process searches for failed nodes. If failures are detected, the cluster is repaired. All failed computations are restarted (in three additional runs) after the replication loop is finished, and hence the user should not notice any interruptions.
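The restart behaviour described above can be pictured with a small sequential sketch. This is not snowFT's implementation: run_with_retries and flaky are hypothetical names, and an ordinary R error stands in for a node failure. It only illustrates the idea of finishing the main loop first and then re-running the failed replicates in at most three further passes.

```r
# Hypothetical sequential stand-in for the restart logic; in snowFT the
# failures are failed cluster nodes, here they are simulated R errors.
run_with_retries <- function(x, fun, max_extra_runs = 3) {
  results <- vector("list", length(x))
  pending <- seq_along(x)
  for (run in 0:max_extra_runs) {      # run 0 is the main replication loop
    failed <- integer(0)
    for (i in pending) {
      out <- tryCatch(fun(x[[i]]), error = function(e) e)
      if (inherits(out, "error")) {
        failed <- c(failed, i)         # remember it, restart in a later run
      } else {
        results[i] <- list(out)        # list() keeps NULL results in place
      }
    }
    pending <- failed
    if (length(pending) == 0) break
  }
  list(results = results, failed = pending)
}

# Demo: even inputs fail on their first attempt only.
attempts <- new.env()
flaky <- function(v) {
  key <- as.character(v)
  n <- get0(key, envir = attempts, inherits = FALSE, ifnotfound = 0)
  assign(key, n + 1, envir = attempts)
  if (v %% 2 == 0 && n == 0) stop("simulated node failure")
  v^2
}

out <- run_with_retries(1:6, flaky)
print(unlist(out$results))
```

Results are stored by replicate index, so a restarted replicate ends up in the same slot it would have occupied without the failure.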
The file mngtfiles[1] (which defaults to ‘.clustersize’) is written by the master prior to the computation and contains a single integer value corresponding to the number of cluster nodes. The user can then change the value arbitrarily, as long as the file keeps the same format. The master reads the file while it waits for results. If the value in the file is larger than the current cluster size, new nodes are created and the computation is extended to them. If the value is smaller, nodes are successively discarded after they finish their current computation.
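As a rough illustration of the resizing protocol, the sketch below writes a size file, changes it the way a user might, and derives the action the master would take. The helpers read_requested_size and resize_action are hypothetical and not part of snowFT. The file is created in a temporary directory, and the one-integer-per-file layout written with writeLines is an assumption about the exact format.

```r
# Hypothetical helpers; snowFT performs this check internally.
size_file <- file.path(tempdir(), ".clustersize")

# What the master is described as writing for a 5-node cluster
# (exact on-disk format assumed to be a single integer on one line).
writeLines("5", size_file)

read_requested_size <- function(path) {
  as.integer(readLines(path, n = 1))
}

resize_action <- function(current, requested) {
  if (requested > current) {
    "grow"      # create new nodes and extend the computation to them
  } else if (requested < current) {
    "shrink"    # discard nodes once they finish their current computation
  } else {
    "keep"
  }
}

# The user asks for 8 nodes while the computation is running.
writeLines("8", size_file)
action <- resize_action(5, read_requested_size(size_file))
print(action)
```

In a real run the file lives in the master's working directory under the name given by mngtfiles[1], and it can be edited from a shell or text editor while the computation proceeds.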
In clusterApplyFT, the arguments initfun and exitfun are used only when the cluster changes, i.e. when nodes are added to or removed from the cluster.
The RNG uses the scheme 'one stream per replicate', in contrast to the 'one stream per node' scheme used by clusterApplyLB. With each replicate, the RNG is therefore reset to the corresponding stream (identified by the replicate number), so the final results are reproducible.
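The 'one stream per replicate' idea can be sketched with base R alone. The example uses the L'Ecuyer-CMRG streams of the parallel package as a stand-in for the generators snowFT works with; make_streams and run_replicate are hypothetical helpers. Because each replicate is tied to its own stream, the results do not depend on the order in which (or the node on which) the replicates are evaluated.

```r
library(parallel)

# Hypothetical helper: derive one L'Ecuyer-CMRG stream per replicate.
# Note that this switches the session's RNG kind as a side effect.
make_streams <- function(n_rep, seed = 1) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  s <- get(".Random.seed", envir = globalenv())
  streams <- vector("list", n_rep)
  for (i in seq_len(n_rep)) {
    streams[[i]] <- s
    s <- nextRNGStream(s)   # jump ahead to the next independent stream
  }
  streams
}

# Hypothetical helper: reset the RNG to replicate i's stream, then compute.
run_replicate <- function(i, streams, n = 5) {
  assign(".Random.seed", streams[[i]], envir = globalenv())
  rnorm(n)
}

streams <- make_streams(4)

# Evaluate the replicates in two different orders.
forward  <- lapply(1:4, run_replicate, streams = streams)
backward <- rev(lapply(4:1, run_replicate, streams = streams))

same <- identical(forward, backward)
print(same)
```

With a 'one stream per node' scheme, by contrast, the numbers a replicate sees depend on which node happens to pick it up, which load balancing does not fix in advance.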
performParallel is a wrapper function for clusterApplyFT, and we recommend using it rather than calling clusterApplyFT directly. It creates a cluster of count nodes, calls initfun on all nodes and initializes the RNG. Then it calls clusterApplyFT. After the computation is finished, it calls exitfun on all nodes and stops the cluster. If count=0, function fun is invoked sequentially with the same settings (including random numbers) as it would be in parallel. This mode can be used for debugging purposes.
clusterCallpart calls a function fun with identical arguments ... on the nodes specified by indices nodes in the cluster cl and returns a list of the results.
clusterEvalQpart evaluates a literal expression on the nodes specified by indices nodes.
printClusterInfo prints out some basic information about the cluster.
clusterApplyFT returns a list of two elements. The first one is a list (of length |x|) of results, the second one is the (possibly updated) cluster object.
performParallel returns a list of results.
Hana Sevcikova
## Not run: 
# generates n normally distributed random numbers in r replicates
# on p nodes and prints their mean after each r/10 replicate.

printfun <- function(res, n, args=NULL) {
  res <- unlist(res)
  res <- res[!is.null(res)]
  print(paste("mean after:", n,"replicates:", mean(res),
              "(from",length(res),"RNs)"))
}

r<-1000; n<-100; p<-5
res <- performParallel(p, rep(n,r), fun=rnorm, gentype="RNGstream",
                       seed=rep(1,6), printfun=printfun)

# Setting p<-0 will run rnorm sequentially and should give
# exactly the same results

## End(Not run)