PVM.rapply implements a preliminary version of a parallel apply function. It divides a matrix up by rows, sends the function to apply and the submatrices to slave tasks, and collects the results at the end. It is assumed that the slave script knows how to evaluate the function and returns a scalar for each row.
PVM.rapply <-
    function(X, FUN = mean, NTASK = 1) {
        ## arbitrary integers tag message intent
        WORKTAG <- 22
        RESULTAG <- 33
        end <- nrow(X)
        chunk <- end %/% NTASK + 1
        start <- 1
        ## Register process with pvm daemon
        mytid <- .PVM.mytid()
        ## Spawn R slave tasks
        children <- .PVM.spawnR(ntask = NTASK,
                                slave = "slapply")
        ## One might check if spawning successful,
        ## i.e. entries of children >= 0 ...
        ## If OK then deliver jobs
        for (id in 1:length(children)) {
            ## for each child, initialize message
            ## buffer for sending
            .PVM.initsend()
            ## Divide the work evenly (simple-minded)
            range <- c(start,
                       ifelse((start + chunk - 1) > end,
                              end, start + chunk - 1))
            ## Take a submatrix
            work <- X[(range[1]):(range[2]), ,
                      drop = FALSE]
            start <- start + chunk
            ## Pack function name as a string
            .PVM.pkstr(deparse(substitute(FUN)))
            ## Id identifies the order of the job
            .PVM.pkint(id)
            ## Pack submatrix
            .PVM.pkdblmat(work)
            ## Send work
            .PVM.send(children[id], WORKTAG)
        }
        ## Receive any outstanding result
        ## (vector of doubles) from each child
        partial.results <- list()
        for (child in children) {
            ## Get message of type result from
            ## any child
            .PVM.recv(-1, RESULTAG)
            order <- .PVM.upkint()
            ## unpack result and restore the order
            partial.results[[order]] <-
                .PVM.upkdblvec()
        }
        ## unregister from pvm
        .PVM.exit()
        return(unlist(partial.results))
    }
The corresponding slave script ‘slapply.R’ is:
WORKTAG <- 22; RESULTAG <- 33
## Get parent task id and register
myparent <- .PVM.parent()
## Receive work from parent (a matrix)
buf <- .PVM.recv(myparent, WORKTAG)
## Get function to apply
func <- .PVM.upkstr()
## Unpack data (order, partial.work)
order <- .PVM.upkint()
partial.work <- .PVM.upkdblmat()
## actual computation, using apply
partial.result <- apply(partial.work,1,func)
## initialize send buffer
.PVM.initsend()
## pack order and partial.result
.PVM.pkint(order)
.PVM.pkdblvec(partial.result)
## send result back to the parent, then
## unregister from pvm
.PVM.send(myparent, RESULTAG)
.PVM.exit()
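Assuming the pvm daemon is running and the slave script is installed where .PVM.spawnR can find it, a call to PVM.rapply might look like the following sketch (the matrix and task count here are made up for illustration):

## row means of a 100 x 4 matrix, split over 4 slaves
X <- matrix(rnorm(400), nrow = 100, ncol = 4)
PVM.rapply(X, FUN = mean, NTASK = 4)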
For parallel Monte Carlo, we need reliable parallel random number generators. The requirement of reproducibility, and hence validation of quality, is important. It isn't clear that selecting different starting seeds for each node will guarantee good randomness properties. The Scalable Parallel Random Number Generators (SPRNG, http://sprng.cs.fsu.edu/) library is one possible candidate. We are working toward incorporating SPRNG into rpvm by providing some wrapper functions as well as utilizing existing R functions to generate random numbers from different distributions.
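To make the concern concrete, the naive per-node seeding just mentioned might look like the sketch below, reusing the children vector from PVM.rapply; SEEDTAG is a hypothetical message tag, not part of rpvm, and nothing here guarantees that the resulting streams are independent:

SEEDTAG <- 44  ## hypothetical tag, like WORKTAG above
for (id in 1:length(children)) {
    .PVM.initsend()
    ## deterministic but essentially arbitrary seed
    .PVM.pkint(id * 1000)
    .PVM.send(children[id], SEEDTAG)
}
## each slave would then call
## set.seed(.PVM.upkint())
## after .PVM.recv(myparent, SEEDTAG)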
Another challenging problem is to pass higher-level R objects through PVM. Because internal data formats may vary across different hosts in the network, simply sending in binary form may not work. Conversion to characters (serialization) appears to be the best solution, but there is non-trivial overhead for packing and then sending complicated and/or large objects. This is similar to the problem of reading in data from files and determining proper data types.
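As a rough sketch of the character-based approach, an R object can be deparsed to text, packed with .PVM.pkstr, and rebuilt on the receiving side; this assumes the object deparses faithfully, and the repeated deparsing and parsing is exactly the packing overhead noted above:

## sender: encode the object as text
obj <- list(coef = c(1.2, -0.5), label = "fit")
.PVM.initsend()
.PVM.pkstr(paste(deparse(obj), collapse = "\n"))
.PVM.send(children[1], WORKTAG)
## receiver: parse the text to restore the object
## buf <- .PVM.recv(myparent, WORKTAG)
## obj2 <- eval(parse(text = .PVM.upkstr()))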
Another future issue is to deploy rpvm on Microsoft Windows workstations. Both PVM and R are available under Microsoft Windows, and this is one solution for using additional compute cycles in Microsoft Windows environments.