Here are some resources to be used in conjunction with the SDSC Summer Institute hands-on session on Parallel Options for R.

The most recent pre-release slides for this talk
(Last updated Thursday at 9:04 AM)

Accompanying Guides

Example Code

I've put a tarball on Gordon which contains all of the examples I am covering in my talk on Thursday. You can extract them to your home directory by issuing the following commands:

$ cd
$ tar zxvf /home/diag/SI2013-R/parallel_r.tar.gz
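If you would rather see what is in the tarball before unpacking it, tar's t flag lists the contents without extracting anything. The sketch below wraps that in a small helper that prints only the R scripts and job scripts. The name list_scripts is made up for illustration and is not part of the workshop materials.

```shell
# list_scripts: hypothetical helper, not shipped with the workshop tarball.
# Prints the paths of the *.R and *.qsub files inside a gzipped tarball
# without extracting it (z = gunzip, t = list, f = archive file).
list_scripts() {
    tar ztf "$1" | grep -E '\.(R|qsub)$'
}
```

On Gordon you would call it as list_scripts /home/diag/SI2013-R/parallel_r.tar.gz, using the same path as the extraction command above.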

You will end up with four directories containing sample R scripts (*.R) and the job scripts needed to submit them (*.qsub).

You should be able to simply cd into the appropriate directory and do something like qsub wordcount-rhipe.qsub to run the job. No modification should be necessary.
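As a rough sketch of that cd-then-qsub step, the helper below checks that the job script exists and then submits it from inside its own directory. The name submit_example is hypothetical, and the ~/rhipe location in the usage note assumes the tarball was extracted into your home directory as described above.

```shell
# submit_example: hypothetical helper, not part of the workshop materials.
# Usage: submit_example DIRECTORY JOBSCRIPT
submit_example() {
    dir=$1
    script=$2
    # Refuse to submit a job script that is not there.
    if [ ! -f "$dir/$script" ]; then
        echo "no such job script: $dir/$script" >&2
        return 1
    fi
    # Use a subshell so the caller's working directory is left unchanged;
    # qsub is run from inside the example directory, as described above.
    ( cd "$dir" && qsub "$script" )
}
```

For example, submit_example ~/rhipe wordcount-rhipe.qsub should print the job ID that qsub returns, and qstat -u $USER will then show whether the job is queued or running.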

Fun fact: you can compare some of these files side-by-side to see how closely RHadoop resembles (or differs from) RHIPE:

$ vimdiff rhipe/wordcount-rhipe.R rhadoop/wordcount-rhadoop.R

Here are download links for the scripts and the text of Moby Dick used in the Hadoop examples. You can copy+paste the kmeans examples straight from GitHub into your laptop's R installation to try out the multicore examples.