Introduction
A common request we get is to install some software package on either Gordon or Trestles, sometimes with the implicit assumption that it should be a simple matter of doing sudo apt-get install somepackage. Unfortunately, installing a new piece of software on a shared resource like Gordon or Trestles is not that easy because
- we need to make sure the software will not break another library or program on the system
- we will typically have to install it on all of the compute nodes too, not just the login nodes
- we then have to support that software (and any/all users' questions about it) since we officially provide it
- our system engineers, who can actually deploy packages, are not the same applications experts who compile the packages
The net result is that installing a new software package system-wide can take weeks or months. When I get requests to install software, I invariably respond that it would be easier and faster for the user (that's you) to just install the package themselves, and I provide step-by-step instructions on exactly how to do that.
For the sake of anyone who wants to know how to install his or her own software applications on SDSC Trestles or Gordon (or any other Linux or UNIX machine, for that matter), here are some generic guidelines on how to do this.
Python Modules
There are several ways Python will let you manage your own set of libraries, and I find the virtualenv package to be the easiest. It creates what amounts to an installation of Python that is local to your home directory, which means that any libraries you install using that special personalized Python will also install into your home directory.
First, download virtualenv from the project's website, e.g.,
$ wget --no-check-certificate "https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.2.tar.gz"
Then be sure to load the python module that you wish to clone. This is important because the python that you will run if you don't explicitly load a Python module is the system-wide default, Python 2.6.6. I recommend using the Python 2.7.x module we provide on both machines, so load that module before trying to install virtualenv:
$ module load python
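Before going further, it can help to confirm which interpreter the shell will actually run after loading the module. The following is a minimal sketch, assuming a POSIX-style shell; the helper name python_version_of is made up for illustration and is not part of any module or package mentioned here.

```shell
# Hypothetical helper: print "major.minor" for the given Python interpreter.
python_version_of() {
    "$1" -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

# Report the interpreter that is currently first on PATH, if there is one.
if command -v python >/dev/null 2>&1; then
    echo "python resolves to: $(command -v python)"
    echo "version: $(python_version_of python)"
else
    echo "no python found on PATH"
fi
```

If the reported version is still the system default rather than 2.7.x, the module was probably not loaded in the current session.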
You must now decide the location of this custom Python you want to install using virtualenv. ~/python27-gordon is a good choice, assuming you are using Python 2.7 as previously discussed. Then, unpacking and installing virtualenv is a snap:
$ tar zxvf virtualenv-1.11.2.tar.gz
virtualenv-1.11.2/
virtualenv-1.11.2/AUTHORS.txt
...
virtualenv-1.11.2/virtualenv_support/pip-1.5.2-py2.py3-none-any.whl
virtualenv-1.11.2/virtualenv_support/setuptools-2.1-py2.py3-none-any.whl
$ python virtualenv-1.11.2/virtualenv.py ~/python27-gordon
New python executable in /home/username/python27-gordon/bin/python
Installing setuptools, pip...done.
And that's all you have to do. Now whenever you want to use your custom Python installation, you will have to issue this command:
$ source ~/python27-gordon/bin/activate
(python27-gordon)$
As you may notice, it modifies your prompt to show that you are in this custom Python's "virtual environment." If you always plan on using this custom Python, you can go ahead and add the following lines to your ~/.bashrc:
module load python
VIRTUAL_ENV_DISABLE_PROMPT=1 source ~/python27-gordon/bin/activate
Note the VIRTUAL_ENV_DISABLE_PROMPT=1 preceding the "source" command--this option prevents that annoying prompt prefix that virtualenv will otherwise give you every time you log in.
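A slightly more defensive variant for ~/.bashrc is sketched below. It is only an illustration: the function name activate_custom_python is hypothetical, and the sketch assumes the virtualenv lives at the example path used in this guide. It skips activation quietly when the directory does not exist, so the same ~/.bashrc keeps working on a machine where the virtualenv has not been created yet.

```shell
# Hypothetical helper: activate a virtualenv only if it exists,
# suppressing the prompt prefix that virtualenv would otherwise add.
activate_custom_python() {
    venv_dir=$1
    if [ -f "$venv_dir/bin/activate" ]; then
        VIRTUAL_ENV_DISABLE_PROMPT=1
        . "$venv_dir/bin/activate"
    else
        echo "no virtualenv at $venv_dir; skipping" >&2
        return 1
    fi
}

# Load the python module only where the 'module' command is available.
if command -v module >/dev/null 2>&1; then
    module load python
fi

# The path is the example location used in this guide.
activate_custom_python "$HOME/python27-gordon" || true
```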
Once you've got your virtualenv activated, installing new libraries is easy using pip:
(python27-gordon)$ pip install cutadapt
Downloading/unpacking cutadapt
Downloading cutadapt-1.3.tar.gz (149kB): 149kB downloaded
Running setup.py (path:/home/username/python27-gordon/build/cutadapt/setup.py) egg_info for package cutadapt
...
Successfully installed cutadapt
Cleaning up...
(python27-gordon)$ pip install pyvcf
Downloading/unpacking pyvcf
Downloading PyVCF-0.6.4.tar.gz
...
Successfully installed pyvcf distribute setuptools
Cleaning up...
As you can see, pip automatically downloads and installs dependencies for you, making the task of managing Python libraries under your own user account on our supercomputers pretty easy.
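One easy mistake is running pip without the virtualenv activated, in which case pip tries to write to the system-wide location and fails. The check below is a rough sketch, assuming the activate script has set the VIRTUAL_ENV variable (which virtualenv does); the function name pip_in_active_venv is invented for this example.

```shell
# Hypothetical helper: succeed only if the pip found first on PATH
# lives inside the currently active virtualenv.
pip_in_active_venv() {
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    pip_path=$(command -v pip) || return 1
    case "$pip_path" in
        "$VIRTUAL_ENV"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}

if pip_in_active_venv; then
    echo "pip will install into $VIRTUAL_ENV"
else
    echo "pip is not from an active virtualenv; activate one first"
fi
```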
Perl Modules
One of the standard ways of maintaining your own Perl libraries installed into your home directory is using the local::lib module which, like Python's virtualenv, lets you emulate having Perl installed locally. To get started with local::lib you've first got to download it, then unpack it:
$ wget http://search.cpan.org/CPAN/authors/id/E/ET/ETHER/local-lib-1.008010.tar.gz
$ tar zxvf local-lib-1.008010.tar.gz
local-lib-1.008010/
local-lib-1.008010/Changes
local-lib-1.008010/inc/
...
$ cd local-lib-1.008010
Unlike with Python, we do not have a separate Perl module that needs to be loaded. Once you're in that local-lib-1.008010 directory, you can initiate the bootstrap process by which local::lib creates your custom Perl installation and installs itself. Let's assume that we want to install our custom Perl into ~/perl5-gordon (note: use $HOME instead of ~):
$ perl Makefile.PL --bootstrap=$HOME/perl5-gordon
Attempting to create directory /home/username/perl5-gordon
...
If you don't specify a path after the --bootstrap flag, your local::lib installation will be in ~/perl5. This bootstrapping process may take a very long time as CPAN needs to first configure itself, then install all of the libraries that local::lib needs to work. After a lot of text scrolls by (much of which looks like errors, which isn't necessarily bad), hopefully you wind up at
...
Checking if your kit is complete...
Looks good
Generating a GNU-style Makefile
Writing Makefile for local::lib
Writing MYMETA.yml and MYMETA.json
Then test and install local::lib:
$ make test
...
t/subroutine-in-inc.t .. ok
All tests successful.
Files=8, Tests=35, 0 wallclock secs ( 0.04 usr 0.03 sys + 0.23 cusr 0.07 csys = 0.37 CPU)
Result: PASS
$ make install
Installing /home/username/perl5-gordon/lib/perl5/POD2/PT_BR/local/lib.pod
...
Appending installation info to /home/username/perl5-gordon/lib/perl5/x86_64-linux-thread-multi/perllocal.pod
Now we need to put a few new lines in our ~/.bashrc to effectively do what that activate script does for Python's virtualenv. Issue the following command, which prints the required export lines and, through tee -a, appends them to your ~/.bashrc:
$ perl -I$HOME/perl5-gordon/lib/perl5 -Mlocal::lib=$HOME/perl5-gordon | tee -a ~/.bashrc
export PERL_LOCAL_LIB_ROOT="$PERL_LOCAL_LIB_ROOT:/home/username/perl5-gordon";
export PERL_MB_OPT="--install_base /home/username/perl5-gordon";
export PERL_MM_OPT="INSTALL_BASE=/home/username/perl5-gordon";
export PERL5LIB="/home/username/perl5-gordon/lib/perl5:$PERL5LIB";
export PATH="/home/username/perl5-gordon/bin:$PATH";
You should then either log out and log back in, or paste those export lines into your current terminal session to put them into effect. Following that, you should be able to install Perl libraries into your home directory:
$ perl -MCPAN -e 'install(Time::Piece)'
Reading '/home/username/.cpan/Metadata'
Database was generated on Mon, 16 Sep 2013 19:53:02 GMT
Running install for module 'Time::Piece'
...
RJBS/Time-Piece-1.23.tar.gz
/usr/bin/make install -- OK
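To confirm that a module really is being picked up from your home directory rather than from the system Perl, you can ask Perl where it loaded the module from. This is only a sketch: perl_module_path is a made-up helper name, and it relies on Perl's standard %INC hash, which maps each loaded module file to its path on disk.

```shell
# Hypothetical helper: print the file a Perl module was loaded from.
# Exits non-zero if the module cannot be loaded at all.
perl_module_path() {
    perl -M"$1" -e 'my $f = shift; $f =~ s{::}{/}g; print $INC{"$f.pm"}, "\n"' "$1"
}

if command -v perl >/dev/null 2>&1; then
    perl_module_path Time::Piece || echo "Time::Piece could not be loaded"
fi

# Show whether the local::lib export lines are in effect in this session.
echo "PERL5LIB=${PERL5LIB:-<unset>}"
```

If the printed path does not start with your local::lib directory, the export lines are probably not active in the current session, or the system copy of the module is simply the one Perl finds first.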
R Libraries
Users cannot install R libraries globally on our machines, but R makes it very easy for users to install libraries in their home directories. To do this, fire up R and when presented with the > prompt, use the install.packages() function to install things:
> install.packages('doSNOW')
Installing package(s) into '/opt/R/local/lib'
(as 'lib' is unspecified)
Warning in install.packages("doSNOW") :
'lib = "/opt/R/local/lib"' is not writable
Would you like to use a personal library instead? (y/n)
This error comes up because you can't install libraries system-wide as a non-root user. Say y and accept the default which should be something similar to ~/R/x86_64-unknown-linux-gnu-library/3.0. Pick a mirror and let her rip. If you want to install multiple packages at once, you can just do something like
> install.packages(c('foreach','doMC'))
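If you would rather choose the personal library location yourself than accept the default at the prompt, one option is to create the directory ahead of time and point R at it. The sketch below assumes R's documented R_LIBS_USER environment variable, which R adds to its library search path when the directory exists; the helper name setup_r_user_lib and the path $HOME/R/library are only examples.

```shell
# Hypothetical helper: create a personal R library directory and
# tell R about it through the R_LIBS_USER environment variable.
setup_r_user_lib() {
    mkdir -p "$1" && export R_LIBS_USER="$1"
}

# Example location; any writable directory in your home should do.
setup_r_user_lib "$HOME/R/library" || echo "could not create $HOME/R/library" >&2

echo "R_LIBS_USER=${R_LIBS_USER:-<unset>}"
# To check from R itself, run: Rscript -e '.libPaths()'
```

Adding the export to ~/.bashrc should make the setting persist across logins.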
For most packages, this is all you will have to do. However, sometimes R packages depend on other system libraries, and those system libraries might not be in the default search path for the R package installer. When that happens, you might get an error that looks something like this:
> install.packages('rjags');
* installing *source* package 'rjags' ...
** package 'rjags' successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... no
configure: error: "Location of JAGS headers not defined. Use configure arg '--with-jags-include' or environment variable 'JAGS_INCLUDE'"
ERROR: configuration failed for package 'rjags'
* removing '/home/glock/R/x86_64-unknown-linux-gnu-library/3.0/rjags'
The downloaded source packages are in
'/tmp/RtmpdE6UYF/downloaded_packages'
Warning message:
In install.packages("rjags") :
installation of package ‘rjags’ had non-zero exit status
The relevant part of the error log is the line beginning with "configure: error"; the library could not install because it depends on a system library (as opposed to an R library) that the install.packages() command could not find.
While fixing this error can be tricky since each R library can have a different installation procedure, you can pass extra hints to the install.packages() command to suggest where it can find some of these system libraries. For example, the jags library is already installed on Gordon and Trestles, and it can be loaded using module load jags. After doing this, you can do something like
> install.packages('rjags', configure.args=c(rjags='--with-jags-include=$JAGSHOME/include/JAGS --with-jags-lib=$JAGSHOME/lib'))
* installing *source* package 'rjags' ...
** package 'rjags' successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... /opt/jags/bin/jags
checking whether the C++ compiler works... yes
...
** testing if installed package can be loaded
* DONE (rjags)
The configure.args argument looks gnarly, but actually tells R that
- the contents of configure.args should be passed to the underlying library's installer. configure.args is a named vector or list containing special configuration parameters, where the name of each value corresponds to the package to which the special parameters apply.
- the --with-jags-include and --with-jags-lib flags tell the rjags installer where your JAGS library's include and lib directories are located
- $JAGSHOME is a variable that gets defined when you load the jags module on Gordon and Trestles. On other systems, you would specify the full path to your jags installation directory instead.
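Because a typo in either path only shows up as a failed configure step, it can save time to check the directories from the shell before starting R. The following is a sketch under the assumption that JAGS is laid out as in the example above (include/JAGS and lib beneath the installation directory); the function name jags_configure_args is hypothetical.

```shell
# Hypothetical helper: print the two configure flags for rjags after
# checking that the directories exist. Takes the JAGS install directory
# as an argument, falling back to $JAGSHOME.
jags_configure_args() {
    jags_home=${1:-${JAGSHOME:-}}
    if [ ! -d "$jags_home/include/JAGS" ] || [ ! -d "$jags_home/lib" ]; then
        echo "JAGS include/lib directories not found under '$jags_home'" >&2
        return 1
    fi
    echo "--with-jags-include=$jags_home/include/JAGS --with-jags-lib=$jags_home/lib"
}

if args=$(jags_configure_args 2>/dev/null); then
    echo "configure args: $args"
else
    echo "JAGSHOME is not set or incomplete; try 'module load jags' first"
fi
```

The printed string is what would go inside the quotes of the rjags entry in configure.args.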
Actually knowing what configure.args to use when generic package installations fail requires some amount of intuition. If all else fails, contact the help desk!