Introduction

A lot of XSEDE users request allocations on SDSC Trestles and Gordon because we are one of two XSEDE sites (the other being PSC and Blacklight) with a Gaussian license that permits XSEDE users to use the software. I've found that many start-up allocations requested for Gaussian belong to users who have never used a batch system, a remote resource, or even a command line. To give such users a very quick crash course, here are my notes on making the jump from Gaussian on a PC or workstation to Gaussian on an SDSC XSEDE resource.

This guide assumes the reader has never used a batch system, an XSEDE resource, or the Linux command line. Since Trestles gets most of the new Gaussian users on XSEDE, I will assume the reader is using that system. Instructions for using Gordon are quite similar.

Logging into SDSC Trestles

The XSEDE User Portal has a guide to getting started, and it covers all the options most users will want to know about. All those options can be confusing at first, so to keep things as simple as possible, I'll lay out every step.

  1. Go to the XSEDE User Portal
  2. Log in with your XSEDE username and password. If you do not have an XSEDE User Portal account, you will have to create one and then get your project PI (your supervisor) to add that account to your group's project
  3. Under the "MY XSEDE" tab, click the "Accounts" option
  4. Scroll down to Trestles and click the link under the "Login Name" column. This will take you to the GSI-SSH Terminal Java applet which will take a while to load, then dump you at a black screen with white text.

This black screen should look something like this:

Rocks 5.4 (Maverick)
Profile built 10:09 13-Jan-2012

Kickstarted 10:17 13-Jan-2012
trestles Login Node
---------------------------------------------------------------------------
Welcome to the SDSC Trestles Appro Cluster

Trestles User Guide: http://www.sdsc.edu/us/resources/trestles
Questions: email help@xsede.org 
---------------------------------------------------------------------------

[username@trestles-login2 ~]$ 

This is the Linux terminal, and the last line is your prompt, which lists your username, your current machine (trestles-login1 or trestles-login2), your current directory (~ is an abbreviation for your home directory), and a dollar sign ($), which means you are logged in as a regular (not administrative) user.

Typographic conventions hold that commands you are supposed to type at the command prompt (also called "the shell") be preceded by a $ to represent the shell prompt. So, if this guide says to issue the following command:

$ pwd

you don't actually type the dollar sign. It's just there to tell you to type the pwd command in the Linux shell. I will also forgo the black background in my samples below. You know what your terminal looks like.

Getting Permission and Loading Gaussian

Because Gaussian requires a license to use, new login accounts must be given permission to run Gaussian before they can actually use it. Chances are you will need to request this permission by sending an email to help@xsede.org. Once your request is processed, it may take a few hours for the changes to take effect. If you want to check to see if you can run Gaussian, you can use the groups command:

$ groups
rut100 gaussian

If you do not see gaussian listed in the output of this command, you will not be able to run Gaussian!
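If you would rather have the shell do the looking, here is a minimal sketch of the same check. The has_group function is my own hypothetical helper, not something installed on Trestles; it simply scans the group list that id -Gn prints for your account.

```shell
# has_group NAME [LIST]: succeed if NAME appears in LIST.
# LIST defaults to the current user's groups as reported by `id -Gn`.
# (has_group is a hypothetical helper written for this guide.)
has_group() {
    local want="$1"
    local list="${2:-$(id -Gn)}"
    local g
    for g in $list; do
        if [ "$g" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

if has_group gaussian; then
    echo "gaussian group found: this account can run Gaussian"
else
    echo "gaussian group missing: email help@xsede.org to request access"
fi
```

If you were only just added to the group, you may need to log out and back in before it shows up.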

Once your account has been enabled for Gaussian, you can load its associated module with this command:

$ module load gaussian

This will give you access to the Gaussian commands like formchk, unfchk, and of course g09. However, do not skip ahead and just start running g09! If you do, you will make a lot of other users upset and you will get a sternly worded e-mail from me or one of my colleagues.
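A harmless way to confirm the module really loaded is to ask the shell where the commands live without running any of them. The sketch below uses the shell's built-in command -v for that; missing_cmds is a hypothetical helper name of my own.

```shell
# missing_cmds CMD...: print each CMD that is not found on the PATH.
# `command -v` only looks a command up; it never runs it.
# (missing_cmds is a hypothetical helper written for this guide.)
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

missing=$(missing_cmds g09 formchk unfchk)
if [ -z "$missing" ]; then
    echo "Gaussian commands are available"
else
    echo "Not found (did you run 'module load gaussian'?):" $missing
fi
```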

Gaussian Job Setup

At this point I assume you have a Gaussian job you want to run, and it consists of the following files on your personal computer: input.com (the Gaussian input file) and, if you have one, molecule.chk (a checkpoint file).

Creating a job directory

The first thing you want to do is create a directory in which you want all this simulation's data to reside. Do

$ mkdir job1

to create a directory called job1. To then go into that directory,

$ cd job1

Transferring files to Trestles

Now you need to transfer your Gaussian input files from your computer to Trestles. The easiest way to do that is using the XSEDE File Manager, which is a Java applet that allows you to drag-and-drop files from your personal computer to any XSEDE resource. On the left will be your local files, and on the right is a list of XSEDE resources on which you have an account. Double click "SDSC Trestles Appro Rocks Cluster" to connect to it, and your job1 directory should appear. Double click it, and drag-and-drop your Gaussian job files onto Trestles.

XUP File Manager screen shot

Back in your terminal session, you should be able to type the ls command and see the files you just uploaded.

$ ls
input.com  molecule.chk

Setting up the queue script

Up until now, the steps have been very generic and can be used by any user to get started on Trestles. However, to actually run jobs on Trestles, Gordon, or any other XSEDE supercomputer, you will have to interact with the batch system, which is really what distinguishes using a shared supercomputer from using your personal computer.

At SDSC we use the Torque Resource Manager, which comprises a number of commands (e.g., qsub, qstat, qdel, and qalter), and running your simulation through the batch system requires a queue script to "glue" together the inner workings of Torque and Gaussian.

The name of this queue script is arbitrary, but I like to give mine the extension .qsub. So, you will have to create a file called g09job.qsub using a command-line text editor. The nano editor is perhaps the easiest to use. Issue this command:

$ nano g09job.qsub

to create and edit a file called g09job.qsub. You will see a screen like this:

  GNU nano 1.3.12             File: g09job.qsub                                 







^G Get Help  ^O WriteOut  ^R Read File ^Y Prev Page ^K Cut Text  ^C Cur Pos
^X Exit      ^J Justify   ^W Where Is  ^V Next Page ^U UnCut Text^T To Spell

Some common nano commands are shown at the bottom: ctrl+x exits, ctrl+w searches, etc. You will need to paste the following lines into this new g09job.qsub file:

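#!/bin/bash
# Submit to the shared queue
#PBS -q shared
# One node, sixteen cores on that node
#PBS -l nodes=1:ppn=16
# Maximum run time as hours:minutes:seconds
#PBS -l walltime=02:30:00

# Make the module command available inside the job, then load Gaussian
. /etc/profile.d/modules.sh
module load gaussian

# Start in the directory from which qsub was run
cd "$PBS_O_WORKDIR"
# Per-job scratch directory for Gaussian's temporary files
export GAUSS_SCRDIR=/scratch/${USER}/${PBS_JOBID}
# Read input.com and write Gaussian's log to output.txt
g09 < input.com > output.txt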

Now exit nano (ctrl+x) and say yes to "Save modified buffer (ANSWERING "No" WILL DESTROY CHANGES) ?" to save your changes. This is the absolute bare minimum queue script you will need to run a Gaussian job, and the details of what each line means can be found in the Trestles User Guide and at the end of this tutorial. For now, there are only two important lines. The first one is

#PBS -l nodes=1:ppn=16

This tells Torque that your job will require one node and sixteen CPU cores on that node. You will then have to modify your Gaussian input file, input.com, to actually use these sixteen cores. Open up that input.com file in nano and make sure the following Link 0 commands (the three lines beginning with %) are present above the Route section:

%chk=molecule.chk
%nproc=16
%mem=31GB
#p m062x/6-31+g(d) td=(Root=1,NStates=1) Freq NoSymm

The %nproc option tells Gaussian to use 16 cores, which must be the same as what your queue script requests. The %mem option specifies how much memory Gaussian can use. On Trestles, there is a max of 2 GB available per core, but it is good practice to not specify this absolute max since the operating system and other system programs on the node will also need some memory.
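The 31GB in my sample input comes from simple arithmetic, sketched below in shell. The core count and the 2 GB per core are the figures quoted above; the 1 GB of headroom for the operating system is my own assumption, so adjust it as you see fit.

```shell
# Figures from the text: 16 cores requested, 2 GB per core on Trestles.
cores=16
gb_per_core=2
# Assumption: leave 1 GB of headroom for the operating system.
headroom_gb=1

# Total memory available to the requested cores, minus the headroom.
mem_gb=$(( cores * gb_per_core - headroom_gb ))

# Print the two Link 0 lines to put in input.com.
echo "%nproc=${cores}"
echo "%mem=${mem_gb}GB"
```

If you request fewer cores, change cores and the %mem value shrinks with it.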

Everything else in your Gaussian input file can remain unchanged.
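Since a mismatch between ppn in the queue script and %nproc in the input file is an easy mistake to make, here is a rough sketch of a check you could run from your job directory before submitting. The function names are hypothetical helpers of my own, and the sketch assumes the file names used in this guide (g09job.qsub and input.com).

```shell
# get_ppn FILE: print the ppn value from the first "#PBS ... ppn=N" line.
get_ppn() {
    grep '^#PBS.*ppn=' "$1" | head -n 1 | sed 's/.*ppn=\([0-9]*\).*/\1/'
}

# get_nproc FILE: print N from the first "%nproc=N" line (case-insensitive).
get_nproc() {
    grep -i '^%nproc' "$1" | head -n 1 | sed 's/.*=\([0-9]*\).*/\1/'
}

# cores_match QSUB_FILE INPUT_FILE: succeed only if both values exist and agree.
cores_match() {
    local ppn nproc
    ppn=$(get_ppn "$1")
    nproc=$(get_nproc "$2")
    [ -n "$ppn" ] && [ "$ppn" = "$nproc" ]
}

# Only run the check when both files from this guide are present.
if [ -f g09job.qsub ] && [ -f input.com ]; then
    if cores_match g09job.qsub input.com; then
        echo "OK: queue script and input file request the same core count"
    else
        echo "MISMATCH: ppn=$(get_ppn g09job.qsub), %nproc=$(get_nproc input.com)"
    fi
fi
```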

Running Gaussian

Even with your input file set up, you cannot run Gaussian directly. Unlike a workstation where you can just use the g09 command, Trestles (like all modern supercomputers) requires you to submit your job to a batch system that schedules and launches everyone's jobs in a fair manner.

Getting the Job Script

You will have to submit jobs to Trestles using the qsub command and a job submission script which contains more Linux terminal commands to be executed on one of the compute nodes. I created a sample job script which you can copy. To do that, use the following command, replacing username with your own login name:

$ cp /home/glock/tutorials/gaussian/g09job.qsub /home/username/job1/

If you want to copy my input.com as well, you can do

$ cp /home/glock/tutorials/gaussian/input.com /home/username/job1/

You can also see what other files I have in my sample directory using ls /home/glock/tutorials/gaussian/.

Setting your Walltime

Now if you ls from within your job1 directory, you should see input.com, molecule.chk (if you had a checkpoint file you transferred from your personal computer), and g09job.qsub. You can view the contents of the submit script (g09job.qsub) by typing cat g09job.qsub. If you want to edit it, you can use the "nano" editor (e.g., nano g09job.qsub). To exit nano, press ctrl+x. You can google for "nano editor tutorial" to learn more about using nano.

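If you want to edit g09job.qsub, I recommend changing the walltime line, which in my sample script reads as follows (the version pasted earlier in this guide uses 02:30:00 instead):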

#PBS -l walltime=48:00:00

That line says that your job needs 48 hours to complete; if you know your job takes less time (e.g., my sample input.com takes just a few minutes) you can change that to, say,

#PBS -l walltime=00:15:00

for 15 minutes.
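If you would rather not open nano just to change the walltime, the edit can also be made from the command line with sed. This is only a sketch: set_walltime is a hypothetical helper of my own, and it assumes the script contains a line starting with exactly "#PBS -l walltime=".

```shell
# set_walltime FILE HH:MM:SS: rewrite the "#PBS -l walltime=" line in FILE.
# (set_walltime is a hypothetical helper written for this guide.)
set_walltime() {
    local file="$1" new="$2" tmp status
    tmp=$(mktemp) || return 1
    # Write the edited script to a temporary file, then copy it back
    # over the original so the file keeps its permissions.
    sed "s/^#PBS -l walltime=.*/#PBS -l walltime=${new}/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Only touch the script if it exists in the current directory.
if [ -f g09job.qsub ]; then
    set_walltime g09job.qsub 00:15:00
    grep walltime g09job.qsub
fi
```

The final grep prints the walltime line so you can confirm the change took.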

Submitting your Job

To actually run your Gaussian simulation, use the qsub command:

$ qsub g09job.qsub

The job may sit in the queue for a while, and you can check its status by typing qstat -u username. The second-to-last column (labeled "S") is the job state: Q means it is in the queue, R means it is running, and C means the job has finished.
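Status checks are done with Torque's qstat command, and its output can be boiled down to just the job ID and a plain-English state. The sketch below assumes the layout described above (job lines begin with the numeric job ID, and the state is the second-to-last column); describe_state and job_states are hypothetical helper names of my own.

```shell
# describe_state LETTER: translate the qstat "S" column into words.
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        R) echo "running" ;;
        C) echo "completed" ;;
        *) echo "other ($1)" ;;
    esac
}

# job_states: read `qstat -u` output on stdin and print "JOBID STATE" pairs.
# Job lines start with a digit; the state is the second-to-last column.
job_states() {
    awk '/^[0-9]/ { print $1, $(NF-1) }'
}

# Only query the batch system where qstat is actually installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$USER" | job_states | while read -r id state; do
        echo "$id is $(describe_state "$state")"
    done
fi
```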

Once your job finishes, you should have a new file called output.txt which you can view using cat, edit using nano, and download to your computer using the XSEDE File Manager.
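Before downloading anything, it is worth checking that Gaussian actually finished cleanly. A successful run ends its log with a line beginning "Normal termination of Gaussian", so a quick sketch of a check looks like this (g09_finished_ok is a hypothetical helper of my own, and output.txt is the file name used in this guide):

```shell
# g09_finished_ok FILE: succeed if the end of the log reports normal termination.
# (g09_finished_ok is a hypothetical helper written for this guide.)
g09_finished_ok() {
    tail -n 5 "$1" | grep -q "Normal termination of Gaussian"
}

# Only check when the output file from this guide is present.
if [ -f output.txt ]; then
    if g09_finished_ok output.txt; then
        echo "Gaussian finished normally"
    else
        echo "No normal termination found; last lines of output.txt:"
        tail -n 5 output.txt
    fi
fi
```

If the job ran out of walltime or hit an error, the last few lines printed here are usually the first place to look.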

You can find a few more Gaussian submit scripts in our GitHub repository for Trestles or our GitHub repository for Gordon.