Ualberta THOR Cluster


These are instructions for running TWIST software on the Ualberta THOR Linux cluster.

Most of the general information and the PBS instructions are from Peter Green. Many details are from Rob MacDonald.

General information

Main web page: http://thor-gw.phys.ualberta.ca/

How to log in to the cluster? Use ssh e614@thor-gw.phys.ualberta.ca, password: the usual

What disks to use? From Peter Green: use /raid3/e614/yourname

Set $CAL_DB to: /raid3/e614/olchansk/caldb_ascii (use the $HOME/caldb_ascii_update.sh script to get the latest files from TRIUMF).
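
As a minimal sketch, the setting above could look like this in a Bourne-style shell. The choice of shell is an assumption (under csh/tcsh use setenv instead), and so is the idea that the update script takes no arguments, which is why that call is left commented out.

```shell
# Point the TWIST software at the ASCII calibration database on THOR.
export CAL_DB=/raid3/e614/olchansk/caldb_ascii

# Optionally refresh the calibration files from TRIUMF first.
# Assumption: the script needs no arguments; check it before running.
# "$HOME/caldb_ascii_update.sh"
```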

Setting up the environment

To set up the environment for Mofia or Geant, just type "sourcemofia" or "sourcegeant" at the command line. These aliases execute scripts in the ~/e614slow directory. The only difference between them is the mlib command that selects the compiler (f90 for Mofia, f77 for Geant), so either one works if you are only running executables, not building them.

Executables

TWIST executables have been successfully built on THOR. (Most notably, we now have a copy of the Absoft F90 compiler there.) You can also apparently build them elsewhere and scp them to THOR. Executables built on redhat-6.2 (lin05.triumf.ca) and static-linked executables built on jam.triumf.ca are known to work.

Scripts for running

Rob MacDonald has placed some useful scripts in the directory $MOFIA_SOURCE/user:
analyzegeant_batch.kcm
KCM file (Mofia script) for analyzing Geant data (with magnet on) in "batch" mode. It loads the appropriate geometry files etc, starts the analysis, and quits Mofia when it's done.

You must check this file to make sure the geometry, map, and other auxiliary files match what Geant used to generate the data in the first place; otherwise Mofia gets very confused and usually quits.

The MTIN variable is set in the PBS script below...

analyzegeant_THOR.pbs
PBS script for submitting a job to the THOR PBS system. This file is the "source" file used by writemofiabatch.pl (described below). You'll want to edit this file and change a few things:

The line #PBS -l nodes=twist restricts the job to the two machines that TWIST actually owns, plus the 10 general-use machines. This line was requested by the THOR sysadmin.

The other #PBS lines should be okay. More example PBS scripts are available here: /raid3/e614/olchansk/mc/scripts/*.perl (but this one should work fine).

writemofiabatch.pl
Perl script to generate individual PBS scripts for each run you want to analyze. Set things up the way you want them in the file analyzegeant_THOR.pbs, then edit this Perl script to set the range of run numbers you want to analyze. Everything else should be okay as it is. Running this script generates one PBS script per run number, replacing the line that sets the DATAFILE variable.
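
The idea behind this step can be sketched in plain shell. This is not the actual writemofiabatch.pl, only an illustration of the template-and-replace approach described above. Several details are assumptions: that the template sets the variable on a line beginning with "DATAFILE=", that data files are named run<N>.dat, and that the output files are named analyze_<N>.pbs (the last is suggested by the manyqsub example below, the others are hypothetical).

```shell
# write_batch_scripts TEMPLATE FIRST LAST
# Writes one PBS script per run number in FIRST..LAST by copying
# TEMPLATE and rewriting the line that sets DATAFILE.
# Hypothetical names: run<N>.dat and the "DATAFILE=" line format.
write_batch_scripts() {
    template=$1
    run=$2
    last=$3
    while [ "$run" -le "$last" ]; do
        # Replace only the DATAFILE line; every other line is copied as-is.
        sed "s|^DATAFILE=.*|DATAFILE=run${run}.dat|" "$template" \
            > "analyze_${run}.pbs"
        run=$((run + 1))
    done
}
```

For example, write_batch_scripts analyzegeant_THOR.pbs 6100 6199 would write analyze_6100.pbs through analyze_6199.pbs into the current directory.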

manyqsub
Shell script (in ~e614/bin) for submitting many batch jobs to the PBS queue. Usage:
manyqsub [filelist]
For example, I (Rob) used manyqsub analyze_61*.pbs to submit my PBS jobs for runs 61xx.

This script just calls qsub once for each file you list on the command line.
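
A loop of that kind might look like the sketch below. It is not the real ~e614/bin/manyqsub. The function name and the QSUB override are hypothetical, added only so the loop can be dry-run with echo before anything is submitted.

```shell
# manyqsub_sketch FILE...
# Calls qsub once per file named on the command line.
# QSUB is a hypothetical override (e.g. QSUB=echo for a dry run).
manyqsub_sketch() {
    for script in "$@"; do
        "${QSUB:-qsub}" "$script"
    done
}
```

With QSUB=echo set, manyqsub_sketch analyze_61*.pbs only prints the file names that would be submitted. With QSUB unset, it submits each one with qsub.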

Peter Green's note on the PBS queueing system

There is a link there to "User Documentation" which contains an
overview of PBS.  It may or may not have much useful information,
though.

The "essence" of PBS is that you submit a shell script with the "qsub"
command.  This puts the job in a queue, from where it gets taken off and
executed as nodes become available.  A typical shell script looks like
(this is one I used for hermes a while ago)
#PBS -S /bin/bash
#PBS -q extend
#PBS -l nodes=any
#PBS -m be
#PBS -M pewg@phys.ualberta.ca
cd  /shift/shd01/pewg/hmcprod/prod
./startmcprod ../thorlog/bmc1
   
        Lines that start with #PBS simply define the PBS characteristics needed for
the job.  The ones above mean:
-S - which shell you use
-q - which queue you want to submit to.  We have effectively 3 queues
        short - 1 hour max CPU time
        long - 1 day CPU time
        extend - infinite CPU time
-l - list of resources needed.  I have only ever used the "nodes"
resource, and always specify "any".
-m - mail options - "be" means send a message at Beginning and End of
job
-M - list of people to send the mail to

        After you've got your shell script(s) set up the way you want,

qsub shell_script

should do it.

        There are man pages for all of this.  Start with "man qsub" and go from
there.

    


Konstantin Olchanski and Rob MacDonald
Last modified: Wed May 15 11:12:33 MDT 2002