Info for new users of the cluster

The cluster consists of 4 nodes, each with 2 processors of 6 cores, giving 12 cores per node. In total there are 48 cores (or 96 hardware threads with hyperthreading).

The main node is called octopus; the others are named t2, t3, and t4 (t1 is identical to octopus).
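The core counts above follow from simple multiplication. A minimal sketch of the arithmetic, using only the figures quoted in this section (the value of 2 threads per core is inferred from the 96 versus 48 figures, not measured):

```shell
#!/bin/sh
# Hardware figures taken from the description above.
nodes=4
processors_per_node=2
cores_per_processor=6
threads_per_core=2   # assumption: implied by 96 hyperthreaded vs 48 physical cores

cores_per_node=$((processors_per_node * cores_per_processor))
total_cores=$((nodes * cores_per_node))
total_threads=$((total_cores * threads_per_core))

echo "cores per node: $cores_per_node"
echo "total cores:    $total_cores"
echo "total threads:  $total_threads"
```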

Getting an account

Ask someone with superuser access to run the script

    /admin/setup-user

on the main node octopus. This will add the user, create a home directory, and copy that information to the other nodes in the cluster. By default the password is set to the same as on ladybug.

Or send an email to jlidmar@kth.se to request an account. Then read the following:

Logging in for the first time

To log in on the main node run

     ssh -Y octopus

or perhaps

     ssh -Y octopus.theophys.kth.se

You can also log in to one of the other nodes t2, t3, or t4.

Changing password

If you want to change your password, run

    passwd

You need to do this manually on each node. Make sure to use the same password on every node.

Generate SSH keys

Run

    ssh-keygen

Accept the defaults and leave the passphrase empty for simplicity. Then copy your newly created public key to .ssh/authorized_keys:

    cp .ssh/id_rsa.pub .ssh/authorized_keys

Because your home directory is shared between the nodes, this will enable you to log in to the other nodes of the cluster without being prompted for a password.
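If authorized_keys might already contain other keys, appending is safer than overwriting the file. The following is a minimal sketch of that variant, not part of the cluster's own tooling: install_key is a hypothetical helper name, and the permission settings (700 on the directory, 600 on the file) are the ones sshd commonly expects rather than something this page specifies.

```shell
#!/bin/sh
# install_key PUBKEY_FILE SSH_DIR   (hypothetical helper, for illustration)
# Appends the public key to SSH_DIR/authorized_keys unless it is already
# listed there, then restricts permissions on the directory and the file.
install_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -x: whole-line match, -F: fixed strings, -f: take patterns from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# On the cluster you would call, for example:
#   install_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh"
```

Calling the helper twice with the same key leaves a single entry in authorized_keys.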

Filesystem

The cluster runs a parallel filesystem called GlusterFS, which lets you access your home directory (/home/user) on each node.

There is also local storage on each node under /scratch. Please create your own directory in /scratch/your_username and put your files there if you want to use it.
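Creating the per-user scratch directory can be sketched as below. make_scratch_dir is a hypothetical helper name, and restricting the directory to its owner with chmod 700 is an optional choice made here, not a cluster requirement.

```shell
#!/bin/sh
# make_scratch_dir ROOT   (hypothetical helper, for illustration)
# Creates ROOT/<your username>, makes it accessible only to you,
# and prints the resulting path.
make_scratch_dir() {
    root=$1
    user_name=$(id -un)
    dir="$root/$user_name"
    mkdir -p "$dir" && chmod 700 "$dir" && echo "$dir"
}

# On a cluster node you would call:
#   make_scratch_dir /scratch
```

Since /scratch is local to each node, the directory has to be created separately on every node where you want to use it.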

Submit a job

The cluster uses a queueing system called “gridengine”.

Write a script which starts the job. For example, put the following in a file named start-job:

    #!/bin/sh

    ./my_program arg1 arg2 > my_results

Submit the job:

    qsub start-job

Submit an array job of 12 tasks, numbered 1 to 12 (each task runs the same script):

    qsub -t 1-12 start-job
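Each task of an array job runs the same script, so the script usually needs a way to tell the tasks apart. Gridengine sets the environment variable SGE_TASK_ID to the task number for this purpose. A possible start-job for array use is sketched below; passing the task number as the second argument and the my_results_N output naming are illustrative assumptions, not something my_program is known to expect.

```shell
#!/bin/sh
# SGE_TASK_ID holds the task number (1..12 for "qsub -t 1-12").
# Outside an array job it is unset or "undefined"; fall back to 1 in that case.
case "${SGE_TASK_ID:-undefined}" in
    undefined) task_id=1 ;;
    *)         task_id=$SGE_TASK_ID ;;
esac

# Give every task its own output file so results are not overwritten.
output_file="my_results_$task_id"
echo "task $task_id writes to $output_file"

if [ -x ./my_program ]; then
    # Assumption: my_program accepts the task number as an argument.
    ./my_program arg1 "$task_id" > "$output_file"
else
    echo "./my_program not found or not executable" >&2
fi
```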

To submit a job to one particular node, e.g., t3:

    qsub -q all.q@t3 start-job

For more info see man qsub. To get info about queued and running jobs:

    qstat -f
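Options that would otherwise be given to qsub on the command line can also be embedded in the job script on lines starting with "#$". The sketch below writes such a start-job file; the job name my_job and the log file my_job.log are placeholder choices. The options used are -N (job name), -cwd (run in the directory the job was submitted from), -j y (merge error output into standard output), and -o (file for the job's output).

```shell
#!/bin/sh
# Write a job script with gridengine options embedded as "#$" lines.
# The quoted 'EOF' keeps the shell from expanding anything inside.
cat > start-job <<'EOF'
#!/bin/sh
#$ -N my_job
#$ -cwd
#$ -j y
#$ -o my_job.log

./my_program arg1 arg2 > my_results
EOF
chmod +x start-job
```

The resulting file is submitted as before with qsub start-job.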

Running MPI jobs

First compile your program with mpicc, mpic++, etc.

To submit an MPI job, first create a script (jobscript.sh, for example) containing the following:

    #!/bin/sh
    mpirun /pathToExecutable arg1 arg2 ...

Please note that the arguments are optional.

Make sure that the long hostnames, including the .theophys.kth.se part, are listed in your .ssh/known_hosts file. The easiest way to achieve this is to log in to each node once with ssh:

    $ ssh yourusername@tX.theophys.kth.se

for X = 1, 2, 3, 4. You only need to do this step once. (If these hostnames are missing from .ssh/known_hosts and you submit a job with more than 12 processes, the job will fail.)
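To avoid typing the four hostnames by hand, they can be generated in a loop. The sketch below only prints the login commands instead of running them, because the first connection to each host asks you to confirm its key interactively. cluster_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Print the long hostname of every node, one per line.
cluster_hosts() {
    for x in 1 2 3 4; do
        echo "t$x.theophys.kth.se"
    done
}

# Print one login command per node; run each of them once by hand.
for host in $(cluster_hosts); do
    echo "ssh $host true"
done
```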

Submit the job

To submit a job you have to specify which parallel environment to use. The environment configured for MPI jobs on octopus is called orte. For example, to submit a job running on 15 cores, enter the command:

    $ qsub -pe orte 15 jobscript.sh

The environment orte has been configured to fill the nodes one at a time, up to a maximum of 48 processes across t1, t2, t3, and t4.
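To illustrate what filling the nodes one at a time means for a request of 15 processes, here is a small sketch of the allocation rule. It assumes 12 slots per node (one per core) and that nodes are filled in the order t1, t2, t3, t4; the actual order chosen by gridengine may differ. distribute is a hypothetical helper name.

```shell
#!/bin/sh
# distribute NPROCS   (hypothetical helper, for illustration)
# Prints "<node> <processes>" for each node that receives work.
# Assumptions: 12 slots per node, nodes filled in the order t1..t4.
distribute() {
    remaining=$1
    slots_per_node=12
    for node in t1 t2 t3 t4; do
        [ "$remaining" -gt 0 ] || break
        if [ "$remaining" -gt "$slots_per_node" ]; then
            n=$slots_per_node
        else
            n=$remaining
        fi
        echo "$node $n"
        remaining=$((remaining - n))
    done
    # Anything left over does not fit into the 48 available slots.
    if [ "$remaining" -gt 0 ]; then
        echo "request exceeds 48 slots by $remaining" >&2
        return 1
    fi
}

distribute 15
```

Under these assumptions a 15-process job places 12 processes on the first node and 3 on the second, which is why jobs with more than 12 processes need working ssh access between nodes.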

Running matlab, maple, or mathematica

Run the following commands in a terminal:

    $ . /pkg/init/bash
    $ pkg math/matlab

Then you can start matlab as usual:

    $ matlab

Similar instructions work for maple and mathematica.