Today’s MKI cluster consists of over 1,000 CPU cores with 2-4GB of memory per core. A petabyte of storage space is available on six storage servers. The cluster has seen several major updates to its operating system and workload scheduler. We are contemplating another major upgrade, so the following description is likely to change.
Once you have a cluster account, you need to know how to access the cluster from your desktop or laptop (referred to as your “local” machine). Access is restricted to applications that use SSH protocols.
Linux and OSX (Mac) operating systems are usually distributed with an SSH client included and you’ll
Before you can log in, you will need to create a pair of ssh keys, one private and one public, unless you already have them. To create ssh keys, type the following:
ssh-keygen -t rsa
You’ll be prompted to specify where the key files are to be written, and to supply a “pass-phrase”, one or more words that will become your password into the cluster. Do not forget this pass-phrase: if you do, you won’t be able to use the keys and you’ll have to create and register a new pair! The destination can be any “private” directory on your local computer, i.e., one that only you can read. The default is the “.ssh” sub-directory in your home directory, or “~/.ssh” for short. If you don’t have one, here’s how to create it:
mkdir ~/.ssh ; chmod 700 ~/.ssh
Once you have created the key files, email the public key file – the one whose name ends “.pub” – to Paul Hsi <firstname.lastname@example.org>. Once he has installed it in the cluster, you’ll be ready to log into the login node, e.g.,
If your local user name is the same as the one you use on the cluster, you can just type “ssh antares.mit.edu”. Also, if your ssh key is not in “~/.ssh” on your local computer, you’ll need to specify its location with the -i flag, i.e.:
ssh -i <ssh_key_file> <username>@antares.mit.edu
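If you would rather not type the user name and key location on every login, the ssh client can read them from “~/.ssh/config”. The following is a minimal sketch, not a required setup: the helper name ssh_host_stanza, the alias “antares”, the user name “jdoe”, and the key path are all illustrative placeholders, while Host, HostName, User, and IdentityFile are standard ssh client options.

```shell
# Print an ssh client config stanza for the cluster login node.
# Usage: ssh_host_stanza <alias> <username> <ssh_key_file>
# (ssh_host_stanza is a hypothetical helper, not part of ssh itself.)
ssh_host_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName antares.mit.edu\n'
    printf '    User %s\n' "$2"
    printf '    IdentityFile %s\n' "$3"
}

# Example (alias, user name, and key path are placeholders):
#   ssh_host_stanza antares jdoe ~/.ssh/id_rsa >> ~/.ssh/config
#   chmod 600 ~/.ssh/config   # keep the file private
#   ssh antares               # now equivalent to the longer command
```

With such a stanza in place, the alias can also be used wherever a host name is expected, for instance in scp commands.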
Once you receive confirmation that your public key has been installed on the cluster, start PuTTY and observe the startup window:
PuTTY will save this information in its “Saved Sessions” list, so the next time you log in, just select that session name and click the “Load” button to its right.
Most Linux systems come with X11 already installed and you use it to create your text windows. Some Mac systems don’t and you’ll have to install an X11 application. We recommend XQuartz which is available from https://www.xquartz.org/releases/. At the present time (summer 2020), the latest stable version is 2.7.11 which you can download from https://dl.bintray.com/xquartz/downloads/XQuartz-2.7.11.dmg. Once loaded, double-click the icon and follow the instructions.
Once X11 is installed on your local computer, you must also add an -X or -Y flag to your login commands before you’ll be able to display graphics from the cluster, e.g.,
ssh -X <username>@antares.mit.edu
Both -X and -Y tell ssh to expect to receive instructions to display graphical windows on your home screen. The choice of flag depends on how your local ssh is configured. If you get the following error while running a graphics command, e.g., xterm, from the cluster:
try logging in with the other flag and, if this doesn’t work, check that the following line is present in your local ssh client configuration file, /etc/ssh/ssh_config:
If it is missing, either add it (but you’ll need root permission to edit this file), or add it to your local ~/.ssh/config. In the latter case, that file should also contain the following:
If there is no “~/.ssh/config”, create it with a text editor and insert those three lines. Your graphics applications should now display from the cluster.
The MKI cluster uses the SLURM job scheduler and all jobs that run on the cluster’s compute nodes must be submitted through this scheduler. If a job is not scheduled via SLURM’s sbatch or srun commands, it will run on the login node and is liable to be killed without warning since it will be interfering with other users.
Batch jobs are submitted through the sbatch command. Create a command file containing special “#SBATCH” instructions for SLURM, followed by ordinary UNIX “shell” commands that start the job. This example will run myProgram using 2 compute nodes with a total of 24 CPUs and 45MB RAM for no longer than 35 minutes:
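A minimal sketch of a command file matching that description is given here. It rests on several assumptions: the file name myjob.sh is arbitrary, each task is assumed to use one CPU (so 24 tasks stand in for 24 CPUs), and the 45MB figure is read as a per-node limit because SLURM’s --mem option is specified per node. Adjust these to your own job and to the cluster’s partitions.

```shell
# Write a hypothetical command file, myjob.sh, in the current directory.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=2          # number of compute nodes
#SBATCH --ntasks=24        # total tasks; assumes one CPU per task
#SBATCH --mem=45M          # memory per node (assumption: 45MB is per node)
#SBATCH --time=00:35:00    # wall-clock limit of 35 minutes

# Ordinary shell commands follow the #SBATCH lines.
srun ./myProgram
EOF

# Submit it from the login node (not run here):
#   sbatch myjob.sh
```

sbatch reads the “#SBATCH” lines as if they were command-line options, so the same settings could instead be passed directly, e.g., sbatch --time=00:35:00 myjob.sh.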
More information about sbatch and its “#SBATCH” directives is available from https://slurm.schedmd.com/sbatch.html
To find out whether your job can be run by sbatch or requires the added refinements of srun, consult https://slurm.schedmd.com/heterogeneous_jobs.html. Jobs started by sbatch leave the choice of assigning tasks to cores and nodes to SLURM. The srun command allows the user more control in assignments, if some parts of the job require more resources than others. srun can be executed from the command line, as in the following example:
srun -N2 -n24 -t 0:01:00 -p <name> -l <hostname> <myProgram>
The command line options are equivalent to those in #SBATCH directives. Alternatively, multiple srun commands can be included in a single sbatch command file. More information about srun and its many options and environment variables is available from https://slurm.schedmd.com/srun.html.
– Show which partitions are available to you
– Display the status of your jobs
squeue -u <username>
– Display the status of a specific job
scontrol show job <jobid>
– Display the current cluster usage
If you are configured to receive X11 graphics (see above), here’s what you’ll see:
If you are familiar with other schedulers but not with SLURM, here is a comparison of some popular schedulers to help you with your transition. The SLURM Workload Manager is fully documented at https://slurm.schedmd.com/documentation.html, and the 2-page summary of sbatch and srun options should help.
Your data can be stored remotely on the cluster in a high-performance filesystem, but to copy files to and from your local computer, you must run a program that uses SSH protocols, e.g., scp or sftp on Linux or MacOS systems, WinSCP on Windows, or Cyberduck on MacOS.
In Windows, there is no built-in client for SSH transfer. We recommend you use WinSCP, which offers an easy graphical interface for copying files to and from your local desktop.
To copy test.txt to your home directory on the cluster, type the following command on your local machine:
scp test.txt <username>@antares.mit.edu:
To copy test.txt in the reverse direction:
scp <username>@antares.mit.edu:test.txt .
Please refer to the scp man page for more options.
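For whole directories, scp’s -r option copies recursively. The sketch below wraps the two directions in small shell functions; the function names to_cluster and from_cluster and the CLUSTER_HOST variable are illustrative, not part of scp, and the user name in the examples is a placeholder.

```shell
# Hypothetical convenience wrappers around scp for the cluster login node.
CLUSTER_HOST="antares.mit.edu"

# to_cluster <username> <local_path> [remote_dir]
# Copies a file or directory to the cluster (home directory by default).
to_cluster() {
    scp -r "$2" "$1@${CLUSTER_HOST}:${3:-}"
}

# from_cluster <username> <remote_path> [local_dir]
# Copies a file or directory from the cluster (current directory by default).
from_cluster() {
    scp -r "$1@${CLUSTER_HOST}:$2" "${3:-.}"
}

# Examples (not run here; "jdoe" and "results" are placeholders):
#   to_cluster jdoe results          # scp -r results jdoe@antares.mit.edu:
#   from_cluster jdoe test.txt       # scp -r jdoe@antares.mit.edu:test.txt .
```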
The instructions above for scp will also work from a window of the terminal app on MacOSX systems. An alternative is to use a file transfer application such as Cyberduck.