From d5e45a08a8ccd3664cf1af78b1eafa87dfb5e6ec Mon Sep 17 00:00:00 2001
From: Erik Strand <erik.strand@cba.mit.edu>
Date: Mon, 1 Mar 2021 21:26:45 -0500
Subject: [PATCH] Update README

---
 README.md | 166 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 165 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6b0bacc..caa4bfc 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,166 @@
-# satori
+# Satori
 
+
+## Logging In
+
+Before logging in for the first time, you'll need to activate your account by following these
+[instructions](https://mit-satori.github.io/satori-getting-started.html#logging-in-to-satori).
+
+Now you can `ssh` in to either of the login nodes like this (replacing `strand` with your username).
+
+```
+ssh strand@satori-login-001.mit.edu
+ssh strand@satori-login-002.mit.edu
+```
+
+According to [this](https://mit-satori.github.io/satori-ssh.html), the first login node should be
+used for submitting jobs, and the second for compiling code or transferring large files. But it also
+says that if one isn't available, just try the other. Both have 160 cores.
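+
+If you find yourself logging in a lot, it can be convenient to add aliases to your SSH config. This
+is just an optional sketch (the alias names are arbitrary; substitute your own username):
+
+```
+# ~/.ssh/config
+Host satori1
+    HostName satori-login-001.mit.edu
+    User strand
+
+Host satori2
+    HostName satori-login-002.mit.edu
+    User strand
+```
+
+With this in place, `ssh satori1` does the same thing as the first command above.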
+
+
+## Modules
+
+Satori is set up to use [Environment Modules](https://modules.readthedocs.io/en/latest/index.html)
+to control which executables, libraries, etc. are on your path(s). So you'll want to become familiar
+with the `module` command.
+
+- `module avail` lists all available modules
+- `module spider <module name>` gives you info about a module, including which other modules have
+  to be loaded first
+- `module load <module name>` loads a specific module
+- `module list` shows all the currently loaded modules
+- `module unload <module name>` unloads a specific module
+- `module purge` unloads all modules
+
+Satori also uses [Spack](https://spack.io/) to manage versions of many tools, so generally speaking
+you should always have this module loaded: `module load spack`. If you run `module avail` before and
+after loading spack, you'll see that a lot more modules become visible.
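+
+For example (the exact module names and counts will change over time):
+
+```
+module avail         # the default list is fairly short
+module load spack
+module avail         # many more modules are now visible
+module list          # confirm that spack is loaded
+```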
+
+For compiling C/C++ and CUDA code, these are the modules I start with.
+
+```
+module load spack git cuda gcc/7.3.0 cmake
+```
+
+Note: I'd like to use gcc 8, but I get build errors when I use it.
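+
+Once these are loaded, it's worth a quick sanity check that the tools are actually on your path (the
+versions reported will depend on which modules you loaded):
+
+```
+which gcc nvcc cmake
+gcc --version
+nvcc --version
+cmake --version
+```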
+
+
+## Running Jobs
+
+Let's start with these simple CUDA [hello world](https://gitlab.cba.mit.edu/pub/hello-world/cuda)
+programs.
+
+With the modules above loaded, you should be able to clone the repo and build it. (The first time
+through, you probably want to do a little git
+[setup](https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup).)
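+
+That setup boils down to something like this (substitute your own name and email):
+
+```
+git config --global user.name "Erik Strand"
+git config --global user.email "erik.strand@cba.mit.edu"
+```
+
+With that out of the way, cloning and building looks like this.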
+
+```
+git clone ssh://git@gitlab.cba.mit.edu:846/pub/hello-world/cuda.git
+cd cuda
+make -j
+```
+
+Since all these programs are very lightweight, I admit I tested them all on the login node directly.
+Running `get_gpu_info` in particular revealed that the login nodes each have two V100 GPUs. (The
+compute nodes have four.)
+
+But let's do things the right way, using [slurm](https://slurm.schedmd.com/overview.html). We'll
+start by making a submission script for `saxpy`. I called mine `saxpy.slurm`, and put it in its own
+directory outside the repo.
+
+```
+#!/bin/bash
+
+#SBATCH -J saxpy        # sets the job name
+#SBATCH -o saxpy_%j.out # determines the main output file (%j will be replaced with the job number)
+#SBATCH -e saxpy_%j.err # determines the error output file
+#SBATCH --mail-user=erik.strand@cba.mit.edu
+#SBATCH --mail-type=ALL
+#SBATCH --gres=gpu:1    # requests one GPU per node...
+#SBATCH --nodes=1       # and one node...
+#SBATCH --ntasks-per-node=1 # running only one instance of our command.
+#SBATCH --mem=256M      # We ask for 256 megabytes of memory (plenty for our purposes)...
+#SBATCH --time=00:01:00 # and one minute of time (again, more than we really need).
+
+~/code/cuda/saxpy
+
+echo "Run completed at:"
+date
+```
+
+All the lines that start with `#SBATCH` are parsed by slurm to determine which resources you need.
+You can also pass these on the command line, but I like to put everything in a file so I don't
+forget what I asked for.
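+
+For reference, passing the same requests on the command line would look roughly like this (the flags
+mirror the `#SBATCH` directives above):
+
+```
+sbatch -J saxpy -o saxpy_%j.out -e saxpy_%j.err \
+    --gres=gpu:1 --nodes=1 --ntasks-per-node=1 \
+    --mem=256M --time=00:01:00 saxpy.slurm
+```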
+
+To submit the job, run `sbatch saxpy.slurm`. Slurm will then tell you the job id.
+
+```
+[strand@satori-login-002 saxpy]$ sbatch saxpy.slurm
+Submitted batch job 61187
+```
+
+To query jobs in the queue, use `squeue`. If you run it with no arguments, you'll see all the queued
+jobs. To ask about a specific job, use `-j`. To ask about all jobs that you've submitted, use `-u`.
+
+```
+[strand@satori-login-002 saxpy]$ squeue -u strand
+             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
+             61187 sched_sys    saxpy   strand  R       0:00      1 node0023
+```
+
+Since we only asked for one minute of compute time (and very modest resources), our job gets
+scheduled almost immediately. So if you run `squeue` and don't see anything, it might just be
+because the job already finished.
+
+You'll know the job is finished when its output files appear. They should show up in the directory
+where you queued the job with `sbatch`.
+
+```
+[strand@satori-login-002 saxpy]$ cat saxpy_61187.out
+Performing SAXPY on vectors of dim 1048576
+CPU time: 323 microseconds
+GPU time: 59 microseconds
+Max error: 0
+Run completed at:
+Mon Mar  1 19:40:43 EST 2021
+```
+
+Now let's try submitting `saxpy_multi_gpu`, giving it multiple GPUs. We can use basically the
+same batch script, just with the new executable and GPU count (i.e. `--gres=gpu:4`). It doesn't
+matter for this program, but for real work you may also want to add `#SBATCH --exclusive` to make
+sure you're not competing with other jobs running on the same node.
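+
+Concretely, the modified script looks something like this (I'm assuming the `saxpy_multi_gpu` binary
+sits next to `saxpy` in the build directory):
+
+```
+#!/bin/bash
+
+#SBATCH -J saxpy_multi_gpu
+#SBATCH -o saxpy_multi_gpu_%j.out
+#SBATCH -e saxpy_multi_gpu_%j.err
+#SBATCH --mail-user=erik.strand@cba.mit.edu
+#SBATCH --mail-type=ALL
+#SBATCH --gres=gpu:4    # now we ask for four GPUs
+#SBATCH --nodes=1
+#SBATCH --ntasks-per-node=1
+#SBATCH --mem=256M
+#SBATCH --time=00:01:00
+
+~/code/cuda/saxpy_multi_gpu
+
+echo "Run completed at:"
+date
+```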
+
+We submit the job in the same way: `sbatch saxpy_multi_gpu.slurm`. Soon after, I had this output
+file.
+
+```
+Performing SAXPY on vectors of dim 1048576.
+Found 4 GPUs.
+
+CPU time: 579 microseconds
+GPU 0 time: 55 microseconds
+GPU 1 time: 85 microseconds
+GPU 2 time: 60 microseconds
+GPU 3 time: 61 microseconds
+
+GPU 0 max error: 0
+GPU 1 max error: 0
+GPU 2 max error: 0
+GPU 3 max error: 0
+
+Run completed at:
+Mon Mar  1 20:27:16 EST 2021
+```
+
+
+## TODO
+
+- MPI hello world
+- Interactive sessions
+
+
+## Questions
+
+- How can I load CUDA 11?
+- Why is gcc 8 broken?
+- Is there a module for cmake 3.19? If not, can I make one?
+- Is there a dedicated test queue?
-- 
GitLab