- 19 Mar, 2021 27 commits
-
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
I think they need to live in their own file, since the main application has to be compiled with mpicc, not nvcc.
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
It's 40. This is surprising given that each node has 160 CPUs.
-
Erik Strand authored
Now the calculation runs for about 2 seconds. Previously it ran for only a fraction of a second, so the measured GFLOPS varied wildly between runs.
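A minimal sketch of the arithmetic behind this kind of change, not the code from this commit. The helper names `gflops` and `iterations_for_target` are hypothetical, and the idea of scaling the iteration count from a short pilot run is an assumption about how a roughly 2 second runtime might be reached.

```c
/* Hypothetical helper: convert an operation count and a wall-clock time
 * into GFLOPS (1 GFLOPS = 1e9 floating point operations per second). */
static double gflops(double flop_count, double seconds) {
    return flop_count / seconds / 1e9;
}

/* Hypothetical helper: given a short pilot run, estimate how many iterations
 * are needed for the timed run to last about target_seconds. Longer runs
 * make fixed overheads (launch latency, timer resolution) a smaller share
 * of the measurement, which is what stabilizes the reported GFLOPS. */
static long iterations_for_target(long pilot_iterations,
                                  double pilot_seconds,
                                  double target_seconds) {
    double scaled = (double)pilot_iterations * target_seconds / pilot_seconds;
    if (scaled < 1.0) {
        return 1; /* always run at least one iteration */
    }
    return (long)(scaled + 0.5); /* round to the nearest whole iteration */
}
```

For example, a pilot of 1000 iterations that takes 0.1 s suggests about 20000 iterations for a 2 s run.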
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
I get 1600 GFLOPS on a single GPU (V100) as expected.
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
This seems to fix things.
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
This gives you a unique index within a single node. It will be useful for assigning GPUs to threads.
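A minimal sketch of one way such a node-local index can be derived, not necessarily what this commit does. It assumes every rank already knows the host name of every other rank (for instance after gathering them across the job). The names `node_local_index` and `gpu_for_local_index` are hypothetical, as is the round-robin GPU assignment.

```c
#include <string.h>

/* Hypothetical helper: the node-local index of `rank` is the number of
 * lower-numbered ranks that report the same host name. Ranks on the same
 * node therefore get 0, 1, 2, ... in rank order. */
static int node_local_index(const char *const hosts[], int rank) {
    int index = 0;
    for (int r = 0; r < rank; ++r) {
        if (strcmp(hosts[r], hosts[rank]) == 0) {
            ++index;
        }
    }
    return index;
}

/* Hypothetical helper: map a node-local index onto one of the node's GPUs,
 * wrapping around if there are more ranks than GPUs (an assumption). */
static int gpu_for_local_index(int local_index, int gpus_per_node) {
    return local_index % gpus_per_node;
}
```

With host names `{"node0", "node0", "node1", "node0", "node1"}`, ranks 0, 1, and 3 get local indices 0, 1, and 2 on `node0`, while ranks 2 and 4 get 0 and 1 on `node1`.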
-
Erik Strand authored
-
- 18 Mar, 2021 1 commit
-
-
Erik Strand authored
-
- 02 Mar, 2021 3 commits
-
-
Erik Strand authored
-
Erik Strand authored
-
Erik Strand authored
-