Profiling

Profiling CUDA code is both simple and complicated. Collecting measurements across thousands of threads is not easy to do by hand, but NVIDIA provides tools that profile an application, evaluate the measurements and plot the results. We will look at two of them: Nsight Compute and Nsight Systems.

For the workflow, we assume that the GPU machine is remote and has no graphical user interface. Both tools can run on the command line and write their results to a report file, which can then be opened on a different machine that has a graphical user interface. This is the typical workflow when working on MOGON.
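The remote part of this workflow can be sketched as a short script. This is a minimal sketch, not a prescribed setup: the binary name `./my_app`, the report names `timeline` and `kernels`, and the `run` helper with its `DRY_RUN` switch are assumptions made for illustration. By default the script only prints the commands; set `DRY_RUN=0` on a machine where `nsys` and `ncu` are installed to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch of the "profile remotely, view locally" workflow.
# Assumptions (not from the text): the binary is ./my_app and the
# reports are named "timeline" and "kernels".
APP="${APP:-./my_app}"

run() {
    # Hypothetical helper: print the command, and execute it
    # only when DRY_RUN is set to 0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Nsight Systems: record a timeline of the whole application.
# Recent versions write the report as timeline.nsys-rep.
run nsys profile -o timeline "$APP"

# Nsight Compute: collect per-kernel metrics.
# The report is written as kernels.ncu-rep.
run ncu -o kernels "$APP"
```

The resulting report files can then be copied to the local machine, for example with `scp`, and opened in the graphical front ends of Nsight Systems and Nsight Compute.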