1,001 Ways to Write CUDA Kernels in Python – NVIDIA