I'm studying CUDA programming and I've found that there is more than one way to index the threads in a grid.

What I don't understand is how these indexing techniques differ from each other.

These are the indexing schemes I have:

1D grid of 1D blocks

th = blockIdx.x * blockDim.x + threadIdx.x;

1D grid of 2D blocks

th = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x;

1D grid of 3D blocks

th = blockIdx.x * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;

What is the advantage of using the second type of indexing over the first one?

I also have trouble interpreting this information: "the maximum number of threads per block is 1024 and the max dimension size of a thread block (x,y,z) is (1024,1024,64)". What does it mean that the limit for blockDim.z is 64? Are there only 64 threads, or 1024*64? What if I use all three dimensions? Does the number of threads that I can use in a grid increase?

  • As already answered: it's just for your benefit. I work with 3D images, so arranging everything using 3D blocks of 3D threads is super convenient, as I can compute a thread id in each direction and use that for indexing. – Ander Biguri Jun 7 at 10:17

The indexing reflects how you want to logically partition your data among your threads. If you are dealing with a 1D problem (imagine computing the sum of two vectors), it is much easier to use a 1D decomposition, because each thread then maps directly to one pair of elements from the two input arrays.
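To make the 1D case concrete, here is a minimal sketch of a vector sum with a 1D grid of 1D blocks. The kernel name vecAdd, the helper ceilDiv, the array size and the block size of 256 are all my own assumptions for illustration, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed to cover n elements.
inline int ceilDiv(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// One thread per element: the 1D global index is used directly as the array index.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // the last block may reach past the end of the arrays
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1000;  // deliberately not a multiple of the block size
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = 2.0f * static_cast<float>(i);
    }

    const int threads = 256;                // assumed block size
    const int blocks = ceilDiv(n, threads); // 4 blocks for n = 1000
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[10] = %.1f (a[10] + b[10])\n", c[10]);

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

The bounds check matters because the grid is rounded up to a whole number of blocks, so some threads in the last block have no element to work on.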

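If you are working on a 2D or 3D structure such as a matrix or a volume, the same argument applies: with 2D or 3D blocks each thread can read its row and column (and depth) coordinates directly instead of decoding them from a single flat index.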
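As a sketch of the 2D case, the kernel below scales every element of a row-major matrix. Note that it uses a 2D grid of 2D blocks, which is not one of the layouts listed in the question; the names scaleMatrix and rowMajorOffset, the matrix size and the 16x16 block shape are assumptions chosen only for illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: offset of element (row, col) in a row-major matrix
// that has `width` columns.
__host__ __device__ inline int rowMajorOffset(int row, int col, int width)
{
    return row * width + col;
}

// Each thread handles one matrix element; x indexes columns, y indexes rows.
__global__ void scaleMatrix(float *m, int width, int height, float factor)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < height && col < width)  // skip threads outside the matrix
        m[rowMajorOffset(row, col, width)] *= factor;
}

int main()
{
    const int width = 100, height = 60;
    float *m;
    cudaMallocManaged(&m, width * height * sizeof(float));
    for (int i = 0; i < width * height; ++i)
        m[i] = 1.0f;

    dim3 block(16, 16);  // 16 * 16 = 256 threads per block
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    scaleMatrix<<<grid, block>>>(m, width, height, 2.0f);
    cudaDeviceSynchronize();

    cudaFree(m);
    return 0;
}
```

The same work could be done with a 1D launch by computing row = i / width and col = i % width from a flat index i; the 2D launch just lets the hardware hand you those coordinates.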

Regarding the second question, about the maximum number of threads: the number of threads in a block with dimensions (x,y,z) is x*y*z, and the maximum number of threads allowed per block is 1024. This means you can choose any values you want for x, y and z, provided that their product does not exceed 1024 and that x<=1024, y<=1024 and z<=64. In other words, (1024,1024,64) lists the limit for each dimension separately; it is not a block shape you can actually launch, and using more dimensions does not give you more threads per block.
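If it helps, the rule can be written down as a small host-side check. This is only a sketch: blockShapeIsValid is a helper name I made up, and the candidate shapes are arbitrary. The limits themselves are read from the maxThreadsPerBlock and maxThreadsDim fields of cudaDeviceProp, so the program reports whatever your own device allows.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if `block` respects both the per-dimension limits
// and the limit on the total number of threads per block.
bool blockShapeIsValid(dim3 block, int maxThreadsPerBlock, const int maxDim[3])
{
    const unsigned long long total = 1ULL * block.x * block.y * block.z;
    return total >= 1 &&
           block.x <= static_cast<unsigned>(maxDim[0]) &&
           block.y <= static_cast<unsigned>(maxDim[1]) &&
           block.z <= static_cast<unsigned>(maxDim[2]) &&
           total <= static_cast<unsigned long long>(maxThreadsPerBlock);
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no usable CUDA device found\n");
        return 1;
    }
    printf("maxThreadsPerBlock: %d\n", prop.maxThreadsPerBlock);
    printf("maxThreadsDim: (%d, %d, %d)\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);

    // Arbitrary example shapes; the last two break one of the two rules
    // on a device with the limits quoted in the question.
    const dim3 candidates[] = { dim3(32, 32, 1), dim3(16, 8, 8),
                                dim3(1024, 1024, 64), dim3(1, 1, 128) };
    for (const dim3 &b : candidates) {
        printf("(%u, %u, %u): %s\n", b.x, b.y, b.z,
               blockShapeIsValid(b, prop.maxThreadsPerBlock, prop.maxThreadsDim)
                   ? "valid" : "invalid");
    }
    return 0;
}
```

With the limits from the question, (32,32,1) and (16,8,8) both multiply out to exactly 1024 threads and pass, (1024,1024,64) fails on the product, and (1,1,128) fails on the z limit even though it only has 128 threads.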

  • I'm now looking at: Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535). Reading this together with the other figures in the question, does it mean that I can create a single grid with a total number of blocks = 2147483647 (always keeping x*y*z<=2147483647)? – Ofey Jun 7 at 14:35
