Understanding indexing and how many threads there are in a block

I'm studying CUDA programming and I've found that there is more than one way to index a grid.

What I don't understand is how these indexing techniques differ from each other.

These are the indexing schemes I have:

1D grid of 1D blocks

```
th = blockIdx.x * blockDim.x + threadIdx.x;
```

1D grid of 2D blocks

```
th = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x;
```

1D grid of 3D blocks

```
th = blockIdx.x * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;
```
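To make the relationship between the three formulas concrete, here is a minimal host-side sketch in plain C++ (no GPU needed). It assumes a hypothetical `Dim3` struct standing in for CUDA's built-in `blockDim` / `threadIdx` types, and the helper names `flatThreadId` and `idsAreUniqueAndDense` are made up for illustration. It only reproduces the arithmetic above: every thread of a 1D grid gets a distinct flat index, whether the block is 1D, 2D or 3D.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side stand-in for CUDA's built-in dim3 / uint3 types.
struct Dim3 { unsigned x, y, z; };

// Same arithmetic as the "1D grid of 3D blocks" formula. With blockDim.z == 1
// it reduces to the 2D formula, and with blockDim.y == blockDim.z == 1 to the 1D one.
unsigned flatThreadId(unsigned blockIdxX, Dim3 blockDim, Dim3 threadIdx) {
    // A thread index must lie inside the block in every dimension.
    assert(threadIdx.x < blockDim.x && threadIdx.y < blockDim.y && threadIdx.z < blockDim.z);
    unsigned threadsPerBlock = blockDim.x * blockDim.y * blockDim.z;
    unsigned idInBlock = threadIdx.z * blockDim.y * blockDim.x
                       + threadIdx.y * blockDim.x
                       + threadIdx.x;
    return blockIdxX * threadsPerBlock + idInBlock;
}

// Walks every thread of a 1D grid and checks that each flat id in
// [0, total) is produced exactly once.
bool idsAreUniqueAndDense(unsigned gridDimX, Dim3 blockDim) {
    unsigned total = gridDimX * blockDim.x * blockDim.y * blockDim.z;
    std::vector<int> hits(total, 0);
    for (unsigned b = 0; b < gridDimX; ++b)
        for (unsigned z = 0; z < blockDim.z; ++z)
            for (unsigned y = 0; y < blockDim.y; ++y)
                for (unsigned x = 0; x < blockDim.x; ++x) {
                    unsigned id = flatThreadId(b, blockDim, Dim3{x, y, z});
                    if (id >= total) return false;
                    ++hits[id];
                }
    for (int h : hits)
        if (h != 1) return false;
    return true;
}
```

For example, `flatThreadId(2, Dim3{4, 1, 1}, Dim3{3, 0, 0})` is the 1D case `2 * 4 + 3`. The flat index is the same kind of number in all three schemes; what changes is only how the threads inside a block are labelled.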

What is the advantage of using the second indexing scheme over the first one?

I also have trouble interpreting this information: "the maximum number of threads per block is 1024 and the max dimension size of a thread block (x,y,z) is (1024,1024,64)". What does it mean that blockDim.z can be at most 64? Are there only 64 threads, or 1024*64? What if I use all the dimensions? Does the number of threads that I can use in a grid increase?

• As already answered: it's just for your convenience. I work with 3D images, so arranging everything as 3D grids of 3D blocks is super convenient, as I can compute a thread `id` in each direction and use that for indexing – Ander Biguri Jun 7 at 10:17

Regarding the second question, about the maximum number of threads: the number of threads in a block with dimensions (x,y,z) is `x*y*z`. The maximum number of threads allowed per block is `1024`. This means you can choose any values for `x,y,z` provided that their product does not exceed `1024` and that `x,y<=1024` and `z<=64`. So blocks of (32,32,1) or (16,16,4) are valid, but (1024,1024,64) is not: using more dimensions does not raise the per-block thread limit.
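As a rough illustration of that rule, the check can be written as a small host-side C++ function. The name `isValidBlock` is hypothetical, and the limits are simply the ones quoted in the question; on a real device they should be read from `cudaGetDeviceProperties` (the `maxThreadsPerBlock` and `maxThreadsDim` fields) rather than hard-coded.

```cpp
// Limits as quoted in the question. These are assumptions for this sketch;
// real code should query them from the device instead of hard-coding them.
const unsigned kMaxThreadsPerBlock = 1024;
const unsigned kMaxDimX = 1024;
const unsigned kMaxDimY = 1024;
const unsigned kMaxDimZ = 64;

// Returns true if a block of shape (x, y, z) satisfies both constraints:
// each dimension is within its own limit AND the product x*y*z is at most 1024.
bool isValidBlock(unsigned x, unsigned y, unsigned z) {
    if (x == 0 || y == 0 || z == 0) return false;                   // empty block
    if (x > kMaxDimX || y > kMaxDimY || z > kMaxDimZ) return false; // per-dimension limits
    // Use a wider type so the product cannot overflow before the comparison.
    unsigned long long threads = static_cast<unsigned long long>(x) * y * z;
    return threads <= kMaxThreadsPerBlock;
}
```

With these limits, `isValidBlock(1, 1, 64)` holds (64 threads), `isValidBlock(16, 16, 4)` holds (exactly 1024 threads), and `isValidBlock(32, 32, 2)` fails because the product is 2048 even though each dimension is individually allowed.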