I'm studying CUDA programming and I've found that there is more than one way to index a grid.

What I don't understand is how these indexing techniques differ from each other.

These are the indexing schemes I have:

1D grid of 1D blocks

```
th = blockIdx.x * blockDim.x + threadIdx.x;
```

1D grid of 2D blocks

```
th = blockIdx.x * blockDim.x * blockDim.y + threadIdx.y * blockDim.x + threadIdx.x;
```

1D grid of 3D blocks

```
th = blockIdx.x * blockDim.x * blockDim.y * blockDim.z + threadIdx.z * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;
```
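To compare the three formulas in practice, a small test program can check that each one gives every thread a unique index. This is only a minimal sketch: the kernel `markIndex` and the helper `coversOnce` are hypothetical names, and the block shapes and block count are arbitrary values chosen for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical test kernel: each thread computes its global index with the
// "1D grid of 3D blocks" formula and counts one visit at that position.
// With blockDim.z == 1 the formula reduces to the 2D-block version, and with
// blockDim.y == blockDim.z == 1 it reduces to the 1D-block version.
__global__ void markIndex(int *hits, int n)
{
    int threadsPerBlock = blockDim.x * blockDim.y * blockDim.z;
    int th = blockIdx.x * threadsPerBlock
           + threadIdx.z * blockDim.y * blockDim.x
           + threadIdx.y * blockDim.x
           + threadIdx.x;
    if (th < n) atomicAdd(&hits[th], 1);
}

// Hypothetical helper: launches the kernel with the given block shape and
// returns true if every index in [0, n) was produced exactly once.
static bool coversOnce(dim3 block, int numBlocks)
{
    int n = numBlocks * block.x * block.y * block.z;
    int *dHits = nullptr;
    if (cudaMalloc(&dHits, n * sizeof(int)) != cudaSuccess) return false;
    cudaMemset(dHits, 0, n * sizeof(int));

    markIndex<<<numBlocks, block>>>(dHits, n);

    std::vector<int> hits(n);
    bool ok = cudaMemcpy(hits.data(), dHits, n * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dHits);

    for (int i = 0; ok && i < n; ++i) ok = (hits[i] == 1);
    return ok;
}

int main()
{
    // All three shapes contain 64 threads per block; only the layout differs.
    printf("1D blocks: %s\n", coversOnce(dim3(64), 4) ? "ok" : "FAILED");
    printf("2D blocks: %s\n", coversOnce(dim3(8, 8), 4) ? "ok" : "FAILED");
    printf("3D blocks: %s\n", coversOnce(dim3(4, 4, 4), 4) ? "ok" : "FAILED");
    return 0;
}
```

All three launches use the same number of threads per block. The block shape only changes how `threadIdx` is split across x, y and z.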

What is the advantage of using the second type of indexing over the first one?

I also have trouble interpreting this statement: "the maximum number of thread per block is 1024 and the max dimension size of thread block (x,y,z) is (1024,1024,64)". What does it mean that the limit for `blockDim.z` is 64? Are there only 64 threads, or 1024*64? What if I use all three dimensions? Does the number of threads that I can use in a grid increase?
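One way to see how the two limits in the quoted sentence interact is to read them from the device and test a few block shapes against them. This is a minimal sketch, assuming device 0 is the GPU of interest. The helper `blockFits` is a hypothetical name, and it treats the per-dimension maxima and the per-block total as two separate conditions that must both hold.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: a block shape is accepted only if each dimension is
// within maxThreadsDim AND the product x*y*z is within maxThreadsPerBlock.
static bool blockFits(const cudaDeviceProp &p, dim3 b)
{
    bool perDim = b.x <= (unsigned)p.maxThreadsDim[0]
               && b.y <= (unsigned)p.maxThreadsDim[1]
               && b.z <= (unsigned)p.maxThreadsDim[2];
    unsigned long long total = (unsigned long long)b.x * b.y * b.z;
    return perDim && total <= (unsigned long long)p.maxThreadsPerBlock;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("could not query device 0\n");
        return 1;
    }

    printf("maxThreadsPerBlock = %d\n", prop.maxThreadsPerBlock);
    printf("maxThreadsDim      = (%d, %d, %d)\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
    printf("maxGridSize        = (%d, %d, %d)\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    // Arbitrary example shapes, chosen only for illustration.
    dim3 candidates[] = { dim3(1024, 1, 1), dim3(32, 32, 1),
                          dim3(4, 4, 64),   dim3(16, 16, 64) };
    for (const dim3 &b : candidates) {
        printf("block (%u, %u, %u) -> %u threads: %s\n",
               b.x, b.y, b.z, b.x * b.y * b.z,
               blockFits(prop, b) ? "fits" : "exceeds a limit");
    }
    return 0;
}
```

The total number of threads in a grid is the block size multiplied by the number of blocks, so it is bounded by `maxGridSize` as well as by the per-block limit.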

`id` in each direction and use that for indexing – Ander Biguri Jun 7 at 10:17