Why is ceil used here and what purpose does it serve?

Question

I was looking at some Game of Life GPU code and could not understand why `ceil` is used in the following:

  dim3 cpyBlockSize(BLOCK_SIZE,1,1);

  dim3 cpysimulationRowssimulationSize((int) ceil (size/(float) cpyBlockSize.x), 1, 1);
  dim3 cpysimulationColssimulationSize((int) ceil ((size+2) / (float) cpyBlockSize.x), 1, 1);

`ceil` is rounding up here. It guarantees that there are enough blocks (and therefore enough threads) to cover the entire working set. This is a pretty basic CUDA concept, so you will find many descriptions of this rounding up approach when choosing the number of CUDA blocks to launch. [Here](https://stackoverflow.com/questions/26217294/should-i-check-the-number-of-threads-in-kernel-code/26217725#26217725) is one example write-up. — Robert Crovella, Apr 22 '19 at 22:50

score 0 · Answer 1 · edited Apr 23 '19 at 07:12

Hard to tell without much context but this is my best guess:

Integer division rounds down (it discards the fractional part), but the required behavior here appears to be rounding `size / cpyBlockSize.x` up. Casting `cpyBlockSize.x` to `float` turns the division into a floating-point division, so the result keeps its fractional part. `ceil` then rounds that result up to the next whole number, and the `(int)` cast converts it back to an integer.

For example, in most programming languages (C included):

> print (5/2)
> 2

> print (5/(float)2)
> 2.5

> print (ceil(5/(float)2))
> 3
