1. Write a CUDA kernel that increments a float array A of size N on a 1D grid of 1D thread
blocks, assuming that each thread increments one element.
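Solution:
__global__ void increment_On_Device(float *A_d, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global index in the 1D grid
    if (i < N) A_d[i] += 1.0f;  // guard the last block; increment by 1 is assumed
}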
Fig: 1
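The kernel only makes sense together with a launch that covers all N elements, so a minimal host-side sketch may help. It is not part of the original solution: the array size, the block size of 256 threads, the host array name A_h, and the increment of 1.0f are all assumptions chosen for illustration. The grid size is rounded up so that the last, partially filled block is still launched, which is why the kernel needs the bounds check.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel for the exercise: one thread per element.
// Incrementing by 1.0f is an assumption; the exercise does not state the amount.
__global__ void increment_On_Device(float *A_d, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global index in the 1D grid
    if (i < N) A_d[i] += 1.0f;                      // skip threads past the end of A
}

int main()
{
    const int N = 1000;               // hypothetical array size
    const int threadsPerBlock = 256;  // hypothetical block size
    // Round up so every element gets a thread even if N is not a multiple of 256.
    const int blocks = (N + threadsPerBlock - 1) / threadsPerBlock;
    const size_t bytes = N * sizeof(float);

    // Host array, initialised to 0, 1, 2, ...
    float *A_h = (float *)malloc(bytes);
    if (A_h == NULL) return 1;
    for (int i = 0; i < N; ++i) A_h[i] = (float)i;

    // Device array and host-to-device copy.
    float *A_d = NULL;
    if (cudaMalloc(&A_d, bytes) != cudaSuccess) { free(A_h); return 1; }
    cudaMemcpy(A_d, A_h, bytes, cudaMemcpyHostToDevice);

    // Launch a 1D grid of 1D blocks.
    increment_On_Device<<<blocks, threadsPerBlock>>>(A_d, N);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        cudaFree(A_d);
        free(A_h);
        return 1;
    }

    // Copy back (this call waits for the kernel to finish) and verify.
    cudaMemcpy(A_h, A_d, bytes, cudaMemcpyDeviceToHost);
    int mismatches = 0;
    for (int i = 0; i < N; ++i)
        if (A_h[i] != (float)i + 1.0f) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(A_d);
    free(A_h);
    return mismatches != 0;
}
```

With these assumed values the launch uses 4 blocks of 256 threads, that is 1024 threads for 1000 elements; the 24 surplus threads fail the i < N test and do nothing.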