2 Replies Latest reply on Apr 14, 2016 2:58 PM by Intel Corporation

    OpenCL optimization with HD Graphics 4600


      I wrote an OpenCL program that runs on Intel HD Graphics 4600 processor graphics. Querying clGetDeviceInfo reports 20 compute units. The work items within a work group run on a single compute unit, and there are 20 compute units, so I assume 20 or more work groups can be active on the 4600 at once. However, I cannot find any information about the processing elements in each compute unit. If the work group size is 256, there are 256 threads (work items) in each work group. How do these threads execute within a compute unit? And from a performance perspective, which work group size is preferred: 64, 128, or 256? Thanks.
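To make the arithmetic behind the question concrete, here is a minimal sketch in plain C. It does not call OpenCL. The helper names are hypothetical, and the SIMD width is an assumption: in a real program it could be taken from clGetKernelWorkGroupInfo with CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, and the compute unit count from clGetDeviceInfo with CL_DEVICE_MAX_COMPUTE_UNITS.

```c
#include <stddef.h>

/* Hypothetical helper: number of work groups needed to cover
 * global_size work items when each group holds local_size items
 * (rounded up). */
static size_t work_groups_needed(size_t global_size, size_t local_size)
{
    return (global_size + local_size - 1) / local_size;
}

/* Hypothetical helper: number of SIMD-wide hardware threads that one
 * work group occupies, ASSUMING the compiler packs simd_width work
 * items into each hardware thread. simd_width is an assumption here,
 * not a value taken from the post. */
static size_t hw_threads_per_group(size_t local_size, size_t simd_width)
{
    return (local_size + simd_width - 1) / simd_width;
}
```

With an assumed SIMD width of 16, hw_threads_per_group gives 4, 8, and 16 hardware threads for work group sizes 64, 128, and 256, and a global size of 1 << 20 needs 16384, 8192, and 4096 work groups respectively. The sketch only counts; which size is actually fastest would still have to be measured on the device.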