2 Replies Latest reply on Apr 19, 2018 11:24 PM by Intel Corporation

    ResourceExhaustedError in TensorFlow


      I am trying to allocate a [100,3000,32] tensor, but I get this error:


      ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[100,3000,32] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu

      [[Node: training/gradients/conv1/bias_relu/GatherV2_774_grad/transpose_1 = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/gradients/conv1/bias_relu/GatherV2_774_grad/UnsortedSegmentSum, training/gradients/conv1/bias_relu/GatherV2_2999_grad/concat_1)]]

      Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
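
The hint in the error message can be followed roughly as below. This is a minimal sketch assuming the TensorFlow 1.x session API; `run_with_oom_report` is a hypothetical helper name, and `sess`, `fetches`, and `feed_dict` stand for whatever session, ops, and inputs the training script already uses.

```python
import tensorflow as tf  # TensorFlow 1.x API; under TF 2.x the class is tf.compat.v1.RunOptions


def run_with_oom_report(sess, fetches, feed_dict=None):
    """Hypothetical helper: run one step with OOM allocation reporting enabled."""
    # With this option set, an OOM error message also lists the tensors
    # that were allocated at the moment the allocation failed.
    run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)
    return sess.run(fetches, feed_dict=feed_dict, options=run_options)
```

The option does not prevent the failure; it only makes the next traceback show which tensors were holding memory.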


      Is there any way to avoid this? I have tried submitting the job with larger memory limits, but it still crashes whenever the last dimension is bigger than ~2000.
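
For scale, a back-of-the-envelope estimate of the memory involved may help. The sketch below computes the size of one float32 tensor of the reported shape. The second figure rests on an assumption, not on anything confirmed by the traceback: the node names `GatherV2_774` and `GatherV2_2999` suggest on the order of 3000 gather ops, and the estimate supposes that each one keeps a dense buffer of this shape alive at the same time.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor (4 bytes per element for float32)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


shape = (100, 3000, 32)
one_tensor = tensor_bytes(shape)
print(f"one tensor: {one_tensor / 2**20:.1f} MiB")  # about 36.6 MiB

# Assumption (not confirmed by the traceback): roughly 3000 buffers of this
# shape are live at once, one per GatherV2 op in the graph.
n_ops = 3000
total = n_ops * one_tensor
print(f"{n_ops} tensors: {total / 2**30:.1f} GiB")
```

A single tensor of this shape is small, so if the assumption holds, the number of simultaneously live copies would matter far more than the size of any one allocation.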


      Also, this dimension is not related to the number of training samples, and I am already training in batches.


      Thank you,


      -Luana Ruiz