1 Reply Latest reply on Jan 15, 2015 3:22 PM by joe_intel

    Direct3D11 DXVA behavior issues on some Atom chipsets; audio/video synchronization lost in decoding


      In a Windows Store app that our company is developing, we have encountered a problem whereby if the user repeatedly adjusts the layout during playback of MPEG2 content, the synchronization between audio and video is lost. This lip-sync error persists even if the layout subsequently remains static. The problem occurs on two devices, one with the Atom Z3735F chipset and the other with the Atom Z3795 chipset.


      When AVC content is played back, if a lip-sync error develops, frames are dropped and synchronization between audio and video is restored. Working on the assumption that our MPEG2 decoder must likewise adjust to the current playback quality, we investigated and discovered the following.


      1. Investigation into conditions of sync loss occurrence

      We deliberately delayed the clock to determine the conditions under which sync was lost.



      * If the video frame's PTS is delayed, the sync loss could be reproduced.

      If the PTS is delayed by 1s during playback, lip sync is lost and is not subsequently restored. Introducing a delay of several seconds on the output of the decoder also reproduced the loss of sync. Finally, the problem was not reproduced using DirectX9 on a Desktop version, but it did occur with DirectX11.
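
For anyone who wants to repeat the delay experiment, here is a minimal sketch of one way to inject the PTS offset. It assumes timestamps are in the 100-nanosecond units that Media Foundation uses for sample times; `delayed_pts` and its parameters are hypothetical names for illustration, not part of our decoder.

```cpp
#include <cstdint>

// Assumption: timestamps are in 100-nanosecond units (Media Foundation sample time).
constexpr std::int64_t kHundredNsPerSecond = 10000000;

// Hypothetical helper: once playback reaches inject_after, shift every
// presentation timestamp later by delay_seconds. Earlier timestamps are
// returned unchanged so playback starts normally.
constexpr std::int64_t delayed_pts(std::int64_t pts,
                                   std::int64_t inject_after,
                                   std::int64_t delay_seconds) {
    return pts >= inject_after ? pts + delay_seconds * kHundredNsPerSecond
                               : pts;
}
```

Calling `delayed_pts(pts, 5 * kHundredNsPerSecond, 1)` on each output sample would leave the first five seconds untouched and delay everything after that by 1s.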



      2. Investigation into what's causing the delay

      Logging clock times, we established what was consuming time when sync was lost.



      A) Sample requests from the renderer are late

      At first, ProcessOutput is called with an interval of 2-3ms to try to catch up. Slowly, the interval increases and even though the video still lags, the final interval is in the range of 30-40ms. For normal and correct playback, the request interval should be 33-34ms for the frame rate. Under normal playback, if synchronization slips, requests should come at intervals of 2-3ms until synchronization is restored.
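
The interval logging described above can be sketched roughly as follows. This is an illustration only: `RequestIntervalLog`, `classify`, and the 10ms cut-off between "catch-up" and "frame-paced" requests are assumptions of this sketch, chosen so that 2-3ms requests and 33-34ms requests fall on opposite sides.

```cpp
#include <chrono>
#include <vector>

using Clock = std::chrono::steady_clock;
using Millis = std::chrono::duration<double, std::milli>;

enum class Pace { CatchUp, FramePaced };

// Assumption: a request arriving less than catch_up_limit after the previous
// one is the renderer trying to catch up; anything slower is frame-paced.
inline Pace classify(Millis interval, Millis catch_up_limit = Millis(10.0)) {
    return interval < catch_up_limit ? Pace::CatchUp : Pace::FramePaced;
}

// Records the time between consecutive ProcessOutput requests.
class RequestIntervalLog {
public:
    // Call at the start of every request; pass Clock::now() in real use.
    void record(Clock::time_point now) {
        if (has_last_) {
            intervals_.push_back(Millis(now - last_));
        }
        last_ = now;
        has_last_ = true;
    }

    const std::vector<Millis>& intervals() const { return intervals_; }

private:
    Clock::time_point last_{};
    bool has_last_ = false;
    std::vector<Millis> intervals_;
};
```

With a log like this, the behavior we are reporting shows up as a run of `CatchUp` intervals that turns into `FramePaced` intervals while the video is still behind the audio.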


      B) A Direct3D11 DXVA interface that is used internally by the decoder can take 31ms to complete

      When requests from the renderer begin to arrive late, ID3D11VideoContext::DecoderBeginFrame, which is called to move processing from the renderer to the decoder, takes as long as 31ms to complete. Ordinarily this call completes within one millisecond.
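
A minimal sketch of how a call like this can be timed, using only the standard library. `time_call` is a hypothetical helper; in the decoder, the callable would wrap the real DecoderBeginFrame call, which is not shown here.

```cpp
#include <chrono>
#include <utility>

using Millis = std::chrono::duration<double, std::milli>;

// Hypothetical helper: runs fn once and returns how long it took in
// milliseconds, measured on a monotonic clock.
template <typename Fn>
Millis time_call(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    return Millis(std::chrono::steady_clock::now() - start);
}
```

The returned duration is what we compared against the usual sub-millisecond completion time.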


      3. Workarounds in the Decoder

      When DecoderBeginFrame() takes a long time to return, we tried reducing the workload until sync was restored.



      A) Drop a frame in the decoder

      If DecoderBeginFrame() takes more than 10ms, do not decode the current frame and do not pass data to the renderer. DecoderBeginFrame() subsequently does not take a long time, and lip sync is restored.

      B) Deactivate deinterlace processing

      If DecoderBeginFrame() takes more than 10ms, set the de-interlace flag to false for one frame before passing the data to the renderer. DecoderBeginFrame() subsequently does not take a long time, and lip sync is restored.
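
The decision logic behind both workarounds can be sketched as below. The 10ms threshold is the one described above; `FramePlan`, `Workaround`, and `plan_frame` are hypothetical names for this illustration, and the sketch only models the decision, not the actual Direct3D11 calls.

```cpp
#include <chrono>

using Millis = std::chrono::duration<double, std::milli>;

enum class Workaround { DropFrame, SkipDeinterlace };

// What to do with the current frame.
struct FramePlan {
    bool decode = true;       // decode the frame and pass it to the renderer
    bool deinterlace = true;  // de-interlace flag passed along with the frame
};

// If the measured DecoderBeginFrame() time exceeds the threshold, reduce the
// workload for this one frame according to the selected workaround.
inline FramePlan plan_frame(Millis begin_frame_time,
                            Workaround mode,
                            Millis threshold = Millis(10.0)) {
    FramePlan plan;
    if (begin_frame_time > threshold) {
        if (mode == Workaround::DropFrame) {
            plan.decode = false;       // workaround A: skip this frame entirely
        } else {
            plan.deinterlace = false;  // workaround B: one frame without de-interlacing
        }
    }
    return plan;
}
```

Because the plan is recomputed for every frame, normal processing resumes as soon as DecoderBeginFrame() returns quickly again.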



      4. In summary

      A) Even while the video is lagging, the interval between ProcessOutput requests increases. At first, when the video lags and sync between audio and video is lost, the decoder receives multiple immediate requests for data, but the request interval grows before synchronization can be restored. We suspect that DXVA acceleration is used not just by the decoder but also by the renderer, and hope that Intel can explain the driver mechanism that leads to this behavior.

      B) When (A) occurs, DecoderBeginFrame() takes approximately 30ms. What internal mechanism is causing this delay?

      C) May we have documentation related to the above?

      D) Could you explain the mechanism by which the workarounds in (3) address issues (1) and (2)? Additionally, please advise how we can restore audio/video synchronization without the workarounds in (3).