3 Replies Latest reply on Apr 5, 2018 10:15 AM by Intel Corporation

    Instruction specification error in "Intel 64 and IA-32 Architectures Software Developer's Manual"  Vol 2

    sdasgup3

      Bug Report 1: vpsravd %xmm3, %xmm2, %xmm1

       

      Semantics as per Intel Manual

       

      %ymm1  : 0x0₁₂₈ ∘ ((%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])) ∘

                        ((%ymm2[95:64] sign_shift_right %ymm3[95:64]) ∘

                        ((%ymm2[63:32] sign_shift_right %ymm3[63:32]) ∘

                        (%ymm2[31:0] sign_shift_right %ymm3[31:0]))))

       

       

      ** ∘ is the concatenation operator here.

       

      Note that the first term, (%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])), selects only 5 bits from '%ymm3'.

      But the actual execution behaviour seems to expect all 32 bits from %ymm3, i.e., (%ymm2[127:96] sign_shift_right %ymm3[127:96]).

       

      The following is the pseudocode from the manual:

       

       

      VPSRAVD (VEX.128 version)

      COUNT_0 = SRC2[31 : 0]

      (* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2*)

      COUNT_3 = SRC2[100 : 96]; //<------------------------------------- Possibly a bug

      DEST[31:0] = SignExtend(SRC1[31:0] >> COUNT_0);

      (* Repeat shift operation for 2nd through 4th dwords *)

      DEST[127:96] = SignExtend(SRC1[127:96] >> COUNT_3);

      DEST[MAXVL-1:128] = 0;

       

       

      I believe the line marked "Possibly a bug" above is an error and should read COUNT_3 = SRC2[127 : 96].

       

      Test Input (hex)

       

      %ymm2:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  - **80 00 00 00** - 00 00 00 00 - 00 00 00 00 - 00 00 00 00

      %ymm3:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  - **00 00 00 20** - 00 00 00 00  - 00 00 00 00  - 00 00 00 00

      As per the manual, we should select just the low 5 bits of `00 00 00 20`, which gives a shift count of 0, whereas the hardware execution semantics use all 32 bits, which gives a shift count of 32.

       

       

      Output as per manual

       

       

        0x0₁₂₈ ∘ ((0x80000000₃₂ sign_shift_right 0x0₃₂) ∘

                ((0x0₃₂ sign_shift_right 0x0₃₂) ∘

                ((0x0₃₂ sign_shift_right 0x0₃₂) ∘

                (0x0₃₂ sign_shift_right 0x0₃₂))))

       

       

      Output as per actual Hardware

       

       

          0x0₆₄ ∘ 0x0₆₄ ∘ 0xffffffff00000000₆₄ ∘ 0x0₆₄
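
The two outputs above can be reproduced with a small software model. This is only a sketch, not Intel's reference code: the helper names `sar32` and `vpsravd_128` and the `count3_bits` parameter are made up for illustration, and the model assumes that a count above 31 fills the lane with the sign bit, which is the documented behaviour of VPSRAVD for out-of-range counts.

```python
MASK32 = 0xFFFFFFFF

def sar32(value, count):
    """Arithmetic right shift of one 32-bit lane.

    Counts above 31 fill the lane with the sign bit, which is the same
    result as shifting by 31.
    """
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> min(count, 31)) & MASK32

def vpsravd_128(src1, src2, count3_bits=32):
    """Model VPSRAVD (VEX.128) on lists of four dwords, index 0 = bits 31:0.

    count3_bits is a hypothetical knob for this sketch: 5 follows the
    manual's pseudocode as written (COUNT_3 = SRC2[100:96]), 32 follows
    the behaviour observed on hardware (COUNT_3 = SRC2[127:96]).
    """
    counts = list(src2)
    counts[3] = src2[3] & ((1 << count3_bits) - 1)
    return [sar32(v, c) for v, c in zip(src1, counts)]

# Test input from the report: only dword 3 (bits 127:96) is non-zero.
src1 = [0, 0, 0, 0x80000000]  # %xmm2
src2 = [0, 0, 0, 0x00000020]  # %xmm3, shift count 32 in dword 3

as_manual = vpsravd_128(src1, src2, count3_bits=5)
as_hardware = vpsravd_128(src1, src2, count3_bits=32)

print([hex(x) for x in as_manual])    # dword 3 stays 0x80000000
print([hex(x) for x in as_hardware])  # dword 3 becomes 0xffffffff
```

With the 5-bit reading, the count for the top dword is 0x20 & 0x1F = 0, so the value is unchanged. With the 32-bit reading, the count is 32 and the lane is filled with the sign bit.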

       

       

       

      Bug Report 2: packsswb

      There seems to be a bug in the descriptive text:

      "If the signed doubleword value is beyond the range of an unsigned word (i.e. greater than 7FH or less than 80H), ..."

       

      In my opinion, the description should say "range of a signed word" instead.
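
To illustrate why "unsigned" looks wrong in the quoted sentence, here is a minimal sketch of signed saturation as PACKSSWB applies it to each element (a signed 16-bit input narrowed to a signed 8-bit result). The function name is hypothetical. The point is only that the quoted bounds 7FH and 80H are signed limits (+127 and -128), not unsigned ones.

```python
def saturate_signed_word_to_byte(word):
    """Narrow a 16-bit pattern, read as signed, to a signed byte with saturation."""
    signed = word - 0x10000 if word & 0x8000 else word
    clamped = max(-128, min(127, signed))
    return clamped & 0xFF  # return the 8-bit pattern

print(hex(saturate_signed_word_to_byte(0x0100)))  # +256 saturates to 0x7f
print(hex(saturate_signed_word_to_byte(0xFF00)))  # -256 saturates to 0x80
print(hex(saturate_signed_word_to_byte(0x0042)))  # in range, stays 0x42
```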