4 Replies Latest reply on Dec 5, 2011 6:14 PM by mwvantol

    Floating point operation

    junghyun

      Is there anyone playing with doubles?

       

      I posted a question about floating point operations before.

       

      However, it was not clearly resolved.

       

       

      These days, I'm porting some applications onto my runtime system.

       

      Unfortunately, one application does not produce the same result as it does on another x86 multicore machine.

       

      I investigated this in depth and found one thing: it produces some different values.

       

      I printed the values with printf("%e"), because with printf("%lf") the values look the same.
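A small sketch of why the format matters when comparing output, assuming a C99 printf. The helper name same_when_printed is made up for illustration and the test values are arbitrary. "%lf" behaves like "%f" in printf and shows six digits after the decimal point, so small values can look identical; "%e" keeps six digits after the leading digit; "%.17g" is enough digits to tell any two different doubles apart.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a and b produce identical text under the given printf
 * format, 0 otherwise. fmt must contain exactly one double conversion. */
int same_when_printed(const char *fmt, double a, double b)
{
    char sa[64], sb[64];

    snprintf(sa, sizeof sa, fmt, a);
    snprintf(sb, sizeof sb, fmt, b);
    return strcmp(sa, sb) == 0;
}
```

For example, same_when_printed("%f", 1.0e-10, 1.1e-10) reports a match while "%e" does not, and "%e" in turn hides a one-ulp difference near 1.0 that "%.17g" reveals.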

       

      I attached a vimdiff screenshot (diff1.png).

       

       

       

      After a Google search, I found this article: http://www.network-theory.co.uk/docs/gccintro/gccintro_70.html

       

      After I applied the technique described there, the values are now the same.

       

      Can somebody explain why?

       

      Note that I run the application on a single core, not multithreaded.

       

       

      My x86 machine has gcc 4.1.2 and an Intel Xeon X5660.

       

      Is it a compiler bug or a hardware bug?

       

      It is possible that a newer libc applies the technique internally.

       

       

      Ted, can you tell me whether the FPU in the SCC behaves the same as the FPU in recent processors for general floating point operations?

        • 1. Re: Floating point operation
          tedk

          If I'm understanding this correctly I think the differences you see are minor and due to how the numbers are calculated internally.

          I notice that on my MCPC, the floating point control register returns 0x37f. And I see 0x37f on the SCC core as well.

           

          What technique are you referring to?  The link refers to the Brian Gough book (I actually own this book). And I think what you are doing is forcing the fp control register to be 0x27f, which means that the internal calculations are performed in double precision.

           

          I think sometimes an OS or even the compiler may set the FP control word for you. And if in one case it is set to 0x27f and in another to 0x37f, you will see different values.

           

          Bits 8 and 9 of the fp control word determine the precision control, that is, the internal precision of the significand during arithmetic operations. The lower 6 bits are the exception masks (a set bit masks the corresponding exception).

           

          1  1  1  1  1  1
          5  4  3  2  1  0  9  8  7  6  5  4  3  2  1  0
          ----------------------------------------------
          0  0  0  0  0  0  1  1  0  1  1  1  1  1  1  1

           

          11 is extended precision (a 64-bit significand), so internal calculations are performed in the 80-bit extended precision format (sign: 1; exponent: 15; significand: 64). So if you are returning a double, the internal calculations are done in a higher precision and the rounding to double occurs only when the value is returned or stored to memory. The floating point stack holds 80-bit numbers.

           

          On the other hand, if your floating point control word is set to 0x27f then the internal precision is double. A double is 64 bits (sign: 1; exponent: 11; significand: 52). Note that this is sometimes called 53-bit precision because the integer bit is implicit in the double format (it's always 1 so it is not stored). In extended precision mode, the integer bit is explicit. And you most likely will see different results. When the control word is 0x37f, the result is generally more accurate.
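To make the bit layout concrete, here is a minimal sketch that decodes a control word value in portable C. The function names are hypothetical; the field encodings (00 single, 01 reserved, 10 double, 11 extended) are the ones documented for the x87 control word.

```c
/* Precision-control (PC) field: bits 8 and 9 of the x87 control word. */
unsigned x87_precision_control(unsigned cw)
{
    return (cw >> 8) & 0x3u;
}

/* Significand width in bits, counting the integer bit, selected by the
 * PC field. Returns 0 for the reserved encoding 01. */
int x87_significand_bits(unsigned cw)
{
    switch (x87_precision_control(cw)) {
    case 0:  return 24;   /* 00: single precision   */
    case 2:  return 53;   /* 10: double precision   */
    case 3:  return 64;   /* 11: extended precision */
    default: return 0;    /* 01: reserved           */
    }
}

/* Exception mask bits: the low six bits of the control word. */
unsigned x87_exception_masks(unsigned cw)
{
    return cw & 0x3Fu;
}
```

With these, 0x37f decodes to a 64-bit significand and 0x27f to a 53-bit significand, and both values have all six exception mask bits set.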

           

          I can read the fp control word on my MCPC as follows (using gcc and that strange (in my opinion) gnu asm syntax)

          #include <stdio.h>

          int main(void)
          {
              /* fstcw stores the 16-bit x87 control word to memory */
              unsigned short cw = 0;
              __asm__ __volatile__("fstcw %0" : "=m" (cw));
              printf("cw: %x\n", cw);
              return 0;
          }

           

          And then on the SCC core as follows (using icc and the more logical (in my opinion) MASM syntax)

           

          #include <stdio.h>

          /* Return the x87 control word (icc MASM-style inline asm) */
          short get_fnstcw(void)
          {
              short myfpcw = 0;
              __asm { fnstcw myfpcw }
              return myfpcw;
          }

          int main(void)
          {
              short scc_fpcw = get_fnstcw();
              printf("scc_fpcw: %x\n", scc_fpcw);
              return 0;
          }

           

          I don't think there is a hw or a compiler bug. I think this is the way it works.

           

          Message was edited by: Ted Kubaska corrected typo

          • 2. Re: Floating point operation
            junghyun

            Yes, I understand what you say.

             

            It could be a minor difference, but I am curious what makes the difference.

             

            Is this because of a difference in the FPU?

            • 3. Re: Floating point operation
              tedk

              The difference is due to how the FPU is configured, that is, what the precision control in the floating point control register is set to. And you can configure both machines the same way. What I tried to show in my last post was how to read that control register and see what it is set to. However, you have to be careful because some compilers will set it for you, and they may set it to something you do not expect. But you can explicitly set it to a known value before you do your floating point calculations. I think that's what you did with the link you posted. That Brian Gough book you linked to is a very good one, but you can get more information about the Pentium floating point control register in the original Pentium manuals or in an older book called Numerical Programming the 387, 486, and Pentium by Sanchez and Canton.
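A minimal sketch of setting the precision control explicitly, assuming GCC or Clang inline asm on an x86 target. The helper and macro names are made up for illustration and this is not necessarily the exact code from the linked page. Note that on x86-64 builds double arithmetic normally goes through SSE, so the setting only affects code that really uses x87 instructions (for example a 32-bit build with -mfpmath=387, or long double).

```c
#define X87_PC_MASK     0x0300u  /* bits 8 and 9: precision control */
#define X87_PC_DOUBLE   0x0200u  /* 10: 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 11: 64-bit significand */

/* Read the current x87 control word. */
unsigned short x87_get_cw(void)
{
    unsigned short cw;
    __asm__ __volatile__("fnstcw %0" : "=m" (cw));
    return cw;
}

/* Load a new x87 control word. */
void x87_set_cw(unsigned short cw)
{
    __asm__ __volatile__("fldcw %0" : : "m" (cw));
}

/* Change only the precision-control field and return the previous
 * control word so that the caller can restore it later. */
unsigned short x87_set_precision(unsigned pc)
{
    unsigned short old = x87_get_cw();
    x87_set_cw((unsigned short)((old & ~X87_PC_MASK) | (pc & X87_PC_MASK)));
    return old;
}
```

Calling x87_set_precision(X87_PC_DOUBLE) at the start of the calculation and x87_set_cw(saved) afterwards keeps the change local. Starting from 0x37f, this leaves the control word at 0x27f during the calculation.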

               

              You might look into whether your calculations are using an FMA instruction, the fused multiply-add instruction. With an FMA, rounding occurs only once, after the multiply and the add together. Without the FMA, rounding occurs after the multiply and then again after the add, which can give a slightly different final result.

              • 4. Re: Floating point operation
                mwvantol

                junghyun wrote:

                 

                It could be a minor difference, but I am curious what makes the difference.

                 

                Is this because of a difference in the FPU?

                 

                Well, what I gathered from the link that you dug up is that how the x87 FPU is set up by default differs between operating systems (e.g. *BSD sets it to the compatible, double-precision rounding mode by default). Another possibility is that, as your other machine is a recent architecture, the compiler decided to use SSE instructions instead of x87, which always use the compatible rounding mode according to the page that you referred to.
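One way to check which of the two cases a given build falls into is the C99 macro FLT_EVAL_METHOD from float.h. The sketch below only reports what the compiler claims; the helper names are made up. With GCC I would expect 0 for builds that use SSE for doubles (for example -mfpmath=sse -msse2, or x86-64 by default) and 2 for 32-bit builds that use the x87 unit, but that is an assumption to verify on each machine.

```c
#include <float.h>
#include <stdio.h>

/* Map a FLT_EVAL_METHOD value to its meaning as defined by C99. */
const char *describe_eval_method(int method)
{
    switch (method) {
    case 0:  return "each operation is rounded to its own type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}

/* Print the evaluation method this translation unit was compiled with. */
void print_eval_method(void)
{
    printf("FLT_EVAL_METHOD = %d: %s\n", (int)FLT_EVAL_METHOD,
           describe_eval_method((int)FLT_EVAL_METHOD));
}
```

Compiling the same file on both machines and comparing the printed value is a quick first check before looking at the control word itself.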