4 Replies Latest reply on Sep 29, 2011 3:57 PM by Kostrov

A problem related to floating point arithmetic on Intel's FPUs? Or, something else?

Hello everybody,

I have detected a problem that may be related to floating point arithmetic on Intel's FPUs ( Floating Point Units ), or it could be related to something else that I simply missed.

In general, I'm very surprised by my results, and I wonder if Intel's engineers have any idea what could possibly be wrong.

Best regards,

Sergey

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Let's say two square 8x8 matrices need to be multiplied. Here are the test matrices:

// Matrix A - 8x8 - 'float' type:

101.0   201.0  301.0   401.0   501.0   601.0  701.0   801.0
901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

// Matrix B - 8x8 - 'float' type:

101.0   201.0  301.0   401.0   501.0   601.0  701.0   801.0
901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

In two Test-Cases the classic 3-loop matrix multiplication algorithm is used, once with rolled and once with unrolled loops:

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Test-Case 1 - Rolled Loops - 1-in-1

for( i = 0; i < 8; i++ )
{
    for( j = 0; j < 8; j++ )
    {
        fC[i][j] = 0.0f;
        for( k = 0; k < 8; k++ )
        {
            fC[i][j] += ( fA[i][k] * fB[k][j] );
        }
    }
}

// Matrix C = Matrix A * Matrix B - 8x8 - 'float' type

13826808.0    14187608.0   14548408.0   14909208.0   15270008.0   15630808.0  15991608.0   16352408.0
32393208.0    33394008.0   34394808.0   35395608.0   36396408.0   37397208.0  38398008.0   39398808.0
>50959604.0<  52600404.0   54241204.0   55882004.0   57522804.0   59163604.0  60804404.0   62445204.0
69526008.0    71806808.0   74087608.0   76368408.0   78649208.0   80930008.0  83210808.0   85491608.0
88092408.0    91013208.0   93934008.0   96854808.0   99775608.0 102696408.0 105617208.0 108538008.0
106658808.0   110219608.0 113780408.0 117341208.0 120902008.0 124462808.0 128023608.0 131584408.0
125225208.0   129426008.0 133626808.0 137827616.0 142028400.0 146229216.0 150430000.0 154630816.0
143791600.0   148632416.0 153473200.0 158314016.0 163154800.0 167995616.0 172836416.0 177677200.0

I marked the incorrect value as '>50959604.0<'. For some reason it is NOT '50959608.0'. There are also a couple more incorrect values ( in the last two rows ) because the last two digits of every value in Matrix C must be '08'. They can't be '...00' or '...16'.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Test-Case 2 - Unrolled Loops - 4-in-1

for( i = 0; i < 8; i++ )
{
    for( j = 0; j < 8; j++ )
    {
        fC[i][j] = 0.0f;
        for( k = 0; k < 8; k += 4 )
        {
            fC[i][j] += ( fA[i][k  ] * fB[k  ][j] ) +
                        ( fA[i][k+1] * fB[k+1][j] ) +
                        ( fA[i][k+2] * fB[k+2][j] ) +
                        ( fA[i][k+3] * fB[k+3][j] );
        }
    }
}

// Matrix C = Matrix A * Matrix B - 8x8 - 'float' type

13826808.0    14187608.0   14548408.0   14909208.0   15270008.0   15630808.0  15991608.0   16352408.0
32393208.0    33394008.0   34394808.0   35395608.0   36396408.0   37397208.0  38398008.0   39398808.0
50959608.0    52600408.0   54241208.0   55882008.0   57522808.0   59163608.0  60804408.0   62445208.0
69526008.0    71806808.0   74087608.0   76368408.0   78649208.0   80930008.0  83210808.0   85491608.0
88092408.0    91013208.0   93934008.0   96854808.0   99775608.0 102696408.0 105617208.0 108538008.0
106658808.0   110219608.0 113780408.0 117341208.0 120902008.0 124462808.0 128023608.0 131584408.0
125225208.0   129426008.0 133626808.0 137827616.0 142028416.0 146229216.0 150430016.0 154630816.0
>143791616.0< 148632416.0 153473216.0 158314016.0 163154816.0 167995616.0 172836416.0 177677216.0

I marked the incorrect value as '>143791616.0<'. For some reason it is NOT '143791608.0'. There are also a couple more incorrect values ( in the last two rows ) because the last two digits of every value in Matrix C must be '08'. They can't be '...00' or '...16'.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Test-Case 3 - Online Matrix Multiplication calculator:

// www.bluebit.gr/matrix-calculator/multiply.aspx

// Matrix C = Matrix A * Matrix B - 8x8 - 'float' type

13826808.0    14187608.0   14548408.0   14909208.0   15270008.0   15630808.0   15991608.0  16352408.0
32393208.0    33394008.0   34394808.0   35395608.0   36396408.0   37397208.0   38398008.0  39398808.0
50959608.0    52600408.0   54241208.0   55882008.0   57522808.0   59163608.0   60804408.0  62445208.0
69526008.0    71806808.0   74087608.0   76368408.0   78649208.0   80930008.0   83210808.0  85491608.0
88092408.0    91013208.0   93934008.0   96854808.0   99775608.0 102696408.0 105617208.0 108538008.0
106658808.0   110219608.0 113780408.0 117341208.0 120902008.0 124462808.0 128023608.0 131584408.0
125225208.0   129426008.0 133626808.0 137827608.0 142028408.0 146229208.0 150430008.0 154630808.0
143791608.0   148632408.0 153473208.0 158314008.0 163154808.0 167995608.0 172836408.0 177677208.0

As you can see, this Matrix C has all correct values, that is, the last two digits of every value in Matrix C are '08'.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

So, if you're a C/C++ software developer, could you verify the results of matrix multiplication with the classic algorithm, using the same values for Matrices A and B? Personally, I'm VERY puzzled by my results.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Here is additional information about a test computer where all calculations were done:

>> TechInfo - Part 1 <<

Operating system: Microsoft Windows XP Version 2002 SP3 ( Up-to-date regarding Microsoft's security updates )

>> TechInfo - Part 2 <<

Basic CPUID Information 0
CPUInfo[0] = 0x00000002
CPUInfo[1] = 0x756E6547
CPUInfo[2] = 0x6C65746E
CPUInfo[3] = 0x49656E69
Basic CPUID Information 1
CPUInfo[0] = 0x00000F12
CPUInfo[1] = 0x00010808
CPUInfo[2] = 0x00000000
CPUInfo[3] = 0x3FEBFBFF
Basic CPUID Information 2
CPUInfo[0] = 0x665B5001
CPUInfo[1] = 0x00000000
CPUInfo[2] = 0x00000000
CPUInfo[3] = 0x007A7040
Extended Function CPUID Information 80000000
CPUInfo[0] = 0x80000004
CPUInfo[1] = 0x00000000
CPUInfo[2] = 0x00000000
CPUInfo[3] = 0x00000000
Extended Function CPUID Information 80000001
CPUInfo[0] = 0x00000000
CPUInfo[1] = 0x00000000
CPUInfo[2] = 0x00000000
CPUInfo[3] = 0x00000000
Extended Function CPUID Information 80000002
CPUInfo[0] = 0x20202020
CPUInfo[1] = 0x20202020
CPUInfo[2] = 0x20202020
CPUInfo[3] = 0x6E492020
Extended Function CPUID Information 80000003
CPUInfo[0] = 0x286C6574
CPUInfo[1] = 0x50202952
CPUInfo[2] = 0x69746E65
CPUInfo[3] = 0x52286D75
Extended Function CPUID Information 80000004
CPUInfo[0] = 0x20342029
CPUInfo[1] = 0x20555043
CPUInfo[2] = 0x30362E31
CPUInfo[3] = 0x007A4847
CPU Brand String:               Intel(R) Pentium(R) 4 CPU 1.60GHz
CPU Vendor      : GenuineIntel
Stepping ID = 2
Model = 1
Family = 15
Brand Index = 8
CLFLUSH Cache Line Size = 64
The following features are supported:
FPU  - Floating Point Unit On Chip
VME  - Virtual 8086 Mode Enhancement
DE   - Debugging Extensions
PSE  - Page Size Extensions
TSC  - Time Stamp Counter
MSR  - Model Specific Registers RDMSR and WRMSR Instructions
MCE  - Machine Check Exception
CX8  - CMPXCHG8B Instruction
APIC - APIC On Chip
SEP  - SYSENTER and SYSEXIT
MTRR - Memory Type Range Registers
PGE  - PTE Global Bit
MCA  - Machine Check Architecture
CMOV - Conditional Move Instructions
PAT  - Page Attribute Table
PSE36- 36-bit Page Size Extension
CLFSH- CLFLUSH Instruction
DS   - Debug Store
ACPI - Thermal Monitor and Software Controlled Clock Facilities
MMX  - Intel MMX Technology
FXSR - FXSAVE and FXRSTOR Instructions
SSE  - SSE Extensions
SSE2 - SSE2 Extensions
SS   - Self Snoop
TM   - Thermal Monitor
Enhanced Intel SpeedStep Technology - Unsupported

>> TechInfo - Part 3 <<

CPU Vendor: GenuineIntel
HTT and Streaming SIMD Extensions features:
HTT   : 1
MMX   : 1
SSE2  : 1
SSE3  : 0
SSSE3 : 0
SSE4.1: 0
SSE4.2: 0

>> TechInfo - Part 4 <<

I reproduced the same problem on three computers with the following Intel CPUs:

- Pentium 4
- Pentium II-300
- Atom N270

>> TechInfo - Part 5 <<

I reproduced the same problem with the following C++ compilers:

Microsoft Visual Studio 2005 ( Version 8.0.50727.762 SP.050727-7800 )

Borland C++ v5.5.1
MinGW v3.4.2 ( GNU C++ version 3.4.2 (mingw-special) )
Borland Turbo C++ Version 3.0.0

>> TechInfo - Part 6 <<

I reproduced the same problem with an application compiled by
Turbo C++ Version 3.00 for an MS-DOS operating system and executed
on the same computer.

Note: An MS-DOS operating system was loaded from a 3.5" floppy disk

• 1. Re: A problem related to floating point arithmetic on Intel's FPUs? Or, something else?

Sergey,

You used the 'float' data type, which is single precision and offers only slightly more than six significant decimal digits. You need double precision for the number of significant digits used in your example. In double precision, you will see correct results.

Robert Miller

• 2. Re: A problem related to floating point arithmetic on Intel's FPUs? Or, something else?

If I change to 32-bit 'int', for example, I also get a correct Matrix C:

13826808     14187608   14548408   14909208   15270008   15630808   15991608  16352408
32393208     33394008   34394808   35395608   36396408   37397208   38398008  39398808
50959608     52600408   54241208   55882008   57522808   59163608   60804408  62445208
69526008     71806808   74087608   76368408   78649208   80930008   83210808  85491608
88092408     91013208   93934008   96854808   99775608 102696408 105617208 108538008
106658808   110219608 113780408 117341208 120902008 124462808 128023608 131584408
125225208   129426008 133626808 137827608 142028408 146229208 150430008 154630808
143791608   148632408 153473208 158314008 163154808 167995608 172836408 177677208

This is not just about changing the data type from 'float' to 'double'. I need to understand why precision is lost for different values in Matrix C in the 'Rolled' and 'Unrolled' cases.

In my Test-Cases I use 8x8 matrices, that is, very small matrices. In real life a customer could use a matrix with a size of up to 8192x8192. If declared as 'double', an application will use significantly more memory. The problem then escalates, because Strassen's algorithm for matrix multiplication has to be used in order to speed up calculations, and that creates a further impact on memory.

What I can see is that with the 'float' type I have reached the precision limit defined in the IEEE 754 standard.

I also looked at the assembler code generated by Visual C++, and I can confirm that the FPU register ST0 holds the value '50959608.0' ( 1st Test-Case ). After that, the variable fC[2][0] receives '50959604.0'.

• 3. Re: A problem related to floating point arithmetic on Intel's FPUs? Or, something else?

Your inputs are all integers, and floats can represent them exactly because they are less than or equal to 2^24-1 = 16777215. Your results are all integers, but they are too big to be represented exactly (in general) in floats. That's the first problem.

Your intermediate computations are being done in the FPU in double (or extended) precision (unless you changed the processor precision mode to 24-bit). As doubles, your results are exact integers. But intermediate computations can be "flushed" back to your float variables, at which point they will be rounded. How long your values are kept in the FPU before being flushed back depends on what instructions the compiler generates; certainly your different algorithms will generate different code -- and potentially different results. That's the second problem.

• 4. Re: A problem related to floating point arithmetic on Intel's FPUs? Or, something else?

Thank you to everybody who responded to my initial post.

I've just completed the implementation of two prototypes for a Precision Loss Detection ( PLD ) subsystem. Neither prototype uses exceptions ( C++ or SEH ).

For example, for the 'float' data type, detection works when a result is greater than '16777216.0':

4096.0 * 4096.0 = 16777216.0 - No Precision Loss - Not Detected

4097.0 * 4097.0 = 16785409.0 - Precision Loss    - Detected

The first prototype is extremely simple: it "monitors" the results of all calculations ( additions and multiplications ). It is absolutely portable, but it affects performance.

The second prototype uses the CRT functions 'signal(...)', 'setjmp(...)' and 'longjmp(...)'. I coded it as 'SSL'. But some modern C++ compilers fail to fully support this, so it can't be used on the project. Here is some information:

MS Visual C++   : SSL based PLD didn't work

Borland C++ v5.x: SSL based PLD didn't work

MinGW v3.4.2    : SSL based PLD didn't work

Turbo C++ 3.x   : SSL based PLD worked absolutely fine!

PS: Unfortunately, due to project constraints, C++ and SEH exceptions can't be used at all.