Friday, March 18, 2011

Float like an approximation; stings like a bee

I distinctly remember having to tell my CIO and a VP of Finances in 2005 that the reason their reports were off by one cent from the ledger was because the person who designed their reporting database had chosen to use the float datatype instead of the decimal datatype.  The implications of that one cent variation were astounding to me, but I wasn't a business finances guy.

This may read like a DailyWTF, and it is absolutely true.  That one cent variation echoed up through federal reporting and ledger validation, through accounting, compliance, legal and IT.  The problem was brought to light when a customer asked us to explain the difference.  It went all the way through the bureaucracy to a low-level app dev (me) to finally explain it.

I don't believe that many developers know that the float datatype (and also real) in SQL Server provides only an approximation of a number at certain levels of precision.   Here's a quick demo I used recently to explain.

create table floattest
( id integer not null identity(1,1) primary key
,    float24 float(24)
,    float53 float(53)
,    real1 real
,    decimal9 decimal(19,9)
)
go
insert into floattest (float24, float53, real1, decimal9)
Values (12345.12345,12345.12345,12345.12345,12345.12345)
insert into floattest (float24, float53, real1, decimal9)
Values (1234567890.12345,1234567890.12345,1234567890.12345,1234567890.12345)
insert into floattest (float24, float53, real1, decimal9)
Values (123456789.123456789,123456789.123456789,123456789.123456789,123456789.123456789)
go
select  float24, float53, real1, decimal9 from floattest


float24
float53
real1
decimal9
12345.12
12345.12345
12345.12
12345.123450000
1.234568E+09
1234567890.12345
1.234568E+09
1234567890.123450000
1.234568E+08
123456789.123457
1.234568E+08
123456789.123456789


Note the loss of precision in both float precision levels, and in real.

The per-row value storage requirements are:
Float(24): 4 bytes
Float(53): 8 bytes
real: 4 bytes
decimal(9,n): 5 bytes
decimal(19,n): 9 bytes

Considering what it as stake in certain industries, that one byte of savings for high-precision number storage can be very costly.

Make this a lesson learned, and just use decimal.

Interested to hear any feedback on when float makes sense, other than the desire to save a few bytes here or there.  

2 comments:

Anonymous said...

Nice...simple, clear, informative, and useful

Anonymous said...

You have only presented half the picture. Yes decimal(19, 9) is more precise than float53. That is clear from the microsoft documentation. What you have not presented is the precision of arithmetic with these data types. If you were to divide and multiply a decimal(19, 9) by some numbers, you would not come to the same conclusion.