Monday 10 July 2017

The Basics - Float or Dec?

This is my blog on databases, and as such I try and it keep it confined to IT and business. But there are parallels to my other job as a martial arts coach.

In jiu-jitsu, if you do not have solid basics, then you will lose. All the flashy moves in the world won't help if you don't know the finish.

And it's the same in the world of databases. I've worked on a few systems where the front end had all sorts of neat wizardry, but the engine was terrible. Here is one such example:

A financial controller came to be one day asking why some of her reports were out by one cent. Everything added up, but the final figure on the report was $0.01!

After looking at the database definitions, I saw the obvious (to me). The original designer had created the currency fields as floating point decimals!

And sadly, this is common. Because it's so easy to create a table (in for example Microsoft SQL server), then people who have not really grasped the intricacies of data types will do it, and quite often get it wrong.

Here is a brief example to demonstrate the issue of using Floats compared to Decimals:

(Copy paste the following into SSMS)

DECLARE @Num1 decimal(10,2), @Num2 decimal(10,2), @Num3 decimal(10,2);
SET @Num1 = 54; 
SET @Num2 = 3.1; 
SET @Num3 = 0 + @Num1 + @Num2; 
SELECT @Num3 AS TotalAsDecimal
SELECT @Num3 - @Num1 - @Num2 AS "Should be 0";

GO

DECLARE @Num1 float, @Num2 float, @Num3 float;
SET @Num1 = 54; 
SET @Num2 = 3.1; 
SET @Num3 = 0 + @Num1 + @Num2; 
SELECT @Num3 AS TotalAsFloat
SELECT @Num3 - @Num1 - @Num2 AS "Should be 0";

Both these two blocks do the same thing, the only difference is in the datatypes.

Here are the results:

To understand why this simple mathematical example returns a "wrong" answer, I suggest you watch this great video put out by Computerphile. The presenter does a great job of explaining how floating point numbers are stored on disk, and how this error actually happens.



So, until next time .... study the basics, and get them right! If you get the basics wrong, it will be an expensive problem to fix later down the road.

Saturday 8 April 2017

Dark Theme for SSMS

Do you work with SQL Server at night?

If you use Microsoft Visual Studio, then you're no doubt familiar with their dark theme. And if you're like me, then it's far easier on the eyes, especially at night.

But for some reason, Microsoft haven't released a dark theme for SQL Server Management Studio. Why not? Well apparently it's coming "in the future".

There is however a way you can change the default settings and make your own. It's a bit time consuming though.

So here's the good news ... I've created one and you can use it!

Here is how it looks:

dark theme 3
What do you think?

If you like it, then simply go to my website here:  http://stingrae.com.au/dark-theme-for-microsoft-sql-management-studio.aspx 
and download it for free. There are instructions on the page for you to follow to install it.

And if you don't like any of the colours, it's easy to change.

Enjoy!

Friday 10 March 2017

What data to store?

Sometimes the smallest mistakes can have serious side effects. And one I see very often is the mistake of storing data which can be calculated.

Here is an example of one I saw just this week in an accounting system. Can you spot the problem?

(I've removed some fields to just focus on the issue)

tblInvoiceHeader             tblInvoiceLines
InvoiceID InvoiceID
InvoiceName LineDesc
InvoiceAmount LineAmount
InvoiceTax LineTax
InvoiceTotal LineTotal

So here we have two tables - one for the Invoice Header, and one for the Invoice Lines.

And here is an example of some data. Let's assume it's a meal at a fast food restaurant or cafe.

  InvoiceID   InvoiceName   InvoiceAmount   InvoiceTax   InvoiceTotal
  100  Fred Smith       20.00     2.00      22.00

  InvoiceID   LineDesc   LineAmount   LineTax   LineTotal
   1   Hamburger      13.00     1.30    14.30
   2   Chips      4.00     0.40     4.40
   3   Drink      3.00          0.30     3.30

As you can see the lines all add up to give the total.

But ...

What if we need to change one value? Maybe the chips were meant to be $5 not $4?

Because of the way the data is stored, we need to alter the data in the LineAmount field ... AND the LineTax field ... AND the LineTotal field!

And that's not all ... !

We also now need to save the Invoice Header too! One change has completely thrown out all the data!

How should it be stored?

Well, depending on how often tax rates change, you could make an argument for keeping separate columns for the LineAmount and LineTax data. Especially in Australia where items such as food do not attract GST. But definitely the LineTotal should go.

And as for the InvoiceHeader table ... No need to have the totals on there either as a quick calculation can handle that.

This is how I recommend storing invoice data:

tblInvoiceHeader             tblInvoiceLines
InvoiceID InvoiceID
InvoiceName LineDesc
LineAmount
LineTax

Unfortunately for the poor financial controller who has this data integrity problem, there is nothing easy can do except make sure that if an Invoice Line Item is changed, then the Tax is changed, and the Line Total is changed, and the Header record is changed too.

The lesson from this is to never store data which can be calculated from other data in the same, or related record(s). 

Wednesday 22 February 2017

Sargable Queries in SQL Server (with example)

Hi,

Recently I was reviewing a SQL stored procedure for a company and I noticed that the developers had made an all too common error. They created their WHERE clause with a non “sargable” condition.

(For information on the term sargable see this link - https://en.wikipedia.org/wiki/Sargable)

What this caused was a rather painful table scan because the query optimiser was unable to use the indexes on the table.

Let’s look at a pretty simple example, and one that you can run yourself.

We’ll begin with creating a table with three fields:

CREATE TABLE tblExampleA (ID INT IDENTITY (1,1) , Word1 VARCHAR(10) , Date1 DATETIME);

Next, we need to put some data in there. This SQL below will throw in about 50,000 rows:

DECLARE @D1 DATETIME = DATEADD(YEAR , -50, GETDATE());

WHILE @D1 < GETDATE()
BEGIN
INSERT INTO tblExampleA ( Word1, Date1 )
VALUES (DATENAME( WEEKDAY , @D1) , @D1);

SET @D1 = DATEADD(HOUR , 9 , @D1);

PRINT @D1;
END

SELECT * FROM tblExampleA;

To illustrate the issue, we want to show the difference between an Index Scan and an Index Seek. So let’s go ahead and put a clustered index on our table, and ask for the SSMS to give us the statistics of the queries we're about to run:

CREATE CLUSTERED INDEX ciByDate ON tblExampleA (Date1);

SET STATISTICS IO ON;

Finally, we’ll run two SQL statements:

DECLARE @ForDate DATE = '1978-04-28';

SELECT Top 100 *
FROM tblExampleA
WHERE CONVERT(varchar,Date1,112) = @ForDate;

SELECT Top 100 *
FROM tblExampleA
WHERE Date1 BETWEEN @ForDate AND DATEADD(day,1,@ForDate);

When we look at the end results, the output is the same.


However … the work done by SQL to get the data is very different. Here are the two execution plans:


And the statistics we asked for:


Look at the difference in Logical reads.

Because of the CONVERT function over the Date1 field, the SQL optimiser was unable to use the index that we created. This is because it had to read all of the records in the table, parse them through the function and then compare them to our variable of @ForDate.
This was easily fixed, by instead of adding a function on the Date1 field to drop the Time off the DateTime field, we instead just used a different way of filtering. In this case, we used the BETWEEN condition to go from the DATE which we declared, and the Day after.

The best advice I can give when is to always test your queries, not just for the desired results, but for their optimisation. So often I see SQL code written by .Net developers, and while it returns the output they want, it is done in an most inefficient manner.

If you have any questions, I'd love to hear them so please send them through.

Rodney

Sunday 19 February 2017

Welcome

Hello,

I'm finally getting organised enough to share my tips on creating the best database possible.

This year is my 20th year in the IT industry, and after having started out as a junior AS/400 Developer, I've worked with (in order), Oracle, Microsoft Access and then while working in Rio de Janeiro of all places I finally got introduced to Microsoft SQL Server in 2005.

Since then, I've worked almost exclusively on SQL Server and have had the chance to be mentored by some great teachers, including Mr Victor Isokov.

Be sure to subscribe and check back as often as you can. I can't say how often I'll update this blog, but will aim to do it as often as I find things, or think of things which will be valuable to the everyday user of SQL Server.

Cheers,

Rodney