Friday, 10 March 2017

What data to store?

Sometimes the smallest mistakes can have serious side effects. And one I see very often is the mistake of storing data which can be calculated.

Here is an example of one I saw just this week in an accounting system. Can you spot the problem?

(I've removed some fields to just focus on the issue)

tblInvoiceHeader             tblInvoiceLines
InvoiceID InvoiceID
InvoiceName LineDesc
InvoiceAmount LineAmount
InvoiceTax LineTax
InvoiceTotal LineTotal

So here we have two tables - one for the Invoice Header, and one for the Invoice Lines.

And here is an example of some data. Let's assume it's a meal at a fast food restaurant or cafe.

  InvoiceID   InvoiceName   InvoiceAmount   InvoiceTax   InvoiceTotal
  100  Fred Smith       20.00     2.00      22.00

  InvoiceID   LineDesc   LineAmount   LineTax   LineTotal
   1   Hamburger      13.00     1.30    14.30
   2   Chips      4.00     0.40     4.40
   3   Drink      3.00          0.30     3.30

As you can see the lines all add up to give the total.

But ...

What if we need to change one value? Maybe the chips were meant to be $5 not $4?

Because of the way the data is stored, we need to alter the data in the LineAmount field ... AND the LineTax field ... AND the LineTotal field!

And that's not all ... !

We also now need to save the Invoice Header too! One change has completely thrown out all the data!

How should it be stored?

Well, depending on how often tax rates change, you could make an argument for keeping separate columns for the LineAmount and LineTax data. Especially in Australia where items such as food do not attract GST. But definitely the LineTotal should go.

And as for the InvoiceHeader table ... No need to have the totals on there either as a quick calculation can handle that.

This is how I recommend storing invoice data:

tblInvoiceHeader             tblInvoiceLines
InvoiceID InvoiceID
InvoiceName LineDesc
LineAmount
LineTax

Unfortunately for the poor financial controller who has this data integrity problem, there is nothing easy can do except make sure that if an Invoice Line Item is changed, then the Tax is changed, and the Line Total is changed, and the Header record is changed too.

The lesson from this is to never store data which can be calculated from other data in the same, or related record(s).