Sunday 6 October 2024

How I investigate a slow SQL query (for beginners)

Note: This post is just an outline of how I begin looking at a "slow SQL query". It's aimed at "full stack developers" and other people who are trying to get started in SQL performance tuning. 

All the software I show here is free, and I don't receive any form of referral fee. The scripts I use are credited, with links to their original sources. 

So ... with that out of the way, let's begin!

Let's assume we're a developer calling a SQL stored procedure named QueryResultsGet. Here is a sample of the call, and the data: 
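(The screenshot of the call and its results isn't reproduced here, but the shape of it is something like the following. The procedure name is from this example; the parameter names and values are hypothetical.)

EXEC dbo.QueryResultsGet @ClientID = 42, @FromDate = '2024-01-01', @ToDate = '2024-06-30';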



You can see there that this took 15 seconds to run and returned 168 rows. 

So the question is ... is 15 seconds an acceptable time to wait for this number of rows? Obviously, for the sake of this demonstration, we're going to say no. 

So what do you do? 

Well, here is what I do. 

The first thing I do is ensure I'm in a test environment that is as close as possible to Production. Normally I would use a UAT instance that was recently refreshed with live data. This is because I want to make changes to the code and/or underlying schema, and don't want to affect Production. But I also don't want to use a Dev environment, because often they've been hacked about by developers and the size of the data could produce very different results. 

Next, I am going to take a snapshot of the current results. These "results" include not just taking note of the 168 rows returned, but a few other things. 

I want you to get a SQL query (any query that does a few joins will do) and then type this above it: 

SET STATISTICS IO ON;
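So, using the hypothetical call from earlier, the top of your test script would look something like this:

SET STATISTICS IO ON;

EXEC dbo.QueryResultsGet @ClientID = 42, @FromDate = '2024-01-01', @ToDate = '2024-06-30';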

Next, I want you to click the Include Actual Execution Plan button on the toolbar: 


And then run the query again ... 

Now that you've re-run it, a couple of extra things happened that didn't happen before.

Firstly, when you click the Messages tab in the results panel, it displays a whole lot more information than it did before. 

(If you want to compare, change your code to SET STATISTICS IO OFF; and re-run.)

Here is what the Messages tab shows in this example: 


Now ... I want you to download an application called WinMerge, or go to a site like https://www.diffchecker.com ... or anything similar. There are lots to choose from. I'm a creature of habit, and have been using WinMerge for years, so it's my default. 

Regardless of what you choose, please copy/paste that text into the left pane: 


The other thing that I asked you to do was turn on the Actual Execution Plan, and that would've resulted in a 3rd tab appearing in the results pane. When you click it, you'll see something like this: 


Now, I don't know about you ... but that's rather hard to read and make sense of ... so instead of using that, download SQL Sentry Plan Explorer from https://www.solarwinds.com/free-tools/plan-explorer 

Once you've done that, right-click the plan in SSMS and select Show Execution Plan XML. This will open a new tab where you can copy the XML. 

Finally, paste that XML into SQL Sentry Plan Explorer and you should be able to see it like this: 

(Note: there is also the option to install the SSMS add-in, which means you can right-click in SSMS and open the plan in SQL Sentry Plan Explorer straight away.)

Look at how much easier that is to read!!! 

Why Microsoft hasn't updated the SSMS Execution Plan diagram is a mystery to me. 

As soon as you look at this image, you can see the red and yellow highlighting. This free software is doing my job for you! It's showing exactly where the overhead is. 

In this case it's highlighting tblContactMaster and tblContactDetail ... And now I want you to look for those two tables in the statistics that we saved into our text comparison software: 

Table 'tblContactDetail'. Scan count 342, logical reads 399798, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

Table 'tblContactMaster'. Scan count 589, logical reads 28836, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

So ...  that's the problem, what's the solution? 

Well ... this is the thing ... you've now found "a problem", not "the problem" ... chances are that a query taking 15 seconds to return only 168 rows has a few issues! But for the sake of this demo, let's keep it simple. 

The only advantage I see in the Microsoft execution plan display is that it highlights missing indexes. Sentry Plan Explorer keeps this hidden (for a reason I can never explain). But anyway, go back and look at the Microsoft one ... see the green text? 

If you right click the green "Missing Index" and then select Missing Index Details ... it will open a new tab with a script to create the index. 
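The generated script will look something like this. The table name is from our example, but the column names, the estimated improvement, and the index name (SSMS leaves that for you to fill in) are all hypothetical:

/* The Query Processor estimates that implementing the following index
   could improve the query cost by 94.6% (hypothetical figure). */
CREATE NONCLUSTERED INDEX IX_tblContactDetail_ContactID  -- you supply the name
ON dbo.tblContactDetail (ContactID)
INCLUDE (ContactStatus, ModifiedDate);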

Now we come to the interesting part ... do you create the index? 

Probably ... if you create the index it suggests, then it will make THIS query go quicker. But ... indexes come with overhead ... the more indexes you have, the more storage you use and the slower some updates become. You don't want to go blindly creating indexes just to speed up one query that may only run weekly. 

BTW - Brent Ozar does a great, and hilarious, job of explaining what he calls "Clippy" and the index suggestions in this video here - https://www.brentozar.com/archive/2020/07/video-how-to-think-like-clippy/  

Therefore, before blindly creating an index, please copy/paste this script from Pinal Dave and see what other indexes are on the table: 

DECLARE @ObjID BIGINT = (SELECT id FROM sys.sysobjects WITH (NOLOCK) WHERE name = 'type_your_table_name_here');

WITH cteFrag AS
(
    SELECT object_id, index_id, index_type_desc, MAX(avg_fragmentation_in_percent) AS FragPct
    FROM sys.dm_db_index_physical_stats(DB_ID(), @ObjID, NULL, NULL, NULL)
    GROUP BY object_id, index_id, index_type_desc
)
SELECT
      o.name AS ObjectName
    , i.name AS IndexName
    , i.index_id AS IndexID
    , dm_ius.user_seeks AS UserSeeks
    , dm_ius.user_scans AS UserScans
    , dm_ius.user_lookups AS UserLookups
    , dm_ius.user_updates AS UserUpdates
    , p.TableRows
    , i.is_primary_key
    , i.type_desc
    , cteFrag.FragPct
FROM sys.objects o WITH (NOLOCK)
LEFT JOIN sys.dm_db_index_usage_stats dm_ius WITH (NOLOCK) ON dm_ius.object_id = o.object_id
LEFT JOIN sys.indexes i WITH (NOLOCK) ON i.index_id = dm_ius.index_id AND dm_ius.object_id = i.object_id
LEFT JOIN sys.schemas s WITH (NOLOCK) ON o.schema_id = s.schema_id
LEFT JOIN (SELECT SUM(p.rows) AS TableRows, p.index_id, p.object_id
           FROM sys.partitions p WITH (NOLOCK)
           GROUP BY p.index_id, p.object_id) p
    ON p.index_id = dm_ius.index_id AND dm_ius.object_id = p.object_id
LEFT JOIN cteFrag
    ON dm_ius.object_id = cteFrag.object_id
   AND dm_ius.index_id = cteFrag.index_id
WHERE OBJECTPROPERTY(dm_ius.object_id, 'IsUserTable') = 1
  AND dm_ius.database_id = DB_ID()
  AND i.object_id = @ObjID
ORDER BY ObjectName, IndexName;


If this table already has lots of indexes on it (five is a fairly typical number), then consider whether what the green text is suggesting could instead be achieved by modifying an existing index, rather than blindly creating a new one. 
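To illustrate what I mean by modifying an existing index: if one of the existing indexes already has the right key columns, you can often just widen it with the suggested INCLUDE columns instead of adding a near-duplicate index. Again, the names and columns here are hypothetical:

-- Rebuild the existing (hypothetical) index in place, adding the
-- INCLUDE columns the green text asked for, instead of creating a
-- second, near-duplicate index.
CREATE NONCLUSTERED INDEX IX_tblContactDetail_ContactID
ON dbo.tblContactDetail (ContactID)
INCLUDE (ContactStatus, ModifiedDate)
WITH (DROP_EXISTING = ON);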

For the sake of this demo though, let's assume that this table has only 2 indexes, neither of which resembles what's suggested, and so we're going to create it. 

So ... create the index, go back, and run the query again ... 

For this demonstration, here are the main results: 


168 rows, in only 5 seconds! 

So, that's good, right? 

Well ... yes ... but ... let's not get too ahead of ourselves ... 

The rows are the same because we didn't alter how the actual query works. Whenever we change the logic of the code, getting the correct data comes first. 

But 15 seconds down to 5 ... that's good, yes?  

Yes, but ... maybe the first query was run at 10am when everybody was online, and it's now 6pm and most people have left? Or someone has deleted a mass of data? Have you ever heard of plan caching? There are all these variables to consider, and that's why we now go back to our initial snapshots.  

I want you to go to the Messages tab in the SSMS results pane and copy/paste that into the right-hand pane of the text comparison tool you chose. Does it look something like this? 


In my example here, you can see the logical reads on tblContactDetail dropped from 399,798 to 44,289! 

Now let's compare the execution plan. Copy/paste the XML into Plan Explorer and see what it shows. Here's mine for this example: 

Ok, so there is still a LOT of work to do in this query to make it faster ... but you can already see the difference our new index has made. 

Now .. the question is ... how much do we do? We got it from 15 seconds to 5. Is that enough? 

If it's not, then it's a matter of rinse and repeat. Depending on your business' structure, you can either put that new index into Production, wipe the slate clean, and start again with this new set of statistics and execution plan as your baseline ... or keep working on it and deploy everything at once. 

But ... this is how I do it. I look at the big picture, then pick off the low-hanging fruit - missing indexes if there aren't many on the table, key lookups, functions in the WHERE clause etc (see the sketch below) ... and then go from there. 
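As a quick illustration of that last one: a function wrapped around a column in the WHERE clause stops SQL Server from seeking on an index over that column. This is a generic example with a hypothetical CreatedDate column, not from the demo query:

-- Non-sargable: the function on the column forces a scan.
SELECT ContactID FROM dbo.tblContactDetail WHERE YEAR(CreatedDate) = 2024;

-- Sargable rewrite: the column is left bare, so an index on CreatedDate can be seeked.
SELECT ContactID FROM dbo.tblContactDetail
WHERE CreatedDate >= '20240101' AND CreatedDate < '20250101';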

At some point, you reach a place where the time and money it costs to get a query to optimal performance isn't worth it to the business for the microseconds it might save. 

It will all come down to experience. 

So please, create a test environment, and experiment. But please don't do anything in a production environment until you're confident in what you're doing. 

If you need any help, please reach out to me.

Rodney

Sunday 29 September 2024

Working out why a SQL Database has grown over the limit

So I ended the previous post with

Oh, and if you're wondering, the issue was that their SQL Express had hit the 10GB limit, all because the space cadet developer had decided to save images into the database. I might do another blog post on how I figured this out as it would be helpful for beginners in the "Emergency SQL DBA" space. 


So here is how I was able to see (quickly) that it was caused by the images:


As part of my SQL Integrity Check, I poke around the tables and look at the DB size etc. I must confess that for a lot of this, I rely on scripts written by Pinal Dave.


The first thing I did though (and remember, I was talking to someone who works in construction and, by his own admission, doesn't know much about computers) was find out what version of SQL Server the company is using. You can easily do this by running:


SELECT @@VERSION


When I ran this I was able to see that they're running SQL Server Express. And of course this edition of SQL Server has a limit of just 10GB per database (more on that below).


I then right-clicked the database name and saw that, yep, it had hit 10GB in size, which aligned with what the business owner had told me the error message was saying. In his words, "it's saying we're out of space".
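(If you'd rather check with a query than by right-clicking, the old sp_spaceused procedure reports the same headline figure:)

-- Run this in the context of the database in question.
EXEC sp_spaceused;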


So now the challenge is finding out why it's hit the 10GB limit. Has it been growing incrementally at 10MB a day for years? Or was it a sudden thing?


I moved onto this script:


https://blog.sqlauthority.com/2021/02/12/sql-server-list-tables-with-size-and-row-counts/
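(That's Pinal Dave's script. For anyone who doesn't want to click through, a minimal version of the same idea looks like this; it's a sketch, not his exact script:)

-- Table sizes and row counts, biggest first.
-- reserved_page_count is in 8KB pages, so * 8 / 1024 gives MB.
SELECT t.name AS TableName
     , SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS TableRows
     , SUM(ps.reserved_page_count) * 8 / 1024 AS ReservedMB
FROM sys.tables t
JOIN sys.dm_db_partition_stats ps ON ps.object_id = t.object_id
GROUP BY t.name
ORDER BY ReservedMB DESC;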


And almost immediately after pressing F5 on that, the issue was obvious.


The #1 ranked table was one called "Images" ... out of the 10GB, it was taking up 9 of them!


Next I ran this:


sp_help Images


This showed me the column names and the types.


And to the surprise of absolutely nobody, there was a field called ImageData that was defined as VARBINARY(MAX). There was also a field called ImageCreateDate.


Therefore I ran this:


SELECT TOP 10 * FROM Images ORDER BY ImageCreateDate


At the time of writing it is September 2024. There were images going back to 2011 ... Thousands of them.
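(If you want to see the spread rather than just the oldest rows, a quick GROUP BY on the year does it, ImageCreateDate being the column we found via sp_help:)

SELECT YEAR(ImageCreateDate) AS ImageYear, COUNT(*) AS ImageCount
FROM dbo.Images
GROUP BY YEAR(ImageCreateDate)
ORDER BY ImageYear;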


The business owner assured me that they had tried deleting the data from the front end. So I then explained how soft deletes work etc ...


He told me that he only needed the past 2 years of data and as he needed the system up ASAP he told me to blow the old ones away.


Now, since I am a Nervous Nelly from having worked in this field for over 20 years, I did a backup first. I created a new database called RecycleBin, and copied over 15 years' worth of these images to that database.


And then I did the delete.
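In case it helps anyone, the shape of that copy-then-delete is roughly this (a sketch, not my exact script; the 2-year cutoff and the RecycleBin name are from above):

-- Park the old rows in the separate database first ...
SELECT *
INTO RecycleBin.dbo.Images
FROM dbo.Images
WHERE ImageCreateDate < DATEADD(YEAR, -2, GETDATE());

-- ... then delete them from the live table. The space freed inside the
-- data file becomes reusable straight away, which is what gets you back
-- under the Express cap for new inserts.
DELETE FROM dbo.Images
WHERE ImageCreateDate < DATEADD(YEAR, -2, GETDATE());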


And then the system was working again.


It's worth remembering that a SQL Server Express database is limited to 10GB, but you can have multiple databases on the same instance. So if you need to delete data but still want a relatively easy "undo" feature, this is a way to do it.


I hope you found this useful.


Rodney


Case Study of How I do what I do

Ever heard the phrase "we get satisfaction from our work time, not from our leisure time"? 


Last week was like that.


I was at the beach with my dog Felix at 9:15am on Monday morning when I received a phone call. It was from a small business owner who had found me on Google.


"Are you the SQL guy?" he asked ... 


Long story short, it turns out this man owned a business in the construction industry and his system kept crashing. As a builder, he didn't know much about it, except that it runs on SQL Server. 


After this brief discussion, we scheduled a time later that afternoon for me to remotely view the situation with one of his team who is more "IT literate". During this call I saw that, yes, the SQL database and its front end were having issues, and I thought I could help. 


The boss then asked me "what is your hourly rate to fix?". 


I replied by saying that I don't do hourly rates. Because that would be dumb.


The quicker I get it fixed (which is what he wanted), the less I get paid. And the longer I take to fix it, the more I get paid, but the more he gets pissed off. It's a pretty stupid way to structure the work. 


So I quoted him $99 to do my SQL Integrity Check. I said that after I did this, I would have more information about the problem and be able to offer an accurate quote to fix it. 


I did this check on Wednesday afternoon and found what I suspected was the problem. I then quoted a price to fix it, based on this assumption, and said that my work carries a money-back guarantee, so if I was wrong he wouldn't pay. The risk was all mine.


So I implemented my solution, and yep, the problem was fixed by Wednesday evening. 


The customer was happy that his system was operational again, and I was happy that I was able to fix the issue. 


Rodney  


ps. There is something immensely satisfying about being on the beach at 9am on a Monday morning when most people are in the office. 


pps. Oh, and if you're wondering, the issue was that their SQL Express had hit the 10GB limit, all because the space cadet developer had decided to save images into the database. I might do another blog post on how I figured this out as it would be helpful for beginners in the "Emergency SQL DBA" space. 

Saturday 20 May 2023

25 Years of Late Nights and Learning

In the late 1990s my martial arts coach asked me to help keep track of the gym's members as his current way of doing it with a pen and paper was getting complicated as his business grew.

As I had been programming since I was 13 or 14, and had just left school and had begun working as a trainee programmer, I started developing a system to do it. 

Version 1 

It was first designed for Windows 3.1, written in VB5 and stored the data in a text file, one row for each person and all their details. 

The original name for it was Visual Gym Manager (partly because it was written in Visual Basic).

Version 2

Towards the end of the 90s it had progressed. It was now a VB6 WinForms app and I had migrated the backend to Microsoft Access, with a relational table structure. 

It was around this time too that I renamed the product to be Arnold - not directly because of the famous body-builder actor, but rather because that was one of the server names at the company I used to work at, and I thought it sounded cool as a name for software too. 

Then, in 2002 came the big launch of .Net! 

Version 3

Given that I had never gone to university, and was almost entirely self-taught … I went out and spent $2000 on Visual Basic .Net (remember those days? buying a box of CDs) ... I also bought a “teach yourself VB.Net” book and got to work re-writing the entire front end in VB.Net. 

It remained relatively stable in this format for a number of years. Regardless of where I was living (Sydney or Rio de Janeiro), I would spend my spare time away from work (which was mainly bouncing at this stage) and jiu-jitsu training on developing the software. 

Version 3.5

When I returned to Sydney full-time in 2006, and resumed working 40 hours a week as an IT professional, I gained more exposure to SQL Server (which I had first touched whilst working in Rio). So it was circa 2012 that I migrated the backend from Access to SQL Server. The front end remained a WinForms VB.Net app but the backend was now using SQL Server Express. 

This was obviously a step up but I still had the problem of having to install it on location at Boxing Works’ reception PC and at my own gym’s premises, and also at a 3rd location that I had picked up. 

And every time I made a change, I had to roll out the executables, physically go to each gym, and apply the SQL changes. 

Version 3.6

The next logical step then was to move the database to an online server, so I did that. And this made doing back end support a lot simpler … but I still had the problem of installing the front end on each PC which became a nightmare, especially as I had now picked up clients in New Zealand and Darwin!

Anyone know the pain of DLL Hell ???

Different versions of Windows, and time zones to support … not to mention I was working 40 hours a week programming, and teaching my own jiu-jitsu classes … I didn’t have the time to do the support. 

Version 4 

There was only one logical step to take if I wanted to keep my software alive - make it a web app. But I had never written a web application before. Up until then I had only written millions of lines of Visual Basic for Windows, and become very good at SQL Server. 

So I googled how to write a web application in Visual Basic, and found some tutorials on ASP.NET WebForms. Keeping it in VB allowed me to reuse most of the .Net classes and libraries I’d already coded, so I just had to redo the screens, which was still a lot to learn. 

But, after a few months of late nights I got it done and working. It looked really ugly … but it worked! 

Version 4.5 

It was now 2020. I had become a Microsoft Certified Expert in SQL Server (I somehow found time to study too) and the software was working … but I wanted to integrate with a 3rd party who could do the direct debits in a PCI-compliant way. I spent hours going through their documentation but couldn’t figure it out. I remember speaking to their support guy and saying something like “I’m a SQL Server developer and I don’t know how to do web programming and your documentation sucks” … his reply was “well mate, you’re trying to do web development so go and learn it!” … it was the blunt truth. 

So I phoned an old colleague I used to work with and paid him some cash to help me with the integration. He got that working, and when I mentioned how the thing looked ugly, he said “oh that’s easy, just find a Bootstrap template and use that”. He was right. I’d never heard of Bootstrap until then … so another $20 for an online course, which I watched on a trip to the city and back (I was working as a DBA at PwC at Barangaroo by this stage), and I got the hang of it … and yeah, it now looked presentable! 

Version 4.6 

Just after I got it looking better, the webhost where I had it hosted suffered a massive DDoS attack which took the application offline for almost 2 days. And as much as I like supporting small businesses (I still use the same company for my sales websites), I lost 2 clients as a result of this downtime, and so it was time to scale. 

Today both the web application and the SQL database are hosted on Microsoft Azure. 

Version 5

The shiny “new” version 4 did really well. But … there is a reason that ASP.NET WebForms applications are almost extinct. The whole postback thing, where the entire page gets reloaded almost every time, gets annoying, especially if the user navigates via the browser buttons, which can then trigger the same action multiple times … 

I knew what I had to do. I’d been putting it off long enough … rewrite it as an MVC App. 

It’s taken me the best part of 6 months worth of late nights and weekends, but I’ve finally finished converting 90% of the code from Visual Basic Webforms to C# and Javascript in a new .Net Core 6 web application. I’ve had to learn so much more for this too (authentication and authorisation for example) that it has been a long road, but very rewarding. In fact 3 years ago I’d never written a line of javascript in my life, and now I have this application which is probably 25% js. 

There is still some VB code there - the nightly jobs and things like that which don’t involve user interaction will stay for a while. But the user experience is so much nicer and faster. 


The main thing I am super proud of is the underlying database structure. It’s remained almost unchanged from the original design I did in Access. With one click of a button, the business user can enrol a member into a class, and it verifies their payments are up to date and calculates all the other stats, like classes towards grading and sale expiry etc. 

It’s proof that if you build the engine of the system correctly, then everything else is just window dressing. It’s like the chassis of a car … do a google search for "cars with the same chassis" and you'd be amazed at how many different cars actually use the same basic blocks! 

I don’t know how to do a lot of things … but one thing I know how to do is build a database. One that’s fast and works. 


Coming Up Next 

It’s now time to go back to market and focus on getting some new subscribers, and in the meantime my development time will be returning the Apps … yes, I also wrote an Android and iOS app for mobile phones too which allows members of the gyms to check in and receive push notifications of alerts. But it’s clunky and also needs a re-write. Will need to find a new course to take and start the process again. 

It’s been 25 years of development, and I reckon I’ll do another 25. 

I’m super proud of this product and I honestly couldn’t run my gym without it. 

If you know someone who runs a martial arts, dance or yoga studio ... or even a small PT-style gym ... and needs some simple software to help run their business, please do me a favour and share the link with them. My software is free for businesses with fewer than 25 members and I know it’ll make a difference. 

Thank you.

Rodney Ellis

ps. Link to trial can be found here: https://gymdatabase.co