- Host: Steve Stedman / Shannon Lindsay
- Recording Date: April 1, 2025
- Topic: Blocking and Deadlocks
- Listen on Spotify
Stedman SQL Podcast Season 2 Episode 13 Blocking and Deadlocks
In this episode, we explore two of the most common performance issues in SQL Server: blocking and deadlocks. Although they might appear similar at first glance, they behave very differently and require different approaches to troubleshoot and resolve.
You’ll learn what blocking is, why it happens, and when it’s considered normal. We’ll also explain what causes deadlocks, how SQL Server automatically handles them by killing one process, and why that matters for your applications. Along the way, we’ll walk through real-world examples, share tips to prevent both problems, and explain how indexing, query tuning, and good transaction practices can make all the difference.
Podcast Transcript
Steve Stedman 00:00
Welcome to the Stedman SQL Podcast, season two, episode number 13. I’m your host, Steve Stedman, and today, Shannon Lindsay joins me as my co host. Welcome, Shannon. And would also like to say welcome to our new and our regular listeners on the podcast. This is our 27th episode since we officially turned our streaming into a podcast last year. So today we’re going to talk about some interesting things related to deadlocks and blocking.
So, picture this, your SQL Server grinds to a halt. Queries are stuck, users are frustrated. Is it blocking? Is it a deadlock? Today we’re going to kind of untangle the difference between those two things and what they really mean. So blocking and deadlocks are common SQL server issues, and they sound very similar, but they behave quite differently, and knowing the difference can save your sanity when you’re trying to deal with these. So what we’re going to do is going to take a look at kind of defining each of these. We’re going to explore when they happen, and we’re going to share some tips to help prevent or to resolve them. First, blocking then.
Shannon Lindsay 01:27
Yeah, what is blocking on a SQL Server?
Steve Stedman 01:32
Well, what blocking is this is an important thing, because it occurs when one transaction is holding a lock on a resource, like a row in a table, for instance, and another transaction is waiting for that to be released. This is really important, because without that, you would get you wouldn’t have data integrity. And what that means is that if Shannon, if you’re trying to update a row in the database, and I’m trying to update that same row at that same point in time, one of us has to finish that update before the other one’s allowed for theirs to run. Otherwise, you kind of get this undefined state of two people modifying the same thing at the same point in time. So the blocking is really important because it helps ensure that data integrity, but it can be bad if it’s blocking for too long, and you get into an issue where it’s clogging up your system.
Shannon Lindsay 02:28
Why does blocking happen?
Steve Stedman 02:30
Well, we kind of touched a little bit on why it happens because they’re to kind of keep data integrity and keep two queries or two people from modifying the exact same row at the exact same point in time. But why it happens and why it becomes a problem is that you end up, if you’re inside of a longer running transaction and you have excess locking, like, for instance, table locks instead of row locks, or poorly optimized queries in there, it can end up, instead of blocking for a split second, like what’s intended to ensure data integrity, you can end up with a block running for minutes, or even 10s of minutes. I’ve seen it where sometimes the block is gone until people have had to terminate the blocking query, and it would and it would have run for hours. So yeah, it’s not a problem when everything’s running fast. But where it becomes a problem is when it happens, when it takes a long time to run.
Shannon Lindsay 03:24
Can you share, I guess, a real life example?
Steve Stedman 03:28
Yeah, absolutely. So a couple years ago, and gosh, I could go every week, we’ve done with something like this. But a couple years ago, we were working with a client that had a really busy website. They had some event going on that they were having well over 10s of 1000s of users, maybe closer to 30 or 40,000 people on their system at a time. And this was a web system that was using SQL Server as the back end, and there was a login query where what it was doing was it was trying to get these 30,000 plus people all logged into the system because they’d shown up to for the thing that was going on there. And what was happening is, because they were all trying to log in, it got stuck, and the query was taking as long as 30 seconds to run to get them logged in. But then you had 30 seconds blocked by another 30 seconds, blocked by another 30 seconds, and you had like, 3000 people stuck waiting on that 32nd login, meaning they were never going to log in. So you end up with stalls and just completely locking up the system.
And even if you go in and, like, terminate the one query that’s causing a problem, the next one in line behind it, because there were so many users trying to do the same thing, ended up blocking and locking, and we ended up with slow performance and but the thing with blocking is, if you give it enough time and the load eventually goes away, the blocking does eventually clear. But if this. Is your website that people are trying to log into for something that they’re trying to do right now, that long time for it to clear may be 30 or 40 minutes, or it might be three or four days.
So I put together a quick video. Shannon, do you want to go ahead and share the blocking video, and I can kind of talk through some of the stuff that we see with that. Okay, so what I’ve done here is I’ve created a database called blocking test. I’ve created a table in there, and I’ve just put 1000 rows into that table, and the table just has an idea, a good or unique identifier that I’m using, and I’m just going through and I’m updating it, but I’m doing that inside of a transaction, and then I’m doing a select statement, but what I’m going to do in one window is leave the transaction hanging, and then there are three or four other windows. What I’m going to do is run that same query so that it’s showing up and it’s figuring out or so it’s showing up as blocking, so the one transaction that was left open is currently doing the blocking. The other three queries I ran there are being blocked by this example. Years ago, I was working with a developer who was debugging on the live system, and they stepped halfway through the code and ended up in a position where they left the transaction open and they got a phone call. And similar thing happened to what we’re seeing here, we go into Database Health Monitor, we can see that there is the blocking query. You could click on it and just go kill session, and that would terminate the one blocking query and let the other ones through. But what it’s showing you here is this is the blocker and then through the blocking tree, it shows all the other ones. There’s also a right click menu option there that we can use in order to click and terminate all of the queries that are in the blocking tree. It’s usually best to figure out what’s going on and figure out how to fix it, but in a real bad scenario, sometimes you just have to kill them all off and let them pile up again and get things cleared.
So here’s the blocking by Hour Report and Database Health Monitor. And then this is the current active blocking queries right now, the ones that are currently blocking right this minute. What we’re also going to take a look at is the SQL performance monitor, which is part of Database Health Monitor, and I actually built this around a blocking scenario, and being able to catch those blocking queries that are running during some major event that might be happening on your website. So you just open up the performance viewer from Database Health Monitor, and what you can see in the kind of the middle panel there is blocking queries. And you can see that it’s showing the query that’s being blocked as well as all the queries that are being blocked by it. And if we sit here for a minute, let it run, it’s checking about every 10 seconds or so, and it’s increasing the block count, meaning the amount of queries that have been blocked over time are going up and up and up. Now let’s say you’ve got a busy system that has several blocking queries. What this is going to do is help you identify what are the most, the biggest blockers, and then you gives you what you need to understand what’s blocking, and then you can go and figure out how to fix them. One thing that’s handy with this is it also gives you missing indexes that it catches. Well this is going on, so if we go back, we can go right back to the blocking query report, and we’re just going to go into the first one there and click terminate it or kill the session, and then we’re going to refresh. And you can see that as soon as we killed that one session, all of the other queries cleared out. And we can see that it completed over here, because all we did was it ended up closing out the transaction, and then the other queries that were running in SQL Server Management Studio were able to finish running. So that’s kind of the summary overall on blocking queries. And you can see in the blocking or in the performance monitor here, the count of blocking queries is no longer increasing in this report. So that kind of sums up the blocking queries, how we find them, and how we see them. With Database Health Monitor, we can go ahead and clear that video from the screen for now and move on to our next topic.
Shannon Lindsay 09:04
So you kind of described how you can go through and find the blocking in a way. But what’s kind of like your main tool for, I guess, being able to spot or find the blocking?
Steve Stedman 09:19
Well, really the main tool that I’m looking at is usually Database Health Monitor to look at the things that were that we just went through with that demo. And then once I get that, I’m going to find the query, take the queries that are blocking and being blocked into SQL Server Management Studio and go look at, well, why are they taking so long, or why are they causing blocking, or what’s happening there in the demo we just did, it was a contrived example. It was made up basically where we left a transaction open. Hopefully, that’s not the case in the real world. Usually in the real world, we’re going to find some query that’s taking a long time to run, and the reason it’s causing so much blocking is that it’s taking too long to run. Sometimes. We from that point, we go and figure out how to performance tune it, and sometimes that’s as simple as adding the right index. Other times, we’ve got to refactor or rewrite the query in order to make it work.
But other places that we look are the sis.dm, exec requests or SP, who to take a look and see what queries are currently running and what queries are being blocked. I kind of prefer Database Health Monitor to see that, but if you don’t have Database Health Monitor, well you can get it, but you can use the assist.dm, exact request or SP, who to get those similar things. And then the other thing we’re going to look at is look for weight types. And one of the common ones is like LCK_M_IX which stand for it’s a shared lock, or it’s an intent exclusive lock. If you’re doing an update, you need to get an exclusive lock. If you’re just doing a read, you need a shared lock. So those kind of things will show up to kind of give you an idea of what part of the query is having trouble.
Shannon Lindsay 10:59
How can you fix the blocking?
Steve Stedman 11:02
Well, there’s a lot of ways to fix it, and really the key thing here is to figure out how to keep your transactions short and sweet, meaning, don’t open a transaction and go through a loop and let it run for several minutes. That’s almost guaranteed to cause blocking. Optimize your queries to reduce the lock duration. So if you have a transaction that’s being open and you’re doing some work there, figure out how you can get that part to complete faster. The other thing you can do is take a look at the isolation levels. One of them that we’re going to talk about in a future episode is READ COMMITTED snapshot isolation level, which is one of those things that you can change to adjust how blocking works. But the disclaimer I want to also give an isolation levels is some people will say, let’s go with READ UNCOMMITTED, or let’s go with no lock. And those are two things that can be dangerous and can give erroneous results, so adjust isolation levels. Generally means consider something like snapshot or READ COMMITTED snapshot.
So and kind of to wrap up the whole topic of blocking, before we go into deadlocks, is that I think of it this way, blocking is like the line at a coffee shop. It might be annoying, but it’s temporary. If you speed up the barista, it may clear out faster, but I guess don’t take that as a just throw more CPU in your SQL Server to make it faster, because sometimes, with the blocking we see, just adding more CPU wouldn’t necessarily fix the problem. So keep that in mind. Blocking is one of those things that it happens, but it happens frequently, and it’s an important part of SQL Server and its general operating system, but when it happens for too long, is where it really becomes an issue.
Shannon Lindsay 12:56
All right, we’ve covered blocking. So what is a deadlock on SQL Server?
Steve Stedman 13:01
Well, a deadlock is a little bit worse than blocking. What a deadlock means is that there’s been a clash with two or more transactions where they get hung up on each other so that neither one can proceed. It’s like a circular standoff, and SQL Server fixes it by just determining one of them that it’s it picks one of them and it just kills that one and rolls it back. So that can happen in a lot of ways, but the difference between really that and what we looked at with blocking is that your blocking will eventually finish, hopefully, but with a deadlock, one of them is being chosen and terminated, and it usually happens pretty quick. And if you’re not aware of this, to know that you’ve got code or transactions running that might be being terminated, you need to really understand those and determine what kind of a impact will that have on the business case.
Shannon Lindsay 13:55
So why do dead locks tend to really happen?
Steve Stedman 13:58
Well, the key is you end up with some design in the code that puts you in a situation where you can’t get out of it. Now, imagine a made up world where you’ve got a car and somebody else has a car, but neither of these cars have reverse. You’ve only got forward, and you both get, like, stuck halfway into a parking lot, and maybe it’s maybe your car does have reverse, but you’re just too, too stubborn to use it, and you and this other car get stuck, both going into the same parking spot. Now there is no way that whatever happens here that both of you are going to fit into that same parking spot, and eventually, whether it’s being stubborn or getting out and pushing a car out of the way, or whatever it may be, one of those cars is not going to make it into the parking spot, and that’s the equivalent of the deadlock victim in SQL Server transactions are kind of like that, where if you end up with one query that modifies table one and then another table that modifies table, or, sorry, another query that modifies table. Two and then they each modify the other table. You end up in a position where the two queries have modified data in a way that you’re going to have blocking, just like we talked about in the earlier section. But it’s blocking that’s kind of a circular blocking where one of them stuck waiting on the other one, and no matter how long you wait, it will never resolve itself. And in that case, SQL server could just loop forever and let that run and block forever. But what it does, it decides that there’s a deadlock here. There’s no way it’s going to be resolved. So it figures that terminating one of those is better than having them both hang.
Shannon Lindsay 15:40
How will you end up seeing it?
Steve Stedman 15:42
Well one of the things you’re going to see is that you’re going to have, if you’re lucky enough to be running a management studio, or you’re in code that’s catching the error and reporting on it, you’re going to see that it fails with error 1205 and it says transaction was deadlocked. And it basically with, if you get 1205, it says transaction was deadlock, and this query has been chosen as the deadlock victim, meaning it was canceled, it was rolled back, and the other query that it was deadlocked with will have run. Now that may not always give you any clues as to what other query was. The problem with the deadlocking, and that’s why we get into some ways to go into proactively track that.
So let’s go ahead and play the deadlock video right now, and we can walk through, kind of some of the steps on how we use Database Health Monitor to find this. So here’s an example where in one query window I’m updating Table A, and in the other query window I’m updating Table B and then Table A. So it’s two tables in opposite orders, both of them wrapped within a transaction. If I run one and then the other, Oh, I gotta highlight the right spot when I run it here, I guess. But if I run one of them and then the other thing I’ve done is I’ve put a wait for delay in the middle so that it pauses, so that blocks a little bit longer. And what happens when we run that is eventually one of those queries will be dead, chosen as the deadlock victim. But if we go into one of the several deadlocking reports, we can go and see, here’s a list from the top down, of the most recent deadlocks. We click on it and it shows in here. Here’s the query. The left one is the query that was shown as the winner. The right query is the one that was chosen as the deadlock victim, and it has a bunch of information about who was running it, what program it was running, from things like that. And we can go and also look and see, well, which database has the most deadlocks with it. This big stuff database is just one of our test databases that we’re working with there, and we can see that table A and Table B are the two tables that have all the deadlocks being associated with it. And if we look at this deadlocks by our report, we can see that there haven’t been anything for the last week up until the last few minutes. So that’s kind of an overview of the deadlocking report. There we have them both at database level and at the instance level. And these came these reports were added as a way to answer the question of what’s going on with my deadlocks. How can I find the problem and how can I get them fixed?
So I think that wraps up the video. Well, I guess the key to proactively track that is to use Database Health Monitor. We have six different reports in there. Can also check the SQL server error log sometimes, or use extended events to capture the deadlocks yourself, if you don’t have to, like Database Health Monitor, another option, which I don’t usually use very often, because we can get most of it out of extended events or Database Health Monitor, is use Trace flag, 1222, and what that’ll do is it’ll make it so that when deadlocks encounter that additional information is written into your error log.
Shannon Lindsay 18:44
So how can you fix deadlocks?
Steve Stedman 18:47
Well, this is the challenge. And the thing with deadlocks is figuring out what’s causing the issue. And one of the ways that people look at it is, well, how can I speed that up? Now, in the example I had in the fake deadlock queries there, I had a wait for delay that was waiting for five minutes inside of each query to finish and let them block. Well, the way to do that would be to speed it up. Now, if I had simply taken that wait for five minutes out, probably would have resolved the deadlocks 99% of the time, but there’s still a chance that you have two queries that are doing that, that same thing, and you’re going to get deadlocking. So the way we really look at it is look at the order of the queries in your code, meaning, if you have one stored procedure that’s updating orders, and then enough, and then it updates order items, and then you have a different stored procedure that does the opposite order, where updates order items and then updates orders. Well, those are two candidates for really high chance of ending up with deadlock happening, so figuring out how to standardize the order of those so that you don’t end up in that scenario where one is blocking the other. The other thing is simplify the transactions so. Lot of the times we’ll see someone put a transaction around a loop, where as instead, the transaction could have been around each iteration of the loop. And then that what that does is it keeps your lock open for a smaller amount of time, and therefore less chance of a tangle or a deadlock at that point. And then the other thing is, and this is kind of the I don’t know, maybe they’re not, not the best option, but one thing that can be done is to add retry logic so that if you do detect that you are in a deadlock, you simply retry it again. And that may help you. It may not, depending on the load on the system. If it’s a really rare occur, occasion that it happens, the retry logic might be great. And really the key is to speed up what’s happening inside of that transaction so that the transaction is open for the least amount of time possible. And by doing that, you’re reducing the chance of it ever hitting and overlapping with someone else.
So the thing to think about here is really deadlocks are like a duel where someone loses, and what you need to do is design your queries with the right order to avoid the shootout in that dual. And once you’re in the dual, there’s no way you’re going to have both of them win.
Shannon Lindsay 21:10
So we’ve covered both the deadlock and the walking. What are the key differences between the two of them?
Steve Stedman 21:18
Well, that’s a great question, and I think that some of the key differences around it are kind of how they’re handled, meaning that a deadlock typically doesn’t go on for a long time. It’s typically terminated quickly, and the query that shows the deadlock victim is cleared out fairly quick, whereas blocking can cause things to pile up and get worse and worse and worse over time. Blocking is more like traffic on the freeway, where you hit rush hour and everybody’s slowed down because everyone’s on the freeway at the same time. Dead locks are more like if someone’s driving the wrong way in a one way street, and it’s going to be jammed up until one of those is changed. So the real comparison here is, how do you resolve them? And there’s really two different ways to resolve them. Well, generally, doing what we can to improve performance on all of them will help in both cases, but blocking will eventually resolve deadlocks. SQL Server kills one of them. So figuring out how to get things in the right order so you don’t end up with a deadlocking or how to have smaller transactions will help with that. So the real thing here, the difference with them on the is the impact, meaning, with blocking, you may end up with your application or your website or whatever’s using the SQL Server database running slow or appearing to be bogged down for an extended amount of time until that blocking clears. Whereas with deadlocks, it’s usually not going to happen for that extended amount of time, it’s usually going to happen pretty quick, and one of those is going to be terminated. And really the thing you’re going to look for here is looking at Active Sessions, or the different reports we have in Database Health Monitor for blocking over time, as well as deadlocks over time, to be able to get a better feel for these.
Shannon Lindsay 23:05
You just kind of mentioned the impact of it. But why does it matter if I’m seeing the blocking or the or the deadlocks?
Steve Stedman 23:14
Yeah, it matters in a couple of ways. One is that it matters because it’s impacting your system, as we already talked about, where it may bog things down. But really the key here is that you want to fix these, and misdiagnosing them can be a really big waste of time, and you need to understand with what’s happening, is it a deadlock, or is it blocking, because you may have to do different work in order to figure out how to get those resolved. With blocking, it might simply be just figuring out how to make the queries run faster. But with deadlocks, it might be having to reorganize things inside of a stored procedure so you get the right order so you don’t end up with that deadlocking again.
And really, I think of it as like blocking is like being in line in the queue, meaning you’re waiting, like you’re waiting for that coffee at the coffee shop, whereas dead locks are more like a crash where things have gone bad, and no matter how long you wait, it’s not going to help it. And you really need to understand which one you’re fighting before you, before you take a stab at that.
Shannon Lindsay 24:19
I guess it all comes down to prevention then. So what can we do to prevent and how can we monitor deadlocks and blocking?
Steve Stedman 24:29
Well, yeah, the key here is really figuring out how to write more efficient query. And more efficient queries can mean a lot of different things here, but if you can end up with queries that have smaller transactions with fewer locks, that’s going to cause less trouble figure out how to speed up those queries using proper indexing and how to reduce the scope of the locks and the duration by smaller transactions. But then you might have a lot of deadlocks going on or. Lot of blocking going on that you don’t even know about. So the key here is, well, I should say you don’t even know about it until it goes really bad and somebody takes out your system. So monitoring, with Database Health Monitor, we have six reports specifically around deadlocks, and we have three or four around blocking. And the key is understanding it. And if you’re on a system that’s growing, you want to be able to catch those deadlocks and catch those blocking queries early and deal with them before they become a much bigger issue. And then there’s other ways to monitor it, with a SQL Server profiler or DMVs, or extended events, things like that. But my preference is use Database Health Monitor, because we’ve got all these things built in to help you find and track these really quick and easy other things, yeah, just tuning the queries, considering different isolation levels, and then really kind of coming up with standards to prevent those deadlocks on things like the orders versus order item example, and figuring out the order that you’re always going to modify those tables, and then back to Database Health Monitor. Again, we’ve spent a lot of time over many years and getting all these things in Database Health Monitor, so it makes it easier for you to find them. And the key here is that prevention is way easier than a reaction. And take this on from the very beginning. However, in some development organizations, sometimes it’s easier to get it prioritized when it’s that thing that’s taking the whole site down. I’d rather have it prioritized and fixed before it becomes that thing that takes your whole site down.
So that’s kind of it. The whole recap is that blocking is just a delay or a slowdown where deadlocks are killing your session. Both are fixable with the right design. And the key takeaway here is you just need to understand the enemy on this the enemy being the blocking or the deadlock, and whether it’s a lock or a fatal query. SQL Server gives you the and with Database Health Monitor, we give the tools figure out how to find these and how to find these and fix them.
So the call to action here is next time your app stalls and you’re wondering, why are things so slow all of a sudden, check for blocking, check for deadlocks, and then see what you can do to fix them and share what you find. We’re happy to get involved. And in fact, we do this with a lot of our clients at Stedman solutions, where we will go and track them, and when we see a blocking situation or a deadlock situation, we’ll jump in and help fix it. So thanks for listening and watching. Thanks for being the co host today, Shannon. Join us next week, where Derek Bovenkamp and I will discuss RCSi or READ COMMITTED snapshot isolation level and some real world customer examples that we’ve had some great successes with that. So also, if you missed it, check out our 10 year anniversary episode that was released on April 1 of this year as our 10 year anniversary of Stedman solutions. And it’s grown from just me wondering if I was ever going to get a paycheck out of it to nine employees that we have today, and nine team members then, and a great team. So remember, all of our episodes are available on YouTube at stedman.us/podcastyoutube or on Spotify with the link of stedman.us/podcastSpotify, thanks for listening and have a great day.
Thanks for watching our video. I’m Steve, and I hope you’ve enjoyed this. Please click the thumbs up if you liked it. And if you want more information more videos like this, click the subscribe button and hit the bell icon so that you can get notified of future videos that we create.
