An easy listen: a 6-minute video describing database corruption scenarios that have actually happened. Help yourself and avoid these mistakes. Presented by Steve Stedman and Derrick Bovenkamp.
Transcription:
Steve Stedman 0:03
Now, imagine with this picture, if that was your fiber optic cable running through the middle of that fire and you're doing a backup or trying to load a database or something across the network, you're probably not going to expect your connection to last very long. But while it's burning up, it may be sending some data through, and that data may be getting corrupted. So I just thought that this was one of the more unusual possible causes of corruption. But hopefully, you never look outside of your building and see this type of situation with your network cables.
So there's a lot of different things that we've seen from experience with different customers that have led to a corrupt database. And the first one was the very first customer that I ever did a database corruption repair for. And this was a lightning storm. What had happened is they had their accounting system running on a, we'll call it "production" in quotes, production SQL Server, under their desk, with no power strip, no uninterruptible power supply, nothing like that. And they were in an area that was frequented by lightning storms. A lightning storm hit, we think the server got some kind of a jolt, and then it shut down uncleanly. And when it started back up, there were chunks of the database that were not coming back, not accessible. In that situation, there were maybe about a dozen tables we had to repair. But the cause, the exact timing of when things went bad, they could pin down to the lightning storm. Of course, this is a situation where the customer didn't have any backups. And that always makes it interesting. But I think that one could have been avoided if they'd had, perhaps, an APC battery backup system or something that filters the power.
Derrick Bovenkamp 1:56
Yeah, so the next one here, this is not one that I've personally seen, but I've heard the story two or three times at SQL PASS Summit. It's somebody who had a network room installed on the side of the building close to train tracks. And they started to have drive failures like nobody had ever seen before. It took them a while, but they finally realized that every time the train went by, the building would shake. And you're almost going to yourself, like, oh, that's not going to cause drive failure. Now this is specifically with spinning drives, you know, old regular spinning disks, not solid state drives. But there is a video on YouTube of somebody screaming into an array, something like you see in the picture on the screen now. And when they scream into it while monitoring latency, the latency would go up on the disks every time the guy would yell at the array.
Steve Stedman 2:58
You know, that reminds me, Derrick, because about, I don't know, 18 years ago, I worked in an office that was right next to the railroad tracks. And when the passenger train came by, you didn't really notice much other than a bit of shaking. But when a freight train came by, especially the ones that were carrying ore cars, or these big bins of scrap metal that they would haul by, it was different. This was back before we had all the flat screen monitors, and I actually had a CRT type monitor. When those cars would go by with ore or scrap metal, it would actually create enough of a magnetic field that you would see the monitor picture move and distort because of that magnetic field as it was going by. And I can imagine that would just not be good for any type of storage or disk situation as well.
Derrick Bovenkamp 3:43
Yeah, okay. I kind of get shivers down my spine when you're thinking about a disk drive sitting next to that. Yep. The next one here is actually two stories. Steve will tell one, but this is one that I have heard of. It was a janitor with a vacuum cleaner in the middle of the night, and the server was on the ground. Every time they would vacuum the room, or maybe not every time, but it would happen often enough, the vacuum cleaner would run into the server. And as you can imagine, that's also not good for those spinning drives.
Steve Stedman 4:19
Oh, yeah, absolutely. And then the other situation, again, blame the janitor. I wasn't part of this one, I only heard the stories of it later, but they had a server that twice a week would just power off and reboot. And after months of tracking this down, they eventually found out that this server was just sitting in an empty cubicle in an office. And the reason it would power cycle was that the janitor would come in to vacuum and needed an outlet. So they'd pull the power cord out for the server, do the vacuuming, and then plug the server back in. And you know that can't be good for your database.
Derrick Bovenkamp 4:53
Yeah, so the next one here is, you know, really hard for me to talk about as a systems administrator who really likes technology and likes to think very highly of some of the storage technology that we have today. But we've seen at least two cases where we were able to trace corruption back to something with an iSCSI switch, or, you know, something with that connection between the storage and the server, both with iSCSI. So network switch errors certainly can cause corruption.
Steve Stedman 5:30
Yep. And then the last one we want to talk about here is drive failure. Now, if you have complete drive failure and your hard drive is just totally dead, that's a different situation, because either you've got a backup, or you've got a RAID array of some kind, or you've completely lost all your data. That's not corruption in that case. But where you have a partial drive failure, where things are starting to fail with the drive and you're getting things that are not being written or read correctly, that can lead to the database being corrupt. If it reads, and it has a read error, but somehow when it writes it back out, it writes it out with incorrect data, yeah, that can definitely lead to database corruption. But really, most of the time, it comes down to problems with I/O. Somewhere between the SQL Server process and when the data actually lands on disk, there is something that is corrupting, distorting, manipulating or changing the data in some way, so that when you try and read it back later, it doesn't look like a regular database file.
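Editor's note: the point about data changing somewhere between the SQL Server process and the disk can be illustrated with a small sketch. The idea is that a checksum is stored with each page when it is written, and recomputed when the page is read back; a mismatch means something in the I/O path altered the bytes. This is only an illustration, not how SQL Server implements its page checksums: the 8 KB page size matches SQL Server data pages, but the CRC32 checksum, the page layout, and the function names `write_page`, `page_is_intact` and `flip_bit` are assumptions made for the example.

```python
import zlib

PAGE_SIZE = 8192      # SQL Server data pages are 8 KB
CHECKSUM_BYTES = 4    # assumption: a 4-byte CRC32 stored at the front of the page


def write_page(payload: bytes) -> bytes:
    """Build a page: checksum header followed by the zero-padded payload."""
    body_size = PAGE_SIZE - CHECKSUM_BYTES
    if len(payload) > body_size:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(body_size, b"\x00")
    checksum = zlib.crc32(body)  # stand-in for the real page checksum
    return checksum.to_bytes(CHECKSUM_BYTES, "big") + body


def page_is_intact(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored value."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "big")
    return zlib.crc32(page[CHECKSUM_BYTES:]) == stored


def flip_bit(page: bytes, offset: int, bit: int = 0) -> bytes:
    """Simulate a fault in the I/O path by flipping one bit of the page."""
    damaged = bytearray(page)
    damaged[offset] ^= 1 << bit
    return bytes(damaged)


if __name__ == "__main__":
    page = write_page(b"accounting row data")
    print("clean page intact:", page_is_intact(page))
    print("damaged page intact:", page_is_intact(flip_bit(page, 100)))
```

A check like this only detects the damage at read time; it does not say whether the lightning storm, the switch, or the failing drive caused it, which is why the timing of the failure matters so much in the stories above.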