[Nov 18 2019] - Service degradation - [solved]

Stay up to date with shard happenings
Locked
Red Squirrel
Posts: 29193
Joined: Wed Dec 18, 2002 12:14 am
Location: Northern Ontario

Post by Red Squirrel »

There seems to be a weird issue with the database server. I'm at work with some access but can't really dedicate much effort to it right now; basically, it failed. The shard is still running, but if there happens to be any kind of reboot it will revert to about 5 minutes before this post.

Going to look at it further when I get home. Some script went haywire and started filling up the disk, so I think it's related.

You can still log in and play, but keep in mind there is a chance of a revert if it happens to crash for any reason.

Archived topic from AOV, old topic ID:6795, old post ID:39412
Honk if you love Jesus, text if you want to meet Him!

Post by Red Squirrel »

So this could be a while. Looks like the database server got compromised somehow. I really don't know how, as there is no real easy way to access it. I do run a Minecraft server off it, so it could be that there is a remote code execution vulnerability or something. All the file permissions are messed up, which is why everything is failing, but either way I need to rebuild the VM. I really don't have time to deal with this this week, so it will have to wait, and given the traffic on the shard I figure this probably won't be a big deal to anyone... I will keep the shard running anyway, but keep in mind there will almost certainly be a revert.

EDIT: May not be a compromise; there is a possibility the script actually did this. Something really got messed up somewhere to cause the script to go nuts in the first place, but there is a command to change permissions too, so it could all be related.

Anyway, calling it quits for now. I need to rebuild the VM either way and will do that another time. There probably won't be data loss, but I can't say with 100% certainty until I actually rebuild the VM and restore a backup.

I'm treating this like a compromise to be safe, but the more I think about it, the more I really think it's the script going nuts that did it.

Archived topic from AOV, old topic ID:6795, old post ID:39413

Post by Red Squirrel »

Managed to get it going. It's kind of crippled and I had to do some pretty nasty hacks to get everything to work, but I got it going.

Pretty much 99% sure it was not a compromise; it was really a script that went nuts for one reason or another. I have a really oddball disk I/O issue on my entire network that has been the bane of my existence since the very beginning. I cannot figure out what it is, but basically it starts causing my VMs to act messed up. I think that's what hit, and then it caused the script to go wonky.

So the database is back up, the shard never had to revert, and the save backlog is caught up. Rebooted the shard to make sure everything is good and it seems OK.

I should probably rebuild the database VM at some point but for now this will do.

Archived topic from AOV, old topic ID:6795, old post ID:39414