Browsed by
Author: Protocol_7

Disk Failure: An Extended Analysis

Today, I want to talk about my worst nightmare: data loss.

As a server owner, there is nothing more important than data. Collectively, we’ve spent thousands of hours developing and playing iPwnAge, and all of this work is represented as data. And so, naturally, we make sure it’s safe: data loss is creative loss.

We make backups as often as possible, and we even back up the backups across various physical locations and storage media. There are daily checks of hardware health, and always-on log analysis for instantaneous notifications of server errors. Everything runs off batteries to ensure that power loss doesn’t mean data loss. We hear stories of others’ catastrophic losses, and so we learn a lesson and add another safety net. And another. Until we’re so certain data loss can’t happen that we stop worrying. And when you stop worrying, shit happens.

 

Shit happened.

 

For the past two weeks, unbeknownst to me, the server’s main solid-state disk had been quietly dying. This disk stored Main, Survival, the server’s databases, and the main operating system. It was a 128GB Samsung 850 Pro. I personally own 11 of them (I just counted). They are solid drives that perform well under lots of small reads and writes, which is what a Minecraft server does a lot of. But nothing is perfect, and neither was this specific drive.

“But, aha”, I said. “That’s impossible. I’ve got monitoring tools on the drives every day to check disk health. If the drive was really dying, these tools would’ve reported it.” I knew S.M.A.R.T. tests aren’t a good method of predicting drive failure, but if a failure was actively occurring, they would surely know, I thought. Lmao @ myself. Turns out, the tools wouldn’t know a disk failure if it was staring them right in the face. I know that because a disk failure was staring them right in the face and they said “no error”.

That “no error” message was critical. In the logical overview of the backup system I’ve created, the results of the disk health test determine the whole flow. Before anything is copied, the system first checks the disk. If the tools say the disk is OK, it deletes the oldest backup and then makes a new one. If the tools say the disk is bad, it stops everything, freezes the existing backups, and notifies me. Do you see my error? I do. Hindsight is 20-20.
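To make the flaw concrete, here is a minimal Python sketch of the health-gated flow described above. It is not the actual backup code: the function names, the injected callbacks, and the four-slot limit are all assumptions made up for illustration.

```python
from typing import Callable, List


def run_backup_cycle(
    disk_is_healthy: Callable[[], bool],   # hypothetical wrapper around the health tool
    backups: List[str],                    # oldest first
    make_backup: Callable[[], str],        # hypothetical: writes a backup, returns its name
    notify: Callable[[str], None],         # hypothetical: alerts the admin
    max_backups: int = 4,                  # assumed slot count, for illustration only
) -> List[str]:
    """Run one cycle of the health-gated rotation and return the new backup list."""
    if not disk_is_healthy():
        # Disk reported bad: delete nothing, create nothing, raise the alarm.
        notify("disk health check failed; backups frozen")
        return list(backups)

    kept = list(backups)
    # Disk reported OK: drop the oldest backup first...
    if len(kept) >= max_backups:
        kept.pop(0)
    # ...then write a new one. If the "healthy" disk is silently returning
    # garbage, this step trades a good backup for a bad one.
    kept.append(make_backup())
    return kept


# A disk that always claims to be fine while producing garbage:
backups = ["good-1", "good-2", "good-3", "good-4"]
for n in range(4):
    backups = run_backup_cycle(lambda: True, backups, lambda n=n: f"garbage-{n}", print)
print(backups)
```

With a health check that always passes, four cycles are enough to rotate every good backup out, which is the failure mode this post is about.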

Even at this moment, smartctl returns no error when the 2-minute test is run.

The major flaw in the backup system was that I assumed too much. I placed too much emphasis on the presumptive accuracy of the health tool, smartctl. I thought that if the disk was “healthy”, then it was safe to delete the oldest backup. This was necessary, as a single day of backups was a whopping 2TB of space. We were capable of storing 4 days of backups, which was fine for the past 7 years. The only reason we needed historical backups was to fix grief. If hardware failed, no backups were deleted and no backups were made. We would just roll back to the most recent snapshot (which was at most ~50 minutes old). What we never anticipated was a scenario where the disk reported “healthy” but returned garbage data. It’s something I never imagined. Disks have a ton of error-detecting algorithms that let the operating system know if SHTF. Only, my drive never mentioned anything. Not until last week, when things got REALLY bad and it started spitting out “uncorrectable errors”. But by then, it was too late. All the good backups had already been deleted. What’s left is just backups of garbage data. Did you know that we changed our server name from iPwnAge to �@�@. @DX`S`z@@�`������� @� `���`�(`�`��@``@,A?
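One way to catch a disk that says “healthy” but returns garbage is to stop trusting the health tool alone and check the data itself. The Python sketch below is only an illustration under stated assumptions, not the fix that was actually built: it keeps a manifest of file hashes and flags any file whose size and modification time are unchanged but whose contents now hash differently. Every name in it (`build_manifest`, `silently_changed`, `safe_to_prune`) is hypothetical.

```python
import hashlib
import os
from typing import Dict, List, Tuple

# relative path -> (size in bytes, mtime in ns, sha256 hex digest)
Manifest = Dict[str, Tuple[int, int, str]]


def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large world files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: str) -> Manifest:
    """Record size, mtime, and content hash for every file under root."""
    manifest: Manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (st.st_size, st.st_mtime_ns, sha256_of(full))
    return manifest


def silently_changed(old: Manifest, new: Manifest) -> List[str]:
    """Files that look untouched (same size and mtime) but hash differently."""
    suspects = []
    for path, (size, mtime, digest) in new.items():
        prev = old.get(path)
        if prev and prev[0] == size and prev[1] == mtime and prev[2] != digest:
            suspects.append(path)
    return sorted(suspects)


def safe_to_prune(old: Manifest, new: Manifest) -> bool:
    """Only allow deleting the oldest backup if nothing changed silently."""
    return not silently_changed(old, new)
```

This only covers files at rest. Files the server is actively writing change legitimately every hour, and hashing terabytes of data has its own overhead, so a check like this would complement the hardware tests rather than replace them.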

 

So, what’s the good and bad news? Good news first: FTB and MMC are completely unharmed. They’re stored on a disk separate from the rest because y’all insist on generating 70GB maps. Bad news? Main, Survival, and some important server information are lost (most notably, the economy and previous bank levels). Also, the server will be offline for a bit longer while I rewrite the backup logic to prevent this from happening again. Also also, I’m going to be renting a new server specifically for permanent storage of weekly backups (that’s 110TB a year, so if you know a place that offers storage for cheap, lmk).

I’ll be bringing the disk to a professional data recovery service, so hopefully they can salvage some data. But I don’t have high hopes right now. I’ll keep everyone updated in Discord.

 

Side notes:

  • I’m not exaggerating: 24 compressed backups of the server total 1.97TB of disk usage.
  • The hour-long smartctl extended test does show a disk error. But my backup system used the 2-minute test in the interest of overhead reduction (hourly backups taking the whole hour to complete?).
  • I’ve been writing Aegis2, the new backup system, for a while now. It includes localized snapshots that only back up changes made within the past hour. The goal is to reduce the total amount of storage necessary, which lets us keep more history. But it would’ve included the same logic as Aegis1, so this failure would’ve still occurred.
  • When the disk went from quietly screwing things up to blatantly killing everything, MySQL resource usage skyrocketed: the DB was corrupted, and every time MySQL went to fix it, it hit a disk read error, crashed, and restarted.
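The “only back up what changed in the past hour” idea from the side notes can be sketched in a few lines of Python. This is not Aegis2 itself, just a minimal illustration that assumes changes are detected by file modification time; `files_changed_since` and its parameters are hypothetical names.

```python
import os
import time
from typing import List, Optional


def files_changed_since(
    root: str,
    window_seconds: float = 3600.0,   # assumed one-hour window
    now: Optional[float] = None,      # injectable clock, handy for testing
) -> List[str]:
    """Return relative paths under root modified within the last window."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # st_mtime is the last modification time in seconds since the epoch.
            if os.stat(full).st_mtime >= cutoff:
                changed.append(os.path.relpath(full, root))
    return sorted(changed)
```

A snapshot would then copy only the returned files. As the side note says, this saves storage but does nothing by itself about a disk that returns garbage while reporting “healthy”.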
Scheduled Server Downtime: Internet Service Upgrade

Hey all!

This is just a quick notice to all players: on Thursday, June 22nd, between approximately 12pm and 3pm, the server and all ipwnage.com services will be down for an internet service upgrade. We’ll be fitting the servers with a gigabit internet connection.

No map progress or server data will be altered during this downtime. Any questions? Just ask.

 

As of 4:10PM, the server is back online! Thanks for being patient.

Parrots, Terracotta, 1.12 Oh My! [UPDATE: 1.12.1]

Damn time flies. It’s not quite summer yet, but we’re close enough to warrant the good vibes 😎 , so happy summer everyone.

Today, Mojang released 1.12 to the public, and the good news is that iPwnAge already supports 1.12 clients. However, for now the server simply lets 1.12 clients join: there are no 1.12 features just yet. At the bottom of this article is a projected timeline for when 1.12 features will be brought to the iPwnAge Network servers. Read on if you want to know more about 1.12.

A blue Parrot in the World of Color 1.12 Update

So, what to expect in 1.12? Well, there are new blocks, new mobs, and a custom achievement system. There are other minor improvements, but also notable is the color palette change: dyes are more neon and poppy than they previously were. So if your pixel art looks off: blame Mojang.

For the new mobs, Parrots and Illusioners were added to the game. Parrots are mobs found in Jungle biomes that come in various colors. With seeds (beetroot seeds included), you can tame them as pets and they’ll follow you around similarly to wolves. Illusioners are a new hostile mob related to the illagers of the 1.11 Woodland Mansions, though in 1.12 they only appear when spawned with commands.

The new blocks are called Glazed Terracotta, and wow, they’re a contentious addition. Nevertheless, we’re still excited to see what players will be able to do with them down the road. Once 1.12 hits Main, we’ll be providing terracotta kits for players to use for free.

All the various Terracotta patterns. Kits will provide players in Main free access to all patterns.

 

And there are plenty more changes too, like the ability to dye beds and custom achievements. The server can now push “Advancements” to your clients, which is a way for individual servers to add achievements for their community to earn.

“Press F to Pay Respects” achievement, anyone?

 

1.12 iPwnAge Network Timeline

  1. Network level 1.12 support for backwards compatibility DONE!
  2. Spigot Team releases full 1.12 support (ETA: Friday, June 9th)
  3. Apply all Rei patches to Spigot 1.12 (ETA: Friday, June 9th)
    1. We run our own server software here, which is a combination of public works and our own changes to make iPA more performant and unique.
  4. Test server for 1.12 goes live (ETA: Friday, June 9th)
  5. Iterate over community plugins and update all necessary 3rd party components (ETA: Saturday, June 10th)
  6. Official iPwnAge Network 1.12 release (ETA: Saturday, June 10th)

We’ll be providing plenty of updates as the hours tick on and we march towards full 1.12 support. Looking forward to it!

 

UPDATE: We’re now live on 1.12.1!

 

 

Scheduled Server Maintenance: All Servers will be Offline!

Hey everyone! This is a quick notification to all that the server will be down for hardware maintenance this Sunday, March 19th from 10am EST to 11:30am EST. This maintenance is necessary to upgrade storage capacity as our server maps continue to grow. During this time, all three Minecraft servers will be offline. Teamspeak and the website will continue to function like normal.

Winter Updates & Architect

Woah, where does the time go? We’re almost halfway through February and it’s been a while since an update post. Well, let’s fix that!

Not much is happening in FTB these past few weeks. About two months ago, we “temporarily” prevented new Twilight Forest portals from being created as a way to limit TWF land generation. The result was really amazing: since then, the server has only crashed 3 times, 2 of which were WorldEdit-related crashes. This means two things: A) Twilight Forest was responsible for the near-daily crashes we experienced in FTB, primarily due to a bug in land generation, and B) Twilight Forest exploration has also dropped drastically. While we introduced the temporary ban as a stopgap measure, we did not expect that a solution would take this long. It’s a significant problem that we’re aware of, and actively trying to solve. We hope that in the next two weeks, we’ll have a permanent solution so that players can continue to use Twilight Forest as normal.

Vanilla 1.12’s new Glazed Terracotta blocks

In other news, Mojang has released 1.12 snapshots for public use, which (hopefully) means the 1.12 release is right around the corner. This update brings new, vibrant terracotta blocks, and a redesigned color palette for clay and wool blocks. We’ll keep you all updated as the release of 1.12 gets closer.

And, lastly, Architect. It’s one of our most OP ranks, reserved specifically for creative players who enjoy building. The rank grants the typical item spawning and flying, plus unrestricted WorldEdit, Voxel, and WorldGuard use as well. There’s no time requirement for this rank, to boot. If you love to build, but hate the item gathering process of Minecraft, this rank is great for you. Which is why we’re opening two Architect slots. Doesn’t matter if you love playing in FTB or Main, the rank applies to both servers (not the survival server, since that’s purposefully a vanilla-only server). Think you got what it takes? Head on over to our Apply for Architect page and submit your builds! We’ll announce the two winners on a rolling basis.

Scheduled Maintenance & Staff Position Opening

Hey all! Hope the holiday mad craze hasn’t got you down yet. This is just a brief notice to let everyone know that on Saturday, December 17th at 11:59PM Eastern Standard Time, the server will go offline for approximately 30 minutes to an hour for routine server hardware diagnostics and upgrades. During this time, all regular servers will be inaccessible. Instead, there will be a fallback server set up with GM1, and players are free to be as degenerate as they wish. No changes to the map will be saved during that 30-60 minute interval.

I also want to announce that Drew, Termites, and I are officially opening up a fourth staff position, and we’re accepting applications. If you’re interested, send us an email at staffapps@ipwnage.com with a paragraph or two about you, and why you wish to become staff. Any prior experience with server administration or leadership positions should be mentioned. Brag about yourself! The application window will close Saturday, December 17th at 11:59PM EST (the same time the server goes down for maintenance), so if you’re interested, don’t hesitate!

‘Tis The Season!

It’s December, which means we can say “Happy Holidays!” and you’re not allowed to get mad. So, Happy Holidays! 🎄

 

We’ve got some updates and changes to the servers that are worth mentioning, so we’ll go over those first. On the FTB server, we’ve introduced Mineworld, a dimension specifically crafted for quarry usage. Players are encouraged to move their quarries over to the new world, but there are no rules as to where quarries are put. If you still want to quarry in the Overworld, that’s still okay! This new dimension is a super-flat, plains biome with normal ore generation. /rtp is enabled to quickly find a spot to quarry, and we ask that players keep their spots to under 500×500. To get there, use “/warp mine”, or “/warp mineworld”.

Mineworld: The most awesome, boring world you can imagine!

Thanks to some new players, we’ve discovered some bugs within FTB mods, and we’ve updated the Known Bugs page accordingly. Unfortunately, none of the newly found bugs have an official fix, so our solutions are mostly workarounds (such as dropping an item and picking it back up again 😉 ).

Continuing with FTB news, we’ve enabled the ability to teleport to coordinates for all players Trusted or above. This allows the built-in FTB Waypoint system to teleport to set Waypoints. Pretty neato! For both the Main and FTB servers, we’ve reduced the rank requirement for private warps to Commoner. Now, any player Commoner or above can use /warp pcreate to create unlimited personal, private warps.

The Tinker Table, in its craftable glory

We’ve also fixed MachineMuse recipes, so all Power Suit Armor and related components are completely craftable. We’re using the Thermal Expansion recipes, in case you were wondering. In other meta-server news, we’ve begun work on separating the Twilight Forest map from the server and placing it on a standalone server. That way, if Twilight Forest crashes, it only crashes the players within the Twilight Forest! We’re not sure if it’ll work as well as we want it to, and so this functionality may never be implemented in the real FTB server. But for now, we’re working away in our development software trying to find a solution to this very unstable mod.

Ronin_Jedi and Peebs are also offering a giveaway of their current base! If you’re interested in joining, all you need to do is comment below on this post! The winner will be drawn on January 1st, 2017! Here’s a brief overview of their current base.

We’ll be in a great holiday mood, so keep a lookout for random events and gift drops as we get closer to “the most wonderful time of the year”. As with most holidays, Spark will be offering some gifts to all on Christmas Day, so after you unwrap your gifts, stop by iPwnAge for an in-game gift as well 🙂