Archived:February 2023 lag incident

{{Infobox event
| title                    = <!-- Title to display, if other than page name -->
| image                    = Feb2023incident.png
| image_upright            =
| image_alt                =
| caption                  = A screenshot of the unplayably low TPS during the incident.
| native_name              =
| native_name_lang        =
| english_name            =
| time                    =
| timezone                =
| duration                = {{time interval|February 4 2023 1:48 a.m.|February 7 2023 6:42 p.m.}}
| date                    = <!-- {{start date|YYYY|MM|DD}} or {{start and end dates|YYYY|MM|DD|YYYY|MM|DD}} -->
| venue                    =
| location                =
| coordinates              = <!-- {{coord|LAT|LON|region:XXXX_type:event|display=inline,title}} -->
| also_known_as            =
| type                    = Server lag
| theme                    =
| cause                    = High resource usage on server host's end
| motive                  =
| target                  =
| perpetrator              =
| first_reporter          =
| budget                  =
| patron                  = <!-- or |patrons= -->
| organisers              = <!-- or |organizers= -->
| filmed_by                =
| participants            =
| outcome                  =
* World rolled back 5 days
* User homes lost
* Major decline in user activity, leading to the [[Great Reset]]
| casualties1              =
| casualties2              =
| casualties3              =
| reported deaths          =
| reported injuries        =
| reported missing        =
| reported property damage =
| burial                  =
| displaced                =
| inquiries                =
| inquest                  =
| coroner                  =
| arrests                  =
| suspects                =
| accused                  =
| convicted                =
| charges                  =
| trial                    =
| verdict                  =
| convictions              =
| sentence                =
| publication_bans        =
| litigation              =
| awards                  =
| url                      =
| blank_label              = <!-- or |blank_data= -->
| blank1_label            = <!-- or |blank1_data= -->
| blank2_label            = <!-- or |blank2_data= -->
| website                  = <!-- {{URL|example.com}} -->
| notes                    =
}}
 
The '''February 2023 lag incident''' began when [[Attempt2]] suddenly began to experience extreme TPS lag, to the point of rendering the server entirely unplayable. Because every attempted fix, up to and including wiping the entire server, failed to resolve the issue, the server administration believed that it was caused by a fault on the host's end. This was confirmed on February 7th, 2023, when the server host's technical support team reported that the server was running on a node experiencing high resource usage and moved the server to a different node, which immediately resolved the issue.
 
The failed attempts at fixing the issue resulted in the rollback of the world by several days, the loss of several plugin configuration files, and several days of downtime. However, the affected plugins were successfully reconfigured, and compensation was provided to all affected players.
 
== History ==
 
=== PixelPrinter and server performance ===
Attempt2 has used the PixelPrinter server plugin since its inception for detailed graphics in and around the world. A main function of this plugin is CreateFrame, which lets the user display downloaded images using item frames. When used excessively, this can generate hundreds to thousands of item frames, causing heavy entity-based lag. Although PixelPrinter is not accessible to non-staff, staff have occasionally been seen using it as a method of griefing.
 
Attempt2 used 5GB of server RAM before the lag incident. Server owner [[nc77812]] stated multiple times that he planned to upgrade the RAM, but never followed through. The server had not experienced major performance issues before the incident, although its predecessor server, The PLA Network, did experience a major lag incident after its August 2021 public opening. This was remediated by switching from the Spigot API to the Paper API.
 
== Incident ==
 
=== PixelPrinter Grief ===
On the evening of February 3rd, 2023, the server had been fairly active. Mod [[Serenity7321]] decided to cover player [[dwrr_]]'s attic with hundreds of PixelPrinter-generated item frame pictures of Chinese communist revolutionary [[wikipedia:Mao_Zedong|Mao Zedong]], which had previously been downloaded to the server by Admin [[RandomUser34]]. She had carried out the same picture-based grief a few days earlier in the [[Mount Xavier Compound]] storage room, totaling 855 item frames. This time, however, the grief used a significantly larger number of item frames, causing the server to experience item-frame-based entity lag. Serenity removed all item frames shortly afterward with the /killall server command, and the server recovered immediately.
 
=== Decision to upgrade server memory ===
Server owner nc77812 decided shortly thereafter to upgrade the server RAM from 5GB to 8GB, effective immediately, announcing his decision in the Attempt2 Discord server's #voice-general chat. He cited future performance concerns as the reason to upgrade. The server then restarted at approximately 2:00AM EST.
 
=== Beginning of Incident (2:00-4:00AM EST) ===
After the server restarted, lag issues immediately became apparent. As people joined, chunks around them failed to load. Server TPS (ticks per second) dropped from its usual average of 20 to an average of roughly 1 to 5, rendering the server virtually unplayable. nc77812 and RandomUser34, with feedback from dwrr_, attempted to troubleshoot the problem by mass-killing entities and by installing the Spark server diagnostics plugin to look for a root cause of the lag. As it became clearer that their actions were leading nowhere, several other fixes were suggested. RandomUser34 and dwrr_ suggested that the issue could be fixed by switching the server to the Fabric API, a move which was swiftly rejected by nc77812, who insisted the issue instead involved the server world, citing previous issues with PixelPrinter-derived entity lag.
 
Eventually, nc77812 made the executive decision to delete the current server world and revert it to the last backup, from January 30th, 2023, explaining that this would be easier than switching the entire server and its plugins over to Fabric, which used a completely different plugin/addon system than Paper/Spigot. This action generated immediate controversy among the playerbase, but was nonetheless executed.
 
=== Continuation of Incident (Feb 4-5) ===
Server Admin RandomUser34 continued investigating the server lag issue. After he tried many things, including running the server on the Spigot, Paper, and Fabric APIs, and even running it as a clean-slate vanilla server with a new world, it became clear that the issue was the fault of the server host. He contacted host support immediately afterward. After the incident, it was discovered that he had accidentally wiped both the server's End and Nether worlds, for which there was no backup.
 
=== Support resolves issue (Feb 5-7) ===
Support was relatively slow to respond because the incident occurred on a weekend. After about 2 days, support acknowledged that the node hosting the server had been overwhelmed with high resource usage, which had essentially throttled the server's performance to the point of unplayability. The host then transferred the server to another node free of charge. The server returned to normal performance immediately after the transfer and reopened later in the day on February 7th.
 
== Aftermath ==
Multiple builds constructed between January 31st and February 4th, along with all of the server's End and Nether progress, were lost in the incident. Owner nc77812, who faced backlash due to his mishandling of the incident, pledged that all of those impacted by the incident would be "reimbursed" by "any means necessary".
 
The incident coincided with a vote on server [[referendum]] Proposition Two. Due to the unusual circumstances surrounding the poll's timing, its results were declared null and void, and a new poll is currently being conducted.
 
By February 9th, a new public Enderman farm had been built to replace the one lost in the incident, as well as a new spawn-end gateway.