
T-Mobile explains why its network went down, hard, on Monday

Posted on 6/17/20 at 4:08 pm
Posted by AUFan2015
Oneonta, Alabama
Member since Oct 2013
1849 posts
The Verge

quote:

If you’ve been wondering what could knock out one of the United States’ three big cellular carriers’ ability to deliver calls and text messages — and keep it that way for most of an entire day — T-Mobile now has a partial answer that pertains to its extensive nationwide outage Monday.


quote:

The short version, if we’re reading this correctly: a fiber-optic circuit failed, and its backup circuit also failed, which caused a chain reaction that strained the network to the point that many calls and texts couldn’t make it through.


quote:

The longer version:


quote:

Many of our customers experienced a voice and text issue yesterday, specifically with VoLTE (Voice over LTE) calling. My team took immediate action — hundreds of our engineers worked tirelessly alongside vendors and partners throughout the day to resolve the issue starting the minute we were aware of it.


quote:

Our engineers worked through the night to understand the root cause of yesterday’s issues, address it and prevent it from happening again. The trigger event is known to be a leased fiber circuit failure from a third party provider in the Southeast. This is something that happens on every mobile network, so we’ve worked with our vendors to build redundancy and resiliency to make sure that these types of circuit failures don’t affect customers. This redundancy failed us and resulted in an overload situation that was then compounded by other factors. This overload resulted in an IP traffic storm that spread from the Southeast to create significant capacity issues across the IMS (IP multimedia Subsystem) core network that supports VoLTE calls.

We have worked with our IMS (IP Multimedia Subsystem) and IP vendors to add permanent additional safeguards to prevent this from happening again and we’re continuing to work on determining the cause of the initial overload failure.



quote:

It’s not clear which third-party provider’s fiber circuit failed. There was a report on Monday that Level 3, one of the world’s major internet backbone providers, was experiencing an outage, but a spokesperson told TechCrunch differently.
Posted by SG_Geaux
Beautiful St George
Member since Aug 2004
77976 posts
Posted on 6/17/20 at 6:09 pm to
Someone probably rolled out a bad firmware update or something stupid like that.
Posted by Helo
Orlando
Member since Nov 2004
4590 posts
Posted on 6/17/20 at 6:18 pm to
So a fiber circuit failed, its backup failed, and those individual circuits caused a cascade that brought down its entire network?

Seems legit
Posted by Hulkklogan
Baton Rouge, LA
Member since Oct 2010
43299 posts
Posted on 6/17/20 at 8:08 pm to
I don't know enough about cell service provider networks to understand how you could have some kinda "IP storm". I know they use a lot of multicast, so I kind of wonder if the link failures caused a bug to surface that caused multicast duplication or flooding. That's about all I can think of at that scale.

I've seen a 7609 line card bug out and spew garbage into a layer 2 backbone before, which crippled several core networks within a small ISP, but I sincerely doubt T-Mobile's network relies heavily upon layer 2.
This post was edited on 6/17/20 at 8:12 pm
Posted by SarahRobinson123
Member since Jun 2020
1 post
Posted on 6/17/20 at 9:23 pm to
very new
Posted by Inadvertent Whistle
Atlanta, GA
Member since Nov 2015
4375 posts
Posted on 6/18/20 at 7:08 am to
A lot of T-Mobile's network is using voice over IP technology instead of a traditional phone circuit system. The fiber system failed, and the backup system should have routed traffic to prevent the overload but it either failed too or wasn't properly configured. After that all hell broke loose. That's why it was mostly major cities that had the issue. Some of the areas on older tech didn't have the problem.
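To make the "backup should have absorbed the traffic" idea concrete, here is a rough toy sketch in Python. It is not how T-Mobile's network actually works: the link names, the capacities, and the even-split rule are all made-up assumptions, just to show how losing a circuit and its backup together can push the remaining links over capacity one after another.

```python
def simulate_cascade(capacity, total_load, failed):
    """Return the set of links that end up down after load is redistributed.

    capacity   -- dict of link name -> maximum load it can carry (hypothetical units)
    total_load -- total traffic that has to be carried somewhere
    failed     -- set of link names that fail initially
    """
    down = set(failed)
    alive = set(capacity) - down
    while alive:
        # Toy assumption: traffic is split evenly across the surviving links.
        share = total_load / len(alive)
        overloaded = {name for name in alive if share > capacity[name]}
        if not overloaded:
            break  # the survivors can carry the load, so the cascade stops
        alive -= overloaded
        down |= overloaded
    return down


if __name__ == "__main__":
    # Hypothetical links and capacities, chosen only for illustration.
    links = {"primary": 100, "backup": 100, "metro_a": 80, "metro_b": 80}
    print(sorted(simulate_cascade(links, 200, {"primary"})))
    print(sorted(simulate_cascade(links, 200, {"primary", "backup"})))
```

With these made-up numbers, losing only the primary circuit is absorbed by the other three links, but losing the primary and the backup together overloads the two remaining links and everything ends up down.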
Posted by arcalades
USA
Member since Feb 2014
19276 posts
Posted on 6/18/20 at 3:08 pm to
quote:

a fiber-optic circuit failed, and its backup circuit also failed,
that's a non-answer. something failed. we already knew that. what about it failed?
Posted by Texas Weazel
Louisiana is a shithole
Member since Oct 2016
8533 posts
Posted on 6/18/20 at 5:00 pm to
They were supposed to migrate over a million Sprint users to the T-Mobile network that day. My guess is that process broke something with regard to routing.
Posted by 50_Tiger
Dallas TX
Member since Jan 2016
40093 posts
Posted on 6/19/20 at 3:32 pm to
Again it was Cisco.

TMO just fell on the sword for freebies.