Following the stunning July 8 mass internet outage which affected roughly half of all Canadians, the company at the centre of the storm said a coding issue caused the shutdown.

The pandemic has made the internet even more vital to running businesses. Today, much of daily life runs remotely, from news outlets needing to share information to students needing to connect to their classes through video platforms.

How can major tech companies avoid a shutdown to this same degree in the future? Is a world so dependent on internet connection possible if infrastructure is so fragile that one coding error can cause a 24-hour internet blackout that sidelined key 911 emergency systems and banking machines?

Alexis Kuper, a student at uOttawa and a web developer for a company in Harriston, Ontario, was personally affected by the outage.

“I had to push all my deadlines and meetings,” she said. 

Luckily for Kuper, because so many people in the country were experiencing the same issue, many of her clients were also cut off and were understanding of her situation.

For Kuper, projects that were meant to be completed were delayed several days.

Kuper’s experience in web development has taught her that the simplest thing, such as putting an exclamation mark in the wrong place, can cause a system crash and the error can be hard to find.

Two weeks after the outage, Rogers apologized.

In the letter, a company spokesperson said, “We have identified the cause of the outage to a network system failure following an update in our core IP network during the early morning of Friday, July 8. This caused our IP routing network to malfunction.”

“Software is prone to human error,” says Alastair Lewis, a senior technology consultant at IBM, who studied computer science at Queen’s University.

“Whether it’s a business trying to do credit card transactions, or a personal phone call, it’s going through this communication system. At the core of it, it’s software that’s doing all of the heavy lifting.”

Service companies are constantly trying to improve systems so that they can be the fastest, most secure and most reliable for businesses and individuals. To stay ahead of the curve, system updates are routine. 

However, as Lewis explained, “every time that you make a change, and you have to deploy it to your massive infrastructure across Canada, there’s definitely a chance for somebody to screw up somewhere along the process.” 

Despite the nature of the human error, “somebody like Rogers should still be held accountable for having an outage like that, which affects millions of people and businesses,” said Lewis. 

“As a company, if you’re working on something that’s mission-critical, like Rogers, or Bell with their communication networks, you should have a number of safety practices, and checks that you have to go through before you make a change like this.”  

Despite safeguards, the error somehow managed to slip through and cause an almost unheard-of mass outage.

“A company like Rogers should have the necessary procedures in place to make sure that if a developer accidentally writes a bad line of code, it doesn’t knock out communications for two-thirds of Canadians. Obviously they should have that in place, and I’m sure they do to some extent,” Lewis says. He believes Rogers’ lack of proper safety practices most likely led to the outage. Whatever happened, the outage is now under review by the Canadian Radio-television and Telecommunications Commission.

As well, the federal government has tasked Canada’s major telecommunications networks with establishing a formal agreement to mitigate the damage of future outages.

Following a recent closed-door meeting with the CEO of Rogers and the heads of other telecommunications service providers, the group was given 60 days to consider emergency roaming, mutual assistance during outages and building out a communication protocol to better inform the public and authorities of any emergencies.

Barry Cross, a business professor at Queen’s University’s Smith School of Business, says that the biggest issue Rogers will face in the coming months is customer confidence. “Right now, there’s realistically a pretty significant loss of confidence, loss of faith inside of the Rogers organization,” said Cross.

“For small businesses, this was catastrophic,” Cross continues, saying small businesses “would definitely lack some confidence in Rogers’ ability to continue to provide seamless and reliable service at this point.”

The lack of internet wasn’t the only source of irritation for Rogers customers. As Cross puts it, “One of the problems is that they haven’t really done a good job defining what the problem was, in exact words.” Over the almost 16-hour period, there was widespread confusion about the source of the problem and when service would be up and running again.

“We don’t often think of the things that can go wrong until they’ve actually happened to us,” Cross said, describing the outage as a wake-up call for businesses to consider their providers’ risk management strategies.

“Most organizations can only deal with a single provider of connectivity. So it’s not like you can have some type of an easy backup.” The outage could prompt businesses to question the risk management strategies of their service providers.

“Organizations are thinking about that now, they’re asking the providers ‘Show me that you’ve got some type of risk management of your own in place to prevent this from ever happening again.’”