I was reading the news on CNN recently about a power outage in California, which reminded me of some serious blackouts in Costa Rica a while back. A friend was there on business, and an outage ruined an important meeting that he had spent months setting up. It also reminded me of the Northeast blackout of 2003, which caused some problems for us at home.
Settling for less – telephone communication systems
At that time, I still had several "land-line" telephone sets in my house because I wanted to be able to quickly check on my elderly mother in the event of a blackout. And sure enough, in 2003 I was able to contact my mother to see if she was okay, and if she needed anything. People relying on cell-phone communication were not able to do that because the cell-phone networks were overloaded.
And that got me thinking about how reliable the old central-office telephone communications systems were. They offered five nines of reliability – 99.999% availability, little more than 5 minutes of downtime per year. These systems (housed in brick buildings called Central Offices) had their own power sources, and your telephone sets were powered by the Central Office through the land-line connections. Most of the business telephone systems back then were equally robust. Same with hotels.
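Those "nines" translate directly into a downtime budget. As a quick illustration (a minimal sketch of the arithmetic, not tied to any vendor's numbers), here is what each availability level allows per year:

```python
# Downtime budget per year implied by an availability level ("nines").
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960 minutes

def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of allowed downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR

for nines, availability in [(3, 0.999), (4, 0.9999), (5, 0.99999)]:
    print(f"{nines} nines ({availability:.3%}): "
          f"{downtime_minutes_per_year(availability):.1f} min/year")
```

Five nines works out to roughly 5.3 minutes of downtime per year – the figure the old Central Offices were built to meet. Three nines, by contrast, allows nearly nine hours.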
It’s a different story today. The old Central Office infrastructure is still there, but if folks even have land-line telephone sets in their houses, they’re typically powered sets that will not work when the power fails. (Is your home phone plugged into a UPS? Mine’s not.) If there were another 2003-style power outage tomorrow, the cell-phone networks would again fail as everyone called their mothers at the same time. And most of those fancy land-line phones wouldn’t work either.
Business phone systems are nowhere near as robust as they used to be either, but generally they’re plugged into UPS systems, so they’ll still work – for an hour or two. Hotel phones are about the same, although some older hotels and motels still run antiquated telephone systems, so they may be okay.
Settling for less – computing systems
The same thing is happening with computing systems, and right or wrong, I blame Microsoft. You see, years ago most business computing was done on mainframe computers, supercomputers or minicomputers. Minicomputers are pretty much extinct. Supercomputers are used in specialized scenarios – university labs, scientific research, etc. Mainframe computers are used only by the biggest banks, financial services companies, insurers and retailers. They too are gradually going away, but that may not be a good thing.
The turning point was the introduction of the IBM PC (running DOS and then Microsoft Windows), which changed everything, for better and for worse. The big positive change was cost – now even the smallest companies could afford business computing. The negative side didn’t seem too bad. Sure, this new form of computing was less powerful, but it was good enough. Sometimes these affordable computers would freeze up, but you could turn them off and back on again, and that would usually do the trick – back to work. Sometimes we needed to reload the software, but that usually worked too. In some cases we needed to reload the OS, or even buy a whole new computer. But in the greater scheme of things, that was okay, too. Eventually we got used to this type of performance – system failure was okay. Why? Because it was only temporary, and usually, we were up and working again soon. But mostly it was okay because it was cheaper.
Conditioned to low availability
So now we are conditioned to telephone communications that are less than 100% available, and business (and personal) computing that is less than 100% available. It’s all good – we’re used to it. But imagine if everything were like that. Imagine if airliners were less than 100% available. What about the equipment and systems used in medical procedures? The power grid. Nuclear reactors.
Well, there are instances where these things are not 100% available. For example, airline crashes are far more common when proper time, money and training are not invested in maintenance. While a medical procedure may be too costly at home, you may elect to get it done in India – where the risk may be greater. Chernobyl was a prime example of cutting corners on design, construction, maintenance and training, while the Fukushima nuclear accident was an example of insufficient disaster preparedness.
There are many examples of over-engineering – think high-end Swiss wrist watches, German-made automobiles, Japanese digital cameras, laser-guided hand scissors. All very reliable and famously over-engineered. And costly, for the most part. But the reverse – under-engineering – is far worse.
The dark side of cheap
Think of this: You may not care that you need to reboot your PC every once in a while, that you may occasionally lose some work, that your business telephone systems may not survive long after a power outage, that your eBay bid doesn’t go through on that cool Fitbit that you wanted, that your Google search didn’t go through. You can just redo that effort – it’s a small price to pay for cheap computing.
But what if one of your paychecks doesn’t make it into your account because of a minor computer outage at your bank? What if your tax return doesn’t make the deadline because your accounting firm’s system crashed? What if that stock transaction didn’t happen when you needed it to happen? What if your insurer’s check doesn’t come through after your house burns down because of a minor computer outage at your insurer’s datacenter? Well, that sure as hell matters.
For the most part, these “glitches” will be corrected before you’re even aware of them. But sometimes not – sometimes you’ll have to take action to set things straight. And sometimes someone might not believe you… and yes, that’s a big deal. Maybe not as much as an airline crashing, a botched medical procedure in India or a nuclear melt-down. But less-than-reliable computing when it comes to your own personal finances? It matters.
It actually matters a whole lot to the big banks and financial services companies – for very similar reasons. You could probably estimate the cost of downtime on your small business server – a few hours, or a weekend’s worth of hours, multiplied by what you consider your hourly rate. Perhaps $100 per hour? $500? $1,000? Expensive, but manageable. Across the board, though, Gartner said that downtime costs businesses an average of $42,000 per hour (that was 5 years ago). On the other hand, one of the big banks might measure the cost of downtime at a million dollars per hour or more. Amazon once admitted to losing over $2 million in revenue in a 13-minute outage window.
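A back-of-envelope sketch puts those figures on a common footing (using only the numbers quoted above; the small-business rate is illustrative, not a measurement):

```python
# Rough downtime cost: an hourly cost pro-rated over the outage length.
def downtime_cost(cost_per_hour: float, outage_minutes: float) -> float:
    """Dollars lost for an outage of the given length."""
    return cost_per_hour * outage_minutes / 60.0

# A small business at $500/hour, down for a 48-hour weekend:
small_biz = downtime_cost(500, 48 * 60)    # $24,000

# The Gartner average quoted above, for a two-hour outage:
gartner_avg = downtime_cost(42_000, 120)   # $84,000

# Amazon's reported $2M loss over 13 minutes implies an hourly rate of:
implied_hourly = 2_000_000 * 60 / 13       # roughly $9.2M per hour
```

Scaled up or down, the formula is the same – what changes by four orders of magnitude is the hourly rate, which is exactly why a bank values availability so differently than a small business does.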
Settling for less
So in the end, are we better off? That’s debatable. Everyone loves their smartphones – at least they never want to be without one. Everyone loves cheap computing, but nobody wants to lose a financial transaction. There’s no doubt that “less” is easier to manufacture and to maintain, and maybe “less” is better for the most part – that is, until something goes wrong. “Less” is absolutely cheaper, again until something goes wrong, and then it’s a costly disaster. Personally, I could live without my phone for a while, but I’d feel better if my bank just kept those mainframe systems.