COBOL

It’s not “cobalt” – though it is often blue. It’s not out-of-date – though programs written in it 60 years ago require very little updating to continue working today. It’s not inefficient – though many of its newest efficiencies don’t require any programming skill to take advantage of.

COmmon Business-Oriented Language, or COBOL, is the programming language that continues to run the world economy, with over 250 billion lines of it currently running, much of it on IBM Z mainframes, and a large portion of that running on-line under CICS.

How did this language, whose creation Rear Admiral Grace Hopper helped lead beginning in 1959, five years before the IBM System/360 mainframe was announced, become so embedded in business processing as to be nearly invisible? And why, with so many other languages to choose from, is it still the best choice for world-class, high-volume business data processing?

To answer these questions, it may help to rewind to the world immediately prior to the birth of both COBOL and System/360 and see what needs they filled.

Before COBOL and S/360

As anyone who has watched great proto-computing movies like “The Imitation Game” or “Hidden Figures” knows, the earliest computers were people who used mechanical, electric, and electronic machines to help them do their jobs, and the idea of that job belonging entirely to an automated device took some time to arrive. Indeed, it wasn’t until a decade after the end of the Second World War that the state of computing was sufficiently advanced for SHARE, the world’s first computer user group, to be founded in 1955. Then the advancement of computing got serious, as business, government, academic and military organizations began to push for advances that would meet their operational needs.

By this time, John von Neumann’s stored-program architecture had largely caught on, but computer programs were still being written in machine language. They were difficult to enter into a computer and retain, and they had to be rewritten (and re-debugged) from scratch every time a new computer with a new architecture came out.

So among the first ease-of-use innovations to arrive in the computing world were text-based programming languages. The most important of the earliest was FORTRAN, whose development began in 1954. It allowed a program to be written once as text and then compiled into machine language. It was a very numerically focused language with a somewhat sparse syntax, but it was a strong start. Indeed, work on it was already under way before symbolic Assembly language had become the standard alternative to raw machine code.

Assembly language had a very different syntax and intention. While FORTRAN created a conceptual layer that allowed the programmer to focus on the task at hand rather than the computer architecture, Assembly language instructions mapped one-to-one to machine language ones. An Assembly program was therefore tightly tied to the individual computer architecture it was written for, and could not be readily ported (except by using an emulator or other backward-compatibility approach).

These early approaches to programming computers led to many lessons learned about how computers were used, what results were expected from them, and how computer programmers functioned effectively.

Building on these lessons, and on experience with earlier efforts such as Grace Hopper’s FLOW-MATIC, work began on April 8, 1959 on a new programming language that would meet the business and programming needs that had been identified and set the direction for future computing, with Hopper as one of its guiding figures. Two key elements resulted: first, the ability to be compiled for multiple different computer architectures; and second, an orientation toward the kind of data processing that businesses do at the highest volume, namely decimal math and printable characters. By the end of the following year, the first working COBOL compilers were available.

The decade that began in 1959 was a time of vision and planning for the long-term future, and Grace Hopper certainly took that perspective. The language was designed to last, continue to meet the evolving needs of the business computing world, and be as easy as possible to program, debug and maintain. Indeed, she’s reputed to have said something like, “I don’t know what the programming language of the future will look like, but I know it will be called COBOL.”

There may be no greater demonstration of the success in designing COBOL to meet the business world’s programming needs than the hardware architecture that followed its introduction half a decade later. I know better than to suggest that the good folks who created the System/360 architecture were merely flattering COBOL by imitating it, but the two take such similar approaches to decimal and character data that it amounts to a kinship shared by few other languages or computer architectures. And it meant that COBOL could compile into machine language that very closely reflected the original structure and, especially, data of the source program.

COBOL genuinely “redefines” our understanding of how we handle data. In fact, it is my observation, having written and maintained hundreds of thousands of lines of COBOL, that once you finish defining your variables, a good COBOL program feels like it’s practically writing itself. It’s almost like turning John von Neumann on his head, as the stored data becomes the proto-program.
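To make that concrete, here is a minimal, hypothetical sketch of the kind of DATA DIVISION that does most of the heavy lifting (all field names are invented for illustration): fixed decimal and character fields describe the record exactly as it lives in storage, and REDEFINES lets the same bytes be viewed in more than one way.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> A hypothetical account record: character and decimal fields
      *> laid out exactly as they will sit in memory and on disk.
       01  ACCOUNT-RECORD.
           05  ACCT-NUMBER         PIC X(10).
           05  ACCT-NAME           PIC X(30).
           05  ACCT-BALANCE        PIC S9(11)V99 COMP-3.
           05  ACCT-OPEN-DATE      PIC X(8).
      *>   The same eight bytes, viewed as year, month and day.
           05  ACCT-OPEN-DATE-YMD  REDEFINES ACCT-OPEN-DATE.
               10  ACCT-OPEN-YYYY  PIC 9(4).
               10  ACCT-OPEN-MM    PIC 9(2).
               10  ACCT-OPEN-DD    PIC 9(2).

Once the data has been described that precisely, the procedural code that follows tends to be short and almost self-evident.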

Face it: when it comes to high-end business processing, it’s all about the data. And if your programming language and hardware architecture know how to handle it without dropping even a bit due to binary floating point conversions and rounding errors, they might be ready for the big leagues.
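As a small, self-contained illustration of that point (program and field names are invented): with fixed-point decimal fields, adding ten cents a million times gives exactly 100000.00, with none of the drift that binary floating point can introduce.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. DECDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PENNY-RATE     PIC S9(3)V99  COMP-3 VALUE 0.10.
       01  WS-RUNNING-TOTAL  PIC S9(11)V99 COMP-3 VALUE ZERO.
       01  WS-I              PIC 9(9) BINARY VALUE ZERO.
       PROCEDURE DIVISION.
      *>   Packed decimal represents 0.10 exactly, so the total is
      *>   exactly 100000.00 after one million additions.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 1000000
               ADD WS-PENNY-RATE TO WS-RUNNING-TOTAL
           END-PERFORM
           DISPLAY 'TOTAL = ' WS-RUNNING-TOTAL
           GOBACK.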

Thesis

Here, then, halfway through this article, is my thesis: COBOL, particularly as running on the IBM Z architecture, continues to be the very best programming language for maintaining established and building new applications that rapidly, constantly and reliably process vast quantities of business data (i.e. decimal numeric math and displayable character data).

Of course, I need something to contrast it with, and so I’m now going to go around and poke a few eyes. If you feel personally insulted, meet me at SHARE and I’ll get you a drink at the reception to make it up to you.

Platform, Performance and Design

There are three categories of application platform and language combinations that I want to contrast COBOL on Z with, in order to make my point.

And the first of these is the hardware platform.

You don’t have to run your computer programs on IBM Z. You could run them on Intel, or a mobile platform, or some high-end UNIX-oriented architecture. Each of those has its strengths. But none of them was built to handle the sheer volume of data throughput that IBM Z handles. They’re not meant to. They’re either commodity consumer electronics computers that can be cobbled together based on the best price/performance components available at any given moment, or they’re designed for supercomputing levels of processing of data that spends more time in memory than in transit. You may build layers of functionality on top of these, but they were never designed from the ground up for the raw RAS (reliability, availability and serviceability) power that only IBM Z consistently demonstrates.

And the languages that are generally run on them likewise are designed more for utility or scientific or graphic processing than vast amounts of character and decimal data. So, they’ll do it, but they won’t do it with the volume and reliability that we have come to take for granted for the applications that run the world economy.

Of course, and this is the second category, you could emulate the IBM Z platform and run a suitable language (maybe even COBOL) on top of, oh, I don’t know… maybe Intel. You could even name the emulator after a Greek mythological character to make it feel more special. And you know what? You’ve got yourself a great sandbox for playing with some of the concepts and at-rest behaviors of the real thing. But don’t mistake a garden hose for a firehose. Try to push a million transactions per second through an emulated environment, and something or someone will have a meltdown.

That’s two eyes. Now I’m going to get you between them with the choice of language on the only worthy platform, as the third category.

So, you’ve settled on big iron for world class workloads, but you wanna try something other than COBOL. Your options can be divided at least two ways: compiled vs interpreted/tokenized, and type of syntax and semantics. Let’s start with the easy one:

Don’t bring an interpreted/tokenized language to a workload that pushes the outer limits of the hardware’s capacity.

OK, I know: gotta do a bit of Java here, certainly some REXX there, maybe even a CLIST or two. That’s fine – just don’t do the high-volume stuff in them or you’ll spend all your time parsing the language and deciding what to do next instead of just doing it. zIIPs or not, you’ll soon find yourself using a far bigger mainframe than your CIO would approve of if you insist on doing your highest-volume stuff in an interpreted language. Got it? Good, let’s talk about the other category.

Horses for Courses

Syntax and semantics: how the language is structured and functions. What are the strengths, what are the pressure points when the going gets tough, and what can make or break it?

I see four categories of language syntax on the mainframe: Assembly language-like, COBOL-like, PL/I-like and free form. And I’m going to discard this last group first.

APL is nice if you’re a genius. But don’t give a multi-line APL program to a junior programmer and ask them to make a change to it. Likewise any other barely-structured language that does cool things. If a junior programmer can’t inherit it and maintain it without negatively impacting its functionality, look for another language.

The same is true, for different reasons, of scripting-like languages that essentially do data munging and reporting. Great for processing the output of production programs and turning it into status reports for management. Maybe even good for churning out checks and bills at a limited volume. But once you start digging into complex processing of vast amounts of data, something’s gonna give. So, stick to reports and SMF, but be careful about the stuff that is running upwards of millions of transactions, either online or batch.

On the other end of the spectrum are the assembly-type languages. You can do wonderful things with them, if you can afford to keep the geniuses that did so, or hire and train their replacements. IBM and the ISVs have lots of these geniuses writing the systems that help you get production done. But these systems are not your classical applications: they’re systems software and utilities, and they require far more advanced technologists to write and maintain them.

Closer to the middle are the PL/I-like languages, among which I include PL/S, PL/X and C. Great for slightly less low-level systems and utility work. Even fun for writing your own programs to process complicated data like SMF records if you can’t afford SAS. But they’re too generic compared to the ultimate language built for business processing applications.

Like COBOL

Yeah, we’re back there: COBOL. Now, there are a number of COBOL-like languages on the mainframe, and the more they’re like COBOL (compiled, with all the same strengths), the closer they are to, well, being COBOL. Which COBOL is. And it’s not standing still: among many other leading-edge innovations, today’s IBM mainframe COBOL now supports Java interoperability and native JSON and XML processing.
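For example, recent levels of Enterprise COBOL include a JSON GENERATE statement. Here is a minimal sketch (record layout and names are invented for illustration) of turning a COBOL record directly into a JSON document:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. JSONDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> A hypothetical customer record to be rendered as JSON.
       01  CUSTOMER-DETAIL.
           05  CUST-ID       PIC 9(6)     VALUE 123456.
           05  CUST-NAME     PIC X(20)    VALUE 'ACME WIDGETS'.
           05  CUST-BALANCE  PIC S9(7)V99 COMP-3 VALUE 1234.56.
       01  JSON-TEXT         PIC X(500).
       01  JSON-LENGTH       PIC 9(4) BINARY.
       PROCEDURE DIVISION.
      *>   Convert the record to JSON text and show the result.
           JSON GENERATE JSON-TEXT FROM CUSTOMER-DETAIL
               COUNT IN JSON-LENGTH
               ON EXCEPTION
                   DISPLAY 'JSON GENERATE FAILED'
               NOT ON EXCEPTION
                   DISPLAY JSON-TEXT (1:JSON-LENGTH)
           END-JSON
           GOBACK.

A matching JSON PARSE statement goes the other way, and XML GENERATE and XML PARSE do the equivalent for XML.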

Think of it: 250 billion lines, pumping through 30 billion business transactions worldwide every day, amounting to the large majority of the world’s most critical business processing… on just a few thousand IBM Z machines. And doing it with the following defining traits:

  • Efficient, reliable, decimal-focused math.
  • The ability to handle and process vast amounts of character data that can take many different forms, as easily as referring to the right variable names.
  • A syntax intended to be so readable that, while you should always document your code, a typical COBOL program will practically tell you what it does if you read it aloud (see the short sketch after this list).
  • Conceptual simplicity and maintainability, making it easy to establish a functioning legacy, move on to bigger and better things, and not continually revisit and rewrite your programs for every new language and architecture that comes along.
  • And, of course, compiled for efficiency, from a language structured to be optimized for exactly the same kind of data that IBM Z was built to process.
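Here is a short, hypothetical sketch of that readability (all names and values are invented): read it aloud and it largely explains itself.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. READABLE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CUSTOMER-BALANCE       PIC S9(7)V99 COMP-3 VALUE 4900.00.
       01  CUSTOMER-CREDIT-LIMIT  PIC S9(7)V99 COMP-3 VALUE 5000.00.
       01  ORDER-TOTAL            PIC S9(5)V99 COMP-3 VALUE 250.00.
       PROCEDURE DIVISION.
      *>   Hold the order if it would push the customer over limit.
           ADD ORDER-TOTAL TO CUSTOMER-BALANCE
           IF CUSTOMER-BALANCE IS GREATER THAN CUSTOMER-CREDIT-LIMIT
               SUBTRACT ORDER-TOTAL FROM CUSTOMER-BALANCE
               DISPLAY 'ORDER HELD: CREDIT LIMIT WOULD BE EXCEEDED'
           ELSE
               DISPLAY 'ORDER ACCEPTED'
           END-IF
           GOBACK.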

OK, fine. But we’re talking programs that were written decades ago. How do you make them run better, and how do you make sure that programs written today are running as fast and well as they possibly can at the highest volume of data?

Well, it turns out that, while IBM may not have specifically designed System/360 to be a direct imitation of COBOL, it has been optimizing the platform for COBOL ever since, including some recent innovations that speed up decimal math by an order of magnitude. Add in the latest compiler technology, built on lessons learned from every other programming language and fed back into turbo-charging COBOL, and you have the extraordinary circumstance that you can simply recompile COBOL applications you’ve been using for decades and improve their performance by – you guessed it – an order of magnitude.
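What that recompile looks like in practice is often just a matter of compiler options rather than source changes. As a hedged illustration (the option levels shown are assumptions that depend on your compiler release and on the oldest machine the code must run on), a CBL/PROCESS statement at the top of the source member might say:

       CBL ARCH(12),OPTIMIZE(2)
      *> ARCH names the oldest hardware level the module must run on,
      *> so the compiler can exploit that generation's instructions;
      *> OPTIMIZE(2) asks for the most aggressive optimization.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LEGACYPG.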

But no need to gloat about that superiority: it’s about business value, and there are other ways to move even further ahead of all other languages, beginning with other options for the compiler. The newest IBM COBOL compiler can generate messages that tell you all kinds of ways to improve your program, from identifying code that never executes and variables that are never used to “secrets” about how your program’s behavior might differ from what you intended.

Memory: Think about it

Of course, once you pick a direction and stick with it, you begin to find even more optimizations that weren’t obvious until you were committed. By taking advantage of the mainframe’s newly vast memory, COBOL can trade capacity for throughput, bringing one of the best strengths of other platforms to bear with even greater effect. It follows that bringing as much data into memory as possible before processing it – for example, by adding third-party high-performance in-memory technology that lets the mainframe handle data in place rather than constantly waiting on disk I/O and its overhead – can deliver a startling improvement in processing.
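Here is a sketch of that idea in plain COBOL, with invented file and field names: load a (pre-sorted) reference file into a working-storage table once at start-up, then satisfy every subsequent lookup from memory with a binary SEARCH ALL instead of another disk read.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. INMEMDEM.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT RATE-FILE ASSIGN TO RATEFILE
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  RATE-FILE.
       01  RATE-REC.
           05  RR-PRODUCT-CODE     PIC X(8).
           05  RR-UNIT-PRICE       PIC S9(7)V99 COMP-3.
       WORKING-STORAGE SECTION.
       01  WS-EOF-FLAG             PIC X VALUE 'N'.
       01  WS-RATE-COUNT           PIC 9(5) BINARY VALUE ZERO.
       01  WS-RATE-TABLE.
           05  RATE-ENTRY OCCURS 1 TO 10000 TIMES
                   DEPENDING ON WS-RATE-COUNT
                   ASCENDING KEY IS RT-PRODUCT-CODE
                   INDEXED BY RT-IDX.
               10  RT-PRODUCT-CODE PIC X(8).
               10  RT-UNIT-PRICE   PIC S9(7)V99 COMP-3.
       01  WS-WANTED-CODE          PIC X(8) VALUE 'WIDGET01'.
       01  WS-UNIT-PRICE           PIC S9(7)V99 COMP-3 VALUE ZERO.
       PROCEDURE DIVISION.
      *>   Load the whole (already sorted) rate file into memory once.
           OPEN INPUT RATE-FILE
           PERFORM UNTIL WS-EOF-FLAG = 'Y'
               READ RATE-FILE
                   AT END MOVE 'Y' TO WS-EOF-FLAG
                   NOT AT END
                       ADD 1 TO WS-RATE-COUNT
                       MOVE RATE-REC TO RATE-ENTRY (WS-RATE-COUNT)
               END-READ
           END-PERFORM
           CLOSE RATE-FILE
      *>   Every lookup after this point is a binary search in memory
      *>   rather than another disk I/O.
           SEARCH ALL RATE-ENTRY
               AT END MOVE ZERO TO WS-UNIT-PRICE
               WHEN RT-PRODUCT-CODE (RT-IDX) = WS-WANTED-CODE
                   MOVE RT-UNIT-PRICE (RT-IDX) TO WS-UNIT-PRICE
           END-SEARCH
           DISPLAY 'PRICE FOR ' WS-WANTED-CODE ' = ' WS-UNIT-PRICE
           GOBACK.

In a real high-volume job the table would be loaded once and the SEARCH ALL would sit inside the per-transaction loop.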

COBOL and Kicks

Now, if you’re still hemming and hawing about building your next major application in COBOL on Z, here’s one more kicker that should get you moving: it’s actually cheaper than anything else. You don’t have to buy a bunch of new servers – you just add a little more capacity to your mainframe, at an incremental cost that is insignificant compared to any other platform. And you get to work with COBOL programmers, who understand business, corporate culture and a career attitude to their jobs better than pretty much any other technologists. And more of them are being made all the time: someone’s maintaining those 250 billion lines of code, and it’s pretty straightforward stuff for newcomers to inherit. So, COBOL for the win: if you want enterprise-scale, world-class applications that will still be running in the decades to come, rapidly, constantly and reliably processing vast quantities of business data, there are 250 billion reasons why COBOL on Z will be there, and 30 billion reasons every day why it can do the job better than anything else.

Reg Harbeck is Chief Strategist at Mainframe Analytics ltd., responsible for industry influence, communications (including zTALK and other activity on IBM Systems Magazine), articles, whitepapers, presentations and education. He also consults with people and organizations looking to derive greater business benefit from their involvement with mainframe technology.

One thought on “COBOL: Still the Best for New High-Volume Applications After All These Years”
  1. These are a lot of words that don’t say a whole lot. I come out of this with what? You list a bunch of strengths of COBOL but don’t back them up. Your qualitative assessments on their own don’t prove their own validity. I can just as easily say the same thing about C and Intel. You need to give some technical details to substantiate your significant bias. Let’s go through each point.

    – Compiled: I can take this at face value because I prefer compiled languages. But this doesn’t mean much. Most languages are compiled. Also there are interpreted languages like python+numpy that have phenomenal performance handling large amounts of data.
    – Purpose built: I am not at all convinced by this argument. What are some problems with general purpose language that can only be improved by limiting scope? And how does COBOL effectively make use of limited scope?
    – Good at decimal math: COBOL’s decimal implementation just works by limiting the allowed number of digits. This is weak compared to other decimal implementations that have unlimited numbers of digits. Most languages have a decimal data type available.
    – Easy to read: This is highly subjective. COBOL may look like English sentences but there are a lot of assumptions you have to make to understand it. I’m looking at some COBOL right now and it looks pretty confusing to me. I can understand the operation each line describes, but not how it fits into the overall program. I don’t see an advantage to using English words over symbols like + and = which everyone already understands. COBOL is extremely esoteric which I see as a big weakness in readability, but even more so in how easy it is to write.
    – Optimized for mainframes: Why do I care about mainframes? C is optimized for Intel, so I can argue that makes C great. The only example you give is that it uses RAM to process data while minimizing file I/O, which doesn’t sound revolutionary or unique at all. This is a standard practice.
