Once upon a time, there was a website that recorded some of the most spectacular failures among mainframe migration projects. It was of interest to mainframe people everywhere, and I’m sure it saw a lot of traffic. A Google search for ReBoot Hill still finds the site, but it no longer appears to be operational; the page seems corrupted. That is quite a shame, since the stories were really amazing.
Well, as a consultant for a mainframe software company, I saw fit to save some of these stories a few years ago, so I now present them to you for your information and enjoyment. Based on some of the figures cited in several of these stories, it’s obvious that some of them are quite old, but the recurring themes are both informative and shocking. Enjoy!
$100 Million Plus
Amount invested: $100M Plus
Organization: Motor Vehicle Licensing Agency
Original Plans: To replace a mainframe system with a client/server system (Sequent/Unix) to “lower the costs of computing.”
Unanticipated problems: The application could not support more than 50 users with an ‘acceptable’ response time of 10 to 15 seconds. With 100 users online this frequently increased to minutes. The supplier, scrambling for some type of solution, actually put data integrity at risk by convincing the customer to remove record-locking protection on updates. Their rationale was that “it’s a one in a million chance of two users accessing the same record simultaneously.” After this change the response time improved to 10 to 15 seconds for up to 100 users, but still increased rapidly beyond that. The problem is that they issue around 3 million licenses per year, so the “one in a million” chance actually occurs about three times per year, and each time duplicate licenses are issued to different vehicles.
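The supplier's “one in a million” reasoning can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that each license issued carries an independent one-in-a-million chance of a collision, which is a simplification of the figure quoted in the story, and the function names are made up for this example.

```python
import math

def expected_collisions(events_per_year: int, collision_probability: float) -> float:
    """Expected number of collisions per year, assuming independent events."""
    return events_per_year * collision_probability

def chance_of_at_least_one(events_per_year: int, collision_probability: float) -> float:
    """Probability of one or more collisions in a year (Poisson approximation)."""
    return 1.0 - math.exp(-expected_collisions(events_per_year, collision_probability))

LICENSES_PER_YEAR = 3_000_000   # figure quoted in the story
COLLISION_PROBABILITY = 1e-6    # the supplier's "one in a million", assumed to apply per license

# About three collisions are expected per year ...
print(expected_collisions(LICENSES_PER_YEAR, COLLISION_PROBABILITY))
# ... and at least one collision in any given year is close to certain.
print(round(chance_of_at_least_one(LICENSES_PER_YEAR, COLLISION_PROBABILITY), 2))
```

Under these assumptions a rare event per transaction becomes a routine event per year once the transaction volume is in the millions.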
Project Status: The mainframe was finally replaced, some three years later than expected, but the user still uses an outsourcing firm to run and maintain their legacy mainframe applications, at a cost close to the original total mainframe cost! The eventual budget for the replacement system was three times the old mainframe budget and six times the anticipated cost. In addition there has been a 50% increase in clerical staff to handle the same volume of transactions.
$25 Million
Amount invested: $25 Million
Organization: Insurance company
Original Plans: To implement an image-based system to reduce paperwork, speed up the issuing of policies, increase availability (there were frequent line problems between remote locations and headquarters, reducing availability to around 98%) and allow for a doubling of business with no staff increase. A response time of one to two seconds was crucial to this last requirement. The distributed Unix-based solutions proposed were all significantly cheaper than the mainframe-based ones, even though the insurer already had a large mainframe-based system. The IT Department was in favor of a mainframe-based solution as it had many reservations about the likely performance of the Unix solutions, and none of the proposed Unix solutions could be demonstrated at anything like the volumes required. However, the promised improvement in availability swung the decision to the Unix platform.
Unanticipated problems: The initial Unix servers were too small to handle the volumes, and the overall capacity was quadrupled before the system went live. Performance was adequate when few users were working simultaneously, but as soon as the workload increased, response times rose to anything from 30 seconds to one minute (on occasion, worse than that). This erratic response time caused most of the problems, as frustrated users resorted to hitting any key on the keyboard and, on occasion, even kicking the system ‘to make it work’. Of course, this just increased real downtime, and overall the availability of the system rarely reaches 90% over a one-month period. As a result, there has been an increase in the time required to issue policies and no business increase can be supported without a large increase in staff.
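To put the availability figures in perspective, the percentages can be converted into hours of downtime. This is a rough sketch that assumes round-the-clock operation and a 30-day month, neither of which is stated in the story; `downtime_hours` is a name invented for this example.

```python
def downtime_hours(availability: float, hours_in_period: float = 30 * 24) -> float:
    """Hours of downtime implied by an availability fraction over a period."""
    return (1.0 - availability) * hours_in_period

# Availability figures quoted in the story (the new system "rarely reaches" 90%).
for label, availability in [("old system, around 98%", 0.98),
                            ("new system, at best 90%", 0.90)]:
    print(f"{label}: {downtime_hours(availability):.1f} hours down per 30-day month")
```

On these assumptions the old system was unavailable for roughly 14 hours a month and the new one for roughly 72 hours or more, about five times as much.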
Project Status: A review is being conducted into moving the application to the mainframe although there are some political hurdles to this approach. The only alternative is to quadruple the investment in Unix servers to provide more capacity, but this would then be far more expensive than the mainframe solution.
$100 Million Plus
Amount invested: $100M plus
Organization: Automobile Insurance Company
Original Plans: To sell and service automobile insurance over the telephone using a low cost Unix application package and a single Unix server. The application was to be expandable as the business grew, eventually supporting up to 1,000 concurrent users at peak times.
Unanticipated problems: The new application could not support more than a dozen users with an ‘acceptable’ response time of 2 to 3 seconds. The system also crashed almost hourly. Consequently, the application was rewritten and customized for that Insurer. The new application still required multiple Pyramid Nile servers to support a 40GB database and a maximum of 100 users per application with 400 in total. Response times were variable, up to 30 seconds when heavily loaded, and the system still crashed on a weekly basis when heavily loaded.
Project Status: To handle the business growth this Insurer tested every available Unix platform but could not find one to support 750 concurrent users. They installed a mainframe-class system running a proprietary version of Unix to meet this level of use, but still experienced 30-second response times and frequent outages. Their cost per policy has been measured at over six times that of a rival company that was using a S/390 mainframe system.
$500 Million
Amount invested: $500M
Organization: Large utility
Original Plans: To replace all mainframe systems with a client/server based Unix/NT solution to improve customer service and enable more rapid applications development to exploit market opportunities in this rapidly changing market. It was also considered that the year 2000 conversion of the legacy applications would be simply ‘money wasted.’ To ensure success the utility entered into an outsourcing contract to handle everything from the operation of the mainframes to the implementation of the new systems.
Unanticipated problems: Shortly after the project began the outsourcer insisted on a renegotiation of the contract terms, which put more of an onus on the utility to support the new systems locally, the suggestion being that so little support would be needed that local personnel could undertake the required tasks as part of their normal daily routine. However, over time it became apparent that the support requirements of the new system were more than triple those of the mainframe, with the mixed Unix/NT environment causing many problems. At around the same time the mainframe capacity needed to be enhanced as conversion was well behind schedule, but again the outsourcer insisted that this required yet another renegotiation of the contract. These two changes combined to give a new total cost of well over twice the original mainframe costs, compared to the 25% reduction that had been expected. The final problem was that, rather than enabling more rapid implementation of new applications, the poor management software and limited development tools resulted in the average new application taking twice as long as on the mainframe.
Project Status: There is no confidence that the new systems will all be operational before the year 2000, so many of the legacy applications will need converting anyway and the mainframe will require further investment. Consequently, many of the planned client/server Unix applications have been abandoned and the legacy applications will be converted for the year 2000 and also enhanced to become client/server systems interfacing with NT.
$200 Million Plus
Amount invested: $200M plus
Organization: Subsidiary of major oil company
Original Plans: To replace their central mainframe system with a fully distributed Unix client/server system running SAP applications.
Unanticipated problems: The application implementation period was well over three times the initial estimate and the package costs increased four-fold due to the more powerful servers required even at the pilot stage. The support staff also had to be increased substantially over the initial estimates, with expertise required at each location. The SAP skills required were in short supply, forcing the user to pay much larger salaries. This had a negative effect on the existing staff, who left almost ‘en masse’ before the mainframe systems were decommissioned. Despite the high salaries, most of the SAP-skilled personnel moved on after a very short period, causing more disruption and delays. As the application nears completion it is obvious that no Unix server yet built can handle the volumes of the larger sites, and multiple servers just increase response times as the database is distributed still further.
Project Status: The user is re-centralizing and ‘porting’ the database part of the application back to the mainframe and will ultimately move it back to DB2 as the Unix database being used requires over three times the DASD capacity, and as much as ten times the processor capacity to do the same job. The current cost estimate for the mainframe is under 50% of the Unix cost for a system that didn’t even work.
$100 Million
Amount invested: $100M
Organization: Major insurance company
Original Plans: To replace a central mainframe system with an HP distributed client/server Unix solution. The objectives were to improve customer service, to lower IT costs and most importantly to handle the increased business volumes expected in this era of takeovers/mergers and booming financial services business.
Unanticipated problems: The advice from the software and hardware suppliers was to implement a very small pilot application in some smaller locations to get users ‘comfortable’ with the new systems. This pilot had only five or six users maximum and was implemented with little problem. However, as this application was ‘enhanced’ and moved out to larger locations a major problem arose. Scalability was not linear. For ten users almost four times the capacity was required compared to the pilot system and for 50 users the required capacity had increased by a factor of twenty-five! At the 100 user level, which was required in some locations, there was no available system that could meet the capacity requirements. It was apparent that the planned business growth could never be handled with this solution and also they had learned that their business was really centralized anyway. The annual cost estimate for this Unix-based application had also grown from around $1B to at least $5B compared to the mainframe budget of around $1.5B.
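The non-linear scaling described in this case can be made concrete with a small calculation. The sketch below assumes a pilot of exactly five users (the story says “five or six”) and treats “almost four times” as 4; the function names are invented for illustration. With linear scaling, the capacity needed per user would stay constant and the exponent would be 1.

```python
import math

PILOT_USERS = 5  # assumption: the story says "five or six users maximum"

# (users, capacity required relative to the pilot system), as reported in the story
REPORTED = [(10, 4.0), (50, 25.0)]

def capacity_per_user_ratio(users: int, relative_capacity: float,
                            pilot_users: int = PILOT_USERS) -> float:
    """How much more capacity each user needs compared with a pilot user."""
    return relative_capacity / (users / pilot_users)

def scaling_exponent(users: int, relative_capacity: float,
                     pilot_users: int = PILOT_USERS) -> float:
    """Exponent k such that capacity grows like (users / pilot_users) ** k."""
    return math.log(relative_capacity) / math.log(users / pilot_users)

for users, capacity in REPORTED:
    print(users,
          capacity_per_user_ratio(users, capacity),
          round(scaling_exponent(users, capacity), 2))
```

Both data points give an exponent well above 1, which is the sense in which a five-user pilot told the insurer nothing useful about a 100-user site.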
Project Status: The project was scrapped with no applications retained from the Unix environment, and the mainframe capacity was increased to handle the increased workloads.
$250 Million and counting
Amount invested: $250m and counting
Organization: Foreign subsidiary of major international bank
Original Plans: This subsidiary was recently purchased by the major bank to provide a presence in a new geographic area. The computer systems were in the process of being changed from a central mainframe system to a distributed Unix Client/server solution to improve customer service and reduce costs. As this project was already underway, the new parent is leaving the plans unchanged at this time.
Unanticipated problems: The Unix servers originally specified (from a variety of vendors to keep them all competitive) were all under-configured by a factor of four to five. In most cases, this means that no system large enough is available today. The project team also found that as larger servers became available, later software releases tended to use up the additional capacity without increasing the actual business throughput or transactions. However, probably the single major concern is that the bank is experiencing data loss, especially when something fails. The result is that recovery from a failure can take many days before full integrity can be assured.
Project Status: The corporate head of IT has been charged with making this system work, but currently he does not see how this can be achieved—at any price. His only consolation is that he can very quickly implement their mainframe-based worldwide systems if a real disaster looms.
$100 Million
Amount invested: $100M
Organization: Transport company – Railroad
Original Plans: With the breaking up of what was a national railway, one of the divisions decided to implement all new systems (purchased from another railroad) on a Unix platform to replace the mainframe applications used before the split. The annual costs proposed by an independent software supplier who project-managed the implementation were around half of the estimated costs of continuing to use mainframe-based systems. The implementation cost was put at twice the ongoing annual mainframe costs, giving a four-year break-even point.
Unanticipated problems: The applications from the other railroad required extensive modification to meet the needs of this railroad and also had to be ‘tailored’ extensively to communicate with the other railroad companies created after the breakup. This increased the implementation time by two years and doubled the planned implementation cost. Running costs have also doubled as more capacity and support are needed than originally planned. In the interim, the user had to continue running and maintaining the old mainframe systems, which they did through a Facilities Management company that charged 150% of the expected costs due to the short-term nature of the contract. The independent software company went bankrupt, leaving the user to pay all of the increased costs.
Project Status: Despite the problems the user is pushing ahead and hopes that there will be no more ‘surprises.’ They are aware that over an eight-year period rather than reducing costs by 25% in total they will have increased them by around 140%, but there is still a belief that “mainframes are Dinosaurs” and a further belief (or perhaps a hope) that ultimately Unix costs will become lower.
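The planning figures in this case are internally consistent, which a short calculation shows. The sketch below normalizes costs so that one unit equals one year of mainframe running cost; that normalization and all the names are assumptions made for illustration. It reproduces only the planned numbers. The story does not give enough detail to reconstruct the 140% overrun.

```python
def break_even_years(implementation_cost: float, annual_saving: float) -> float:
    """Years needed for annual savings to repay the up-front implementation cost."""
    return implementation_cost / annual_saving

def total_cost(implementation_cost: float, annual_running_cost: float, years: int) -> float:
    """Up-front cost plus running costs over the given number of years."""
    return implementation_cost + annual_running_cost * years

MAINFRAME_ANNUAL = 1.0                           # normalized unit (assumption)
UNIX_ANNUAL_PLANNED = 0.5 * MAINFRAME_ANNUAL     # "around half" of the mainframe cost
IMPLEMENTATION_PLANNED = 2.0 * MAINFRAME_ANNUAL  # "twice the ongoing annual mainframe costs"
YEARS = 8                                        # the eight-year period mentioned in the story

planned_break_even = break_even_years(IMPLEMENTATION_PLANNED,
                                      MAINFRAME_ANNUAL - UNIX_ANNUAL_PLANNED)
planned_total = total_cost(IMPLEMENTATION_PLANNED, UNIX_ANNUAL_PLANNED, YEARS)
mainframe_total = total_cost(0.0, MAINFRAME_ANNUAL, YEARS)
planned_change = (planned_total - mainframe_total) / mainframe_total

print(f"break-even after {planned_break_even:.0f} years")
print(f"planned eight-year change in cost: {planned_change:.0%}")
```

The plan works out to a four-year break-even and a 25% saving over eight years, matching the figures quoted in the story, so the arithmetic was sound and the inputs were what failed.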
$100 Million over several years
Amount invested: $100 million over several years
Organization: Financial organization
Original Plans: To develop a client/server workflow and imaging system to front-end the mainframe. Create a Sun/Unix environment with >1000 workstations supported by >100 Sun servers running a popular RDBMS for Unix. The investment decision was based on vendor promises of low cost, high availability, and top performance.
Unanticipated problems: The new system and Unix application could not support 100 users, with a response time of 120 seconds being quite common with only 50 users online. With the required 100 users online this frequently increased to 5 minutes and the system became unstable with crashes and loss of data. The user replaced the initial HP Unix system with one from Sequent, but there was only a minimal improvement. The number of support staff had doubled during the 3 years this project was being attempted and the lack of systems management software had caused a number of catastrophic failures with some data lost and never recovered. The true cost of this has never been discovered, but is certainly in the order of $100m.
Project Status: The user decided to scale back the application and move 80% of its function back to the mainframe, effectively leaving the application to do only some image retrieval. Plans for scaling back the application also called for the replacement of workstations and GUIs with a thin-client device.
$100 Million
Amount invested: $100M
Organization: Retail chain
Original Plans: To replace central mainframe with distributed Unix/NT client/server solution to lower costs and improve profitability through improved stock management. The core of the application was to be a fully distributed database available to all locations online so that customer service could be improved. Overnight queries on sales could be made by marketing utilizing the new database.
Unanticipated problems: Staff querying stock levels at other locations to meet immediate sales requirements of on-site customers or telephone queries effectively ‘locked out’ local staff from accessing the database to handle on-site customers. The overhead incurred by locking records, making ‘phantom’ allocations and then backing them out of the system was incredible, and response times often deteriorated to in excess of one minute. A doubling of the original capacity made no impact whatsoever and the opinion of experts consulted was ‘no amount of money could resolve the problem with the current level of distributed database software.’ Furthermore, there seemed little real likelihood of future enhancements fixing the problem. The actual sales process increased in time and customer satisfaction fell as customers could not get the sales attention they required. In addition, at each location expertise was required to handle the myriad of problems that occurred, which increased costs substantially. The marketing department found that queries were taking hours to process, and quite often one query would lock out another one.
Project Status: The system has been modified to use a central database. Now each location is utilizing a server with their own stock records, which are updated at the central site as stock is received or sold. Queries on availability from other locations now receive a 1 to 2 second response and the time taken for the local sales process has been dramatically improved. All of the Unix servers have been removed from the various locations and the NT servers have been downgraded in some cases. However, at least one of the Unix servers was utilized in the new environment as an extract from the DB2 database is made available at the end of each day for the marketing department to use for their queries. A future move to network computers is expected to improve things further by providing more hardware throughout the stores for no increase in cost.
$300 Million over several years
Amount invested: $300 million over several years
Organization: Large telecommunications company
Original Plans: Move all legacy applications off the mainframe (Amdahl/MVS/DB2) to UNIX/Oracle onto Sequent super minis.
Unanticipated problems:
- Discovered that after only 10% of applications had moved, they needed twice as many system programming staff for Unix as for the 90% still on MVS. It took 10 Unix system administrators to manage 20 Sequent servers.
- Sequent I/O rate of 180/sec created performance bottlenecks. The mainframe was handling I/O’s at 2,000/sec regularly, and 3,000/sec peak.
- Normal response time for applications on the UNIX/Oracle systems was 5 seconds.
- Unix systems required twice as many disks as MVS and had to mirror data because they lost disks frequently. Oracle needed twice as much disk as DB2.
- Security administration for Unix took twice as many people as for MVS.
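The figures in the list above can be turned into simple ratios. This is a back-of-the-envelope sketch: it assumes the 180 I/Os per second applies to a single Sequent system, which the story does not state, and the function names are invented for this example.

```python
SEQUENT_IO_PER_SEC = 180        # quoted Sequent rate (assumed to be per system)
MAINFRAME_IO_REGULAR = 2_000    # quoted regular mainframe rate
MAINFRAME_IO_PEAK = 3_000       # quoted peak mainframe rate

def throughput_ratio(mainframe_rate: float, unix_rate: float) -> float:
    """How many times the Unix I/O rate fits into the mainframe I/O rate."""
    return mainframe_rate / unix_rate

def servers_per_admin(servers: int, admins: int) -> float:
    """Average number of servers looked after by one administrator."""
    return servers / admins

print(round(throughput_ratio(MAINFRAME_IO_REGULAR, SEQUENT_IO_PER_SEC), 1))
print(round(throughput_ratio(MAINFRAME_IO_PEAK, SEQUENT_IO_PER_SEC), 1))
print(servers_per_admin(servers=20, admins=10))
```

On that assumption the mainframe sustained more than ten times the Sequent I/O rate in normal running, while each Unix administrator covered only two servers.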
Project Status: Project abandoned, company left majority of applications on mainframe, started investigating the possibility of using Windows NT servers to replace Unix servers.
$400 Million over 5 Years
Amount invested: $400 million over 5 years
Organization: Large financial company
Original Plans: Develop a new and highly strategic application on client/server, defining client/server as Unix-based distributed systems with GUIs at the desktops. The new application was supposed to allow users to concurrently update data on the Unix servers. Selected hardware and software to build the new Unix-based infrastructure. The intention was to merely supplement the predominately mainframe environment with Unix.
Unanticipated problems:
- Were expecting that data servers would handle 100 users running the newly developed application doing concurrent updates. However, end-user response time became unacceptable with only 10 users active per server.
- At this point, since millions of dollars had been spent on the project already, they replaced the systems with servers from another vendor as an attempt to salvage their already sizeable investment. They continued to invest many millions of dollars only to learn that newer servers also failed to deliver acceptable performance with 10 users active on a server doing concurrent updates to data.
- Synchronizing both the centralized mainframes and the distributed Unix environments added an incremental layer of costs that were not anticipated, and were many times more than expected.
- Reported that Unix systems management, software distribution, and root-cause problem analysis were “nightmares.”
Project Status: Abandoned. Basically, they looked for creative ways to use the Unix-based hardware that was otherwise collecting dust.
$500 Million over several years
Amount invested: $500 million over several years
Original Plans: To create administrative systems for >50,000 users. System to be completely Unix-based, no other operating systems, large or small, anywhere. Users to access system from X-Windows terminals, supported by 6,000 Unix-based servers. Servers sized to run 16 X-terminals each. Provide tight security controls at all levels. Build a private network.
Unanticipated problems:
- Performance much worse than could be tolerated, even sending a simple e-mail message.
- Numerous points of failure, overwhelmed by frequency and length of delays in all areas.
- System too complex to manage.
Project Status: System redesigned around the Windows NT operating system. NT, which is replacing Unix, is described as “the fastest growing commercial networking software on the market today.” Regarding the original system requirements and its completion date, the redesign eliminated many of the original deliverables in areas of applications, security, and networking. Even with the reduced requirements, the project will be delayed 8 years.