PROBLEMS AND SOLUTIONS

Denver's Baggage Problems

The Denver International Airport's automated baggage system experienced such horrific problems that most with an opinion on the matter are thrilled to elaborate on their sense of what went wrong. It seemed that what could go wrong, did go wrong. Even the signs directing passengers to the baggage claim led to a concrete wall. Unfortunately, analyzing the true nature of the system's faults is not an easy task. Problems were so widespread that no small set of causes can by itself account for the chaotic performance in the system's early testing. Insight can be found in examining the accounts of some key people who were involved in the baggage project.

Expert Opinions

In response to criticism after the third opening delay, BAE president Gene DiFonso explained, "We simply ran out of test time" because of changes requested by the airlines, problems "working around other vendors," and failures in the airport's electrical power supply. Denver aviation director James C. DeLong maintained that baggage software glitches and electrical supply harmonics were late and unexpected obstacles to opening the Denver International Airport. According to David Hughes of Aviation Week & Space Technology, contributing factors to the baggage system's problems included concrete mechanical, electrical, and software flaws. William B. Scott of Aviation Week & Space Technology believed that the system's troubles originated in more fundamental miscalculations such as overall system complexity, underestimation of tasks, a steady stream of changes requested by both airline and Denver officials, and politics.

Politics

Political issues were a surprising obstacle in the progress of the automated baggage system design and installation. George Rolf, an urban planning professor from the University of Washington, said that publicly run projects like Denver International Airport encounter problems because "you have two distinct processes going on, one political and the other technical, and they have little to do with one another." One example of this claim is Denver's refusal to award the job of operating the baggage system to BAE, the only company that understood it well. The decision rested on political rather than practical grounds. Essentially, Denver officials suspected that BAE would not hire enough minorities and women, although BAE said they would. Richard Woodbury wrote, "In the wake of political infighting over who should get the lucrative contract, it went to an outsider, Aircraft Service International of Miami, which has had to race to fathom the system in a few months." A Denver insider declared, "It was raw greed. Everyone wanted a piece of the contract moneys. The city lost control at the outset, and the project was destined to run amuck." Further political problems ran through the entire Denver International Airport construction in the form of rhetoric and false assurances to the bond market. Some of the statements made by Denver in defense of construction delays and practices bordered on the limits of legality. Mike Boyd, an analyst who heads Aviation Systems Research Corporation in Golden, Colorado, said, "This is an airport built for politicians, not for airlines. When you look at the numbers and what they're telling bond houses, it is absolutely shocking. None of the significant numbers that the city has been putting out since the airport was started have held true."
Other political troubles included Denver's alleged falsifying of temporary certificates of occupancy (TCOs) in the midst of the baggage system crisis to appease the airlines, and a lawsuit with the Park Hill Neighborhood Association barring a partial airport opening. Consequently, in January of 1994, both the Justice Department and the Securities and Exchange Commission subpoenaed key Denver International Airport documents. In February of 1994, the U.S. attorney's office sent investigators to Denver to interview city officials and probe into alleged wrongdoings. In August of 1994, a federal grand jury began investigating the Denver International Airport for fraudulent contracting, trading, testing, and construction financing practices. In late October of 1994, a congressional auditing agency became involved in Denver International Airport's financial woes. The General Accounting Office (GAO) reported that despite Denver's delays and losses, the city's chances of avoiding default were good.

Technologically Advanced

The BAE design is technologically advanced. According to Richard de Neufville, it is not the next generation of baggage system; it is more like a jump from third to fifth or sixth generation. Unfortunately, BAE misused its technological advantage by expecting spectacular performance from the system components and not allowing them a proper margin of error. The components were expected to perform to their highest theoretical capabilities. Bruce Van Zandt, operations manager for the backbone communications network at Denver International Airport, stated, "The system pushed the envelope of technology. The components that were put into the system were run right to the limit of what they were designed for." When any component fell short of that limit, others failed as well because of the system's inherently tight coupling.

Planning

BAE, DiFonso said, was originally contracted by United in the fall of 1991 to build a baggage system specifically for United Airlines at the new Denver International Airport. The airline, he said, was concerned that after several years into the project, the city still had not contracted for a baggage system. Indeed, Denver's baggage system design was an afterthought to the construction of the airport. The BAE system was detailed well after construction of Denver International Airport had begun. When construction of the automated baggage system finally began, problems arose due to the constraints of the buildings and structures which would contain the baggage system's tracks and other components. Unfortunately, the system had to fit into the underground tunnels and available space given the challenging and unrelated Denver International Airport construction plans. Tight geometry resulted in additional construction difficulties. Telecars had to make unreasonably sharp turns on tracks shoehorned into corners at considerable inconvenience. According to Bernie Knill, an obvious solution to such poor planning techniques entails designing the baggage handling system with the building, and installing the system as the surrounding structure is being built.

Schedule

BAE officials said that the timetable for the opening of the airport was never realistic and should have taken potential problems into account. When asked about the ambitious timeline, one BAE official responded, "We knew that was not long enough and we said so. It's a job that ought to take twice as long." While the media hammered BAE for their role in the delays, BAE vice president of engineering Ralph Doughty voiced his frustration. "It's a 3-4 year job we were asked to do in 2 years," he said. Denver Aviation Director James C. DeLong offered the explanation, "We had a project that should have taken seven years and we tried to do it in four years. We just misjudged. We'll probably do it in five." As the project fell further behind, human error became a factor because the training and testing period was truncated.

Requirements Modifications and Other Changes

When BAE accepted the job, no changes to the project were anticipated, DiFonso said. However, once BAE's work had begun, Denver officials often altered plans and timetables without consulting either the airlines or BAE. Even worse, when changes were made to one part of the system, it was not clearly understood how the changes would affect the system as a whole. To reduce its construction costs, United decided to remove an entire loop from its own ambitious design for concourse B. Rather than two complete loops of track, United wanted just one. This change shaved $20 million off the system's price, but required a complicated and untimely redesign. Other changes were made, such as relocation of outside stations, addition of a mezzanine baggage platform, and Continental's request for a larger baggage link. As the project matured, it grew in size and complexity. Design changes added to the technical difficulties that consistently hampered progress. When BAE learned that the centralized system's faults ran through the rest of its tightly coupled subsystems, they chose to decentralize all of the tracking and sorting computers. Such major design changes deserved a review of alternate courses. However, because of the condensed development and testing schedule, on-the-fly changes that would typically require major design alterations were treated with minor patchwork.

Chaos

The first time that BAE ran the baggage system for performance testing, the resulting chaos was sobering. In March of 1994, the installation staff ran the BAE system for several media groups. Faults throughout the entire baggage system destroyed bags and flung suitcases out of telecars. The next day, phrases like "bags were literally chewed up," and "clothing and other personal belongings flying through the air" hit newspapers. Telecars jumped tracks and crashed into each other. Suitcases went flying like popcorn kernels, some of them breaking in half, spewing underwear in every direction. When the telecars crashed into one another they bent rails and disgorged clothing from suitcases. Others jammed or mysteriously failed to appear when summoned. Telecars crashed into each other especially frequently at intersections. Many dumped their baggage off at the wrong place. Some telecars became jammed by the very clothing they were carrying. As the telecars flung their bags off or ripped them open, the clothing clogged the telecar rails, halting traffic and crashing other telecars in back. Most telecars holding bags with unreadable bar codes were routed to holding stations. Other telecars that knew where they were going collided with telecars that couldn't remember.

On May 2, 1994, DiFonso addressed the situation and stated that the system was not malfunctioning; it just hadn't been fully tested yet. BAE officials blamed the mutilation and other problems not on a defective design, but on software glitches and mechanical failures. They found that one reason for baggage mutilation involved the airport personnel. When workers placed bags on the conveyor belts upright, the system frequently jammed or shredded the bags. When the bags were placed correctly, lying flat, the performance improved. BAE found many design culprits and made changes accordingly. Slowly, BAE improved the system's general performance.

Unfortunately, in August of 1994, the system's performance was still poor. Even during planning of the alternative tug and cart baggage system, telecars continued to collide and fall off their tracks. In late August, Glenn Rifkin of Forbes wrote, "Throughout the day, workers are seen unclogging tracks lined with bags that have been cut in half." Morale was low among the installation crew. When asked how the test bags were damaged, one worker replied in mock horror, "It's not eatin' bags. A truck ran over these outside."

Software

Ginger Evans, director of engineering for Denver International Airport, claimed that BAE didn't pay enough attention to the programming issues early enough in the design process. She believed that alleged troubles with building access or mechanical issues weren't the problem. "It's that the programming is not done," she said. She faults BAE for this inadequacy. Others contend that many problems of mechanical nature originated in the buggy software. According to Glenn Rifkin of Forbes, software sent out carts too early or too late. Robert L. Scheier of PC Week alleged that it was the system's software problems that resulted in the airport's 3,550 baggage telecars crashing into each other or becoming stranded along its 22 miles of track.

BAE president Gene DiFonso contested allegations that faulty software played the central role in the system's horrific performance, stating that "Software was not the major problem. It was an electromechanical problem. The system was stutter-stepping because the electromechanical side wasn't fully up to the software's capability." However, DiFonso admitted that program code had been a nightmare at times. He revealed that the burden of writing code for establishing and maintaining communication with the airlines' reservation systems was heavy. Particularly challenging was the duty of connecting with United's Apollo reservation computers. A definite element in the disarray of the communication software was the process of language translation, since BAE's computers had to converse in the same software language as each airline's computers. Such translation work is painstaking and often laden with bugs.

As code was written for the communication, tracking, and numerous other applications, the software grew more complicated. As a consequence, the schedule for completing the code threatened to become unmanageable. As a rule, as program code grows in complexity, it becomes increasingly hard to track or understand (see Complexity Of the System). Instances of systems code delaying the opening of large projects abound. For example, the English Channel Tunnel was delayed for about a year by problems with more than three million lines of code. Adding to the confusion, applications of such size typically borrow from a number of object code libraries and other resources. As Bjarne Stroustrup noted in 1987, "No major program is ever written in the programming language as described in its basic language manual. Libraries of all sorts are used and often determine the structure of the program." Finding the origin of a glitch can consequently be nearly impossible. A giant project held hostage by troublesome software code and insufficient testing is the technologist's worst nightmare. When troubles arose with the Denver baggage system's complicated code, BAE programmers had to customize the software to handle each individual software-related problem. This process resulted in crude code hacking. "If the baggage handling system has all of its problems solved, it will be via hack-o-rama," wrote Larry O'Brian.

System Testing

According to John Dodge, 75 percent of all information systems projects are plagued by quality problems, and only 1 percent of the projects are completed on time. Dodge cites insufficient software testing as the most frequent culprit and describes it as "one of the thorniest client/server issues." Munich officials had advised Denver to leave plenty of time and resources for testing. At the Munich airport, where a smaller automated baggage system sorts baggage, engineers spent two years testing the system. In addition, the system was up and running 24 hours a day for six months before the airport even opened. The Munich officials said that the Denver staff did not heed their advice. Although BAE had tried to leave sufficient time for testing, they were constrained by their promises of a quick pace in developing the system. Moreover, troubleshooting the maze of software was a slow process. According to DiFonso himself, "Underestimating the time required to discover problems, fix them, and retest," was the main reason for the opening delays.

Testing the system's mechanical side was unsuccessful. One source of frustration involved radio communication between testers throughout the underground tunnels, concourses, and control rooms. Engineers using radio communication in the concourses couldn't talk to their colleagues during testing because of dead spots in radio transmission around the airport. Testing proved to be difficult and more time consuming than BAE anticipated. BAE's employees worked around the clock, rarely surfacing for air from the bowels of the system, as one BAE manager remarked. In September of 1994, BAE's parent company, BTR Plc. of London, brought in the British-based PA Consulting to help debug the system. In addition, BTR executives themselves began spending time in Denver working on the BAE design. The influx of engineers, programmers, managers, and analysts improved the pace of testing. According to Glenn Rifkin, that month, the 110 BAE employees got their first week off in two years.

Timing

Before timing problems were known, United Airlines ticket agents were generating on-line printed baggage tags too quickly. The timing gap led United's Apollo computer reservation system to communicate erroneous data to BAE's sorting computers, causing the baggage telecars to go to a manual sorting station, and not their proper destinations. The solution involved slowing the ticket agents' actions through additional training.

BAE altered system speeds when officials discovered significant timing problems in matching telecar and baggage arrivals as well. Denver Post staff writer Mark Eddy believed that BAE had to regulate more closely the speed of the telecars themselves. To ensure that bags would land in telecars, not ahead or behind them, BAE engineers revised telecar and baggage merge timing, and improved clutch brake reliability. Telecar speeds were smoothed by moving motor locations, adding magnets to tracks, and adjusting magnet gaps. To further improve accuracy in telecar and baggage merging, the release of empty cars from storage areas was tailored to better match demands. BAE constructed a new model, and changed to a new telecar reservation process. Adding redundant controllers to the baggage to telecar loader reduced misalignments and timing gaps. The system's general reliability was additionally improved by exercising time-critical elements each morning to warm the system's components.

Equipment

Some critics cite BAE's equipment choices as factors of the system's failure. Regarding the distributed 486-based PCs, Carl B. Marback states that, "when you combine DOS' quirks (my DOS PC still crashes regularly) and the uncertainty of PC software (I get lots that doesn't work) with third-party things like Novell and network hardware, where is the 'managing vendor' to sort it all out?" As he predicted, the computers became overwhelmed when tracking thousands of telecars in transit. This led to the system redesign called for by both the airlines and Denver. The new design reduced the system's complexity and far reach, and successfully bailed the computers out of their terrific workload.

Early in testing, laser scanning equipment that misread bar codes became a major problem. This was clearly a product of deficient planning, since anyone who has watched the checkout clerks in a grocery store with laser scanning devices has seen that they sometimes make mistakes. Continental had first experienced such problems with the system's poorly printed baggage tags when their laser scanners rejected about 70 percent of the tags, and sent the telecars to the manual sorting station. BAE found that part of the problem involved the baggage tag printers producing poorly printed bar codes that were easily misread. When the tags were reprinted clearly, the system only rejected 5 percent of the tags. Other difficulties in lasers reading bar codes occurred when airport workers erecting walls sometimes knocked laser scanners out of aim. BAE resolved some scanning difficulties by installing redundant laser scanners. Unfortunately, in BAE's case, it was difficult to pinpoint every manifestation of laser scanning error due to the number of possibilities inherent in the system. For example, when a laser scanning error occurred, it was possible that the baggage handler had placed the bag on the conveyor belt with the bar code tag hidden, or the bag may have had tags from earlier flights in view. The tag also may have been dirty or out of the field of view or focus of the laser scanner. Therefore, the complicated problems were laboriously dealt with one by one.

The scanning problem was compounded by the telecar to computer communication process. Even when the bar codes were successfully read by the laser scanners, the bar coded information still had to be transmitted by radios on each of the telecars. This added a second opportunity for error, and decreased the reliability of the system in general. This can be expected, since the reliability of two devices that must both work is roughly the product of their individual reliabilities, which is never greater than that of either device alone. Conversely, if two devices are made to perform the same task, the built-in redundancy raises the combined reliability above that of either device. This is an important principle, since the Logplan report made it clear that there was not enough redundancy to satisfy the system's reliability needs. Soon after Logplan's report was completed, Denver decided to install the alternative tug and cart system for added redundancy.
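The reliability principle described above can be illustrated with a short sketch. It assumes that device failures are independent, which the source does not state, and the figures used (0.95 for a scanner, 0.98 for a radio link) are hypothetical values chosen only for illustration, not measurements from the Denver system. The function names are likewise illustrative.

```python
def series_reliability(*reliabilities):
    """Reliability of a chain in which every device must work.

    Assumes independent failures: the result is the product of the
    individual reliabilities, so it never exceeds the weakest device.
    """
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def redundant_reliability(*reliabilities):
    """Reliability when any one of several devices doing the same task suffices.

    Assumes independent failures: the group fails only if every device fails.
    """
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


# Hypothetical figures, for illustration only.
scanner, radio = 0.95, 0.98

chain = series_reliability(scanner, radio)            # scan, then transmit
two_scanners = redundant_reliability(scanner, scanner)  # redundant scanners

print(f"scanner then radio: {chain:.4f}")
print(f"two redundant scanners: {two_scanners:.4f}")
```

Under these assumptions the scan-then-transmit chain is less reliable than either device alone, while a pair of redundant scanners is more reliable than one, which is the reasoning behind adding redundant scanners, photocells, and controllers.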

When telecars that eluded the scanning and transmitting problems engaged in transit, other problems occurred. Some glitches in photocell quality and placement caused the tracking computers to mistakenly presume there was a telecar jam. To solve the problem, BAE reviewed the design and made sure that the motors and photo electric eyes were located where the computer thought they were. BAE added redundant photocells, and enlarged their diameter so they could 'see' more. Some photocells that couldn't detect cars going by were found coated with dirt or knocked out of alignment. The painting crews that had covered up some electronic eyes with paint went back and scraped them clean. Bumpers on the telecars had also been interfering with the photocells' tracking process, so BAE workers adjusted each bumper on all 3,550 cars.

Faulty latches were blamed for causing telecars to dump luggage on their tracks or become jammed against the side of a tunnel. When each of the car's latches was modified, the obstructions subsided. Another problem involved airflow flipping light or empty suitcases out of their telecars. To reduce the likelihood of this occurrence and to better understand the system's aerodynamics, BAE engineers pressure mapped the telecars in a full-size wind tunnel.

Some parts of the system required that telecars negotiate sharp turns and other abrupt conditions. Where high-stress areas of track frequently broke or bent, BAE added reinforcements for increased strength.

Power Generation

For some time, the system was experiencing unreliable power generation and electrical surges that no engineer could trace. "Even the electrical engineers don't understand completely what's going on," said Jay Button, BAE sales manager. The power surges tripped breakers on some of the system's 10,000 motors. Sometimes, the airport's erratic power generation shut down the system totally.

During detailed electrical tests, electrical power feed systems fluctuated, causing the surges that disrupted the system's operation. To solve the problem, BAE installed a series of special industrial power filters to smooth the flow of power.

Line-Balancing

To understand how a typical line-balancing problem can cause delays and inconsistent performance, think of the times that you missed a bus because it was so crowded with people who had boarded at earlier stops that you were left behind waiting. Line-balancing problems are common and well known to many systems designers. Furthermore, just as with every other design issue, line-balancing solutions obey the law of complexity. The difficulty in solving such problems increases exponentially with the number of lines or queues in the system. The BAE system has hundreds of such queues. To gain perspective on the difficulty of understanding line-balancing, note the example of Atlanta airport's interior transit system. In this case, the problem involved the people mover between the five passenger buildings and was the subject of a doctoral dissertation at MIT (Daskin, 1978). This was a two year long intensive effort on a system much less complicated than the BAE design. Ironically, the line-balancing problem is sometimes compounded by a general ignorance or disregard for its existence. BAE engineers seem to have discovered the line-balancing problem about six months after the intended airport opening date. A site manager giving a tour of the BAE system in July of 1994 explained the line-balancing problem and described it as a novel phenomenon that they had just started to work on!

BAE president Gene DiFonso revealed the system's line-balancing troubles during a tour in late February of 1995. "We had bags lined up and waiting for vehicles and empty vehicles going by with no bags," he said. "The problem was that we assumed we could release empty vehicles in some arbitrary quantity. Sometimes that number coincided with the number of bags waiting, but sometimes it didn't." Empty cars that were needed and summoned ended up instead being routed to waiting pens. Late in the testing period, the BAE staff finally curbed the system's dispatching problems. The solution came when programmers wrote new line-balancing related logic for both the OS/2 based car routing application and the PLCs that carry out the commands.
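The dispatching mismatch DiFonso describes can be sketched with a toy simulation of a single loading station. This is only an illustration under stated assumptions, not BAE's actual routing logic: the function names, the fixed release quantity of 4, the cap of 10 cars per cycle, and the demand figures are all hypothetical.

```python
def simulate(demand, release_policy, max_release=10):
    """Run one loading station for len(demand) cycles.

    demand[i] is the number of bags arriving in cycle i.
    release_policy(queue_len) returns how many empty cars to release.
    Returns (bags still waiting at the end, empty cars that passed unloaded).
    """
    queue = 0
    wasted = 0
    for arriving in demand:
        queue += arriving
        cars = min(release_policy(queue), max_release)
        loaded = min(cars, queue)
        queue -= loaded
        wasted += cars - loaded
    return queue, wasted


def fixed_policy(queue_len):
    # Release an arbitrary fixed quantity, ignoring how many bags wait.
    return 4


def matched_policy(queue_len):
    # Release exactly as many empty cars as there are bags waiting.
    return queue_len


# Hypothetical bursty arrivals: 24 bags over six cycles.
demand = [0, 9, 1, 6, 0, 8]

print("fixed release:  ", simulate(demand, fixed_policy))
print("matched release:", simulate(demand, matched_policy))
```

With the fixed policy the station releases 24 cars for 24 bags in total, yet some cars pass empty while bags are left waiting, because the release quantity only sometimes coincides with the number of bags in the queue. Matching releases to the queue length removes both effects in this simplified setting.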

Complexity

Admitting their ambition, Ralph Doughty stated, "We've done car-based systems before, but never this large." The project's size and comprehensive nature caused it to experience many problems due to complexity. This is predictable when considering complexity theory (see "Complexity Of the System"). Typically, systems with more than 10,000 function points are canceled 65 percent of the time, according to Capers Jones. In Denver, the system's terrific workloads bogged down the network of distributed computers that track luggage on the 3,550 telecars. Computers were tracking so many telecars that they mistracked at times due to strict timing limitations. United believed that the tremendous workloads warranted drastically reducing the system's complexity. To begin reducing the complexity, Denver decided to completely cancel concourse A's automation design. The tracks and machinery serving concourse C were redirected to concourse B as well. The number of destinations in the system went down by a third when only one of the three concourses remained in the design. The number of destinations decreased by an additional third when Denver decided to consider only outbound traffic on the remaining baggage loop. Denver cut the system's track capacity rate from 60 to 30 cars per minute, when United argued that the computers needed to take more time to avoid mistakes. Along with the earlier changes, cutting the rate of sorting on each track caused the overall system complexity to shrink by a full order of magnitude. Unfortunately, the concept of a fully automated, high speed airport-wide baggage system deteriorated to a less complete system with drastically reduced complexity, speed, capacity, performance, and efficiency. This new system, however, worked well enough to open the airport.

Logplan

Denver conducted a worldwide search for consultants who could figure out exactly what was wrong and how long it would take to fix. Unfortunately, this was something that neither the city nor BAE could predict. Logplan, a German consulting company, was hired for the job. Logplan had recently demonstrated its skills by performing similar troubleshooting and systems integration on the baggage system in Frankfurt. Denver and United then used Logplan's final report in deciding how to make the pieces of their system work.