020. Innovation versus Shipping: The Cairo Project
Windows Cairo is the second generation of Windows NT…the first Microsoft operating system that realizes the “Information at Your Fingertips” vision. –Windows Cairo product planning document
As technical assistant I spent most of my time from late 1992 to mid-1994 navigating our operating system strategy and progress. There were three main OS development projects going on at the time. Chicago was the code name of the successor to Windows 3.1 (shipped April 1992), rooted in the MS-DOS architecture and trying to build up from there. Windows NT was building a portable, secure, and robust operating system from scratch, aiming for the workstation and server market (version 3.5 to ship September 1994). These were both products under development unified by the Win32 API strategy announced at the Professional Developers Conference. Cairo was a new project built on the core parts of NT but innovating (and inventing) in most every possible dimension. An entire book could be written about any one of these projects, but all were happening at once. This post is about what that was like. I don’t say this lightly: even after all these years, many people I know still have emotional reactions to this period of time, to the traumatic experiences of this project, and to how it played out.
It wasn’t like Microsoft’s operating system strategy was ever simple, at least to me. Perhaps it was asking too much for a cleaner or more straightforward strategy to emerge with the move away from OS/2 and the early success of Windows 3.0, and now 3.1. Being complacent or content was not in Microsoft’s DNA. A bold vision for Windows was, however, and with that came even greater product complexity.
Microsoft had a simple external message of “Win32.” The problem was the product had not yet caught up to that message. Windows NT was just shipping its first version while the market was predicting NT would soon dominate the desktop. Microsoft was also anxious for that, almost always talking first about Windows NT after selling Windows 3.1. The “real” 32 bits, advanced networking, and client-server developer strategy were all great selling points, but Windows NT was a new code base and lacked compatibility with huge numbers of applications and devices that represented the richness and key strategic value of the 16-bit Windows (and MS-DOS) ecosystem. Beyond that, Windows NT had the capability to run on non-Intel microprocessors, which only fueled more punditry over the future of the PC. This left many believing the operating system market was still up for grabs, and buying a PC remained a complex decision.
In the meantime, Microsoft had fallen woefully behind Apple Macintosh when it came to ease of use and the day-to-day effort required to keep a PC working—not behind in sales, but that was not what counted for currency with BillG. Microsoft had yet to release a simple files and folders experience that matched Macintosh and was still mired in the vestiges of MS-DOS, such as file names restricted to “8.3” (eight characters plus three for the file type). It is a cliché even today, but Macintosh just seemed to work, and PCs always seemed to be crashing, hanging, or flaking out, or just much more difficult to use. Just as PCs became common in schools and the workplace, “the dog ate my homework” was replaced with “my PC crashed and ate my work” or something like that.
Microsoft’s core Windows project for consumers was Chicago (eventually Windows 95). Chicago would bring the compatibility and ecosystem support enjoyed by Windows 3.1 together with the new Win32 API, while at the same time addressing the ease-of-use shortcomings of Windows compared to Macintosh. Chicago had the goal of being a PC that was better than Macintosh plus bringing with it all the benefits of Windows that had cemented leadership in the market. The project was still early enough that most attention was on the just-released and bolder Windows NT, primarily because so many believed that the 16-bit heritage of Chicago was a fragile legacy code base ill-suited for the modern 32-bit world. Microsoft’s own efforts around marketing NT only emphasized this point.
For any company that would be enough of a big bet, but not for Microsoft or BillG. Chicago was just one part of an all-out assault on the operating system market, one Microsoft already dominated: Chicago for consumers, Windows NT Workstation for professionals, Windows NT Server for the back office, along with numerous early-stage efforts on both living room and handheld computing devices going on in NathanM’s advanced technology group. These all came about as a direct reflection of BillG’s scalable Windows strategy, best expressed by a slide from the Win32 Professional Developers Conference showing one Windows scaling from the smallest devices to the biggest computers, a slide that would in some form carry Microsoft’s vision for the remainder of Bill’s leadership.
Then there was Cairo. Whereas the major axis that defined everything along the scalable strategy was simply how much computer horsepower a device had, Cairo set out to redefine how people interacted with computers and how developers wrote programs. Cairo was to be a new paradigm, from user interface to data storage to networking computers together.
In an era where computers hardly worked and every developer at Microsoft was struggling to figure out how to write reliable code, ship that code, and meet a schedule, Cairo was by any measure an audacious bet, and that is probably an understatement.
Exactly where Cairo fit in and how, and even if that was possible, would occupy a huge amount of Bill’s time and thus my time. Given how I had just navigated the operating system strategy to do my little part to ship tools, I was fortunate to be well-versed in the technology and the teams. But where I found myself was in the awkward and impossible spot of having to help evaluate the practical realities of shipping for a CEO who wasn’t generally focused on those aspects of projects.
Landing on my desk early in 1993 was the first of many drafts of Cairo plans and documents. Cairo took the maturity of the NT product process—heavy on documentation and architectural planning—and amped it up. Like a well-oiled machine, the Cairo team was in short order producing reams of documents assembled into three-inch binders detailing all the initiatives of the product. Whenever I would meet with people from Cairo, they would exude confidence in planning and their processes. The confidence took on such a level that people began to refer to Cairo unofficially as the updated version 4.0 of Windows NT.
On a college recruiting trip to Cornell, I remember spending an evening at the Statler Hotel bar with one member of the NT team and one member of the Cairo team (both fellow Cornell graduates) debating schedules. Would Cairo be NT 4.0? Would NT 4 beat Chicago to market? Would Chicago be dead on arrival because of Cairo or its MS-DOS legacy? Or would “real” NT 4.0 beat Cairo to market? This was engineer bravado at its best. It was also Microsoft’s operating system roadmap at its worst.
While any observer should have rightfully taken the abundance of documentation and confidence of the team as a positive sign, the lack of working code and ever-expanding product definition seemed to set off some minor alarms, especially with the Apps person in me. While the Cairo product had the appearance of the NT project in documentation, it seemed to lack the daily rigorous builds, ongoing performance and benchmarking, and quality and compatibility testing. There was a more insidious dynamic, and one that would prove a caution to many future products across the company but operating systems in particular.
Technology was moving very fast and new products were appearing across the industry at a rapid pace. With a brand-new product under development, it was tempting to look at every new development and wonder how it might be part of what was being built. This is especially true for an operating system, which tends to lack any traditional product boundary like one might see in a word processor or spreadsheet. What is an operating system after all? Purists might say it is a kernel, but then what about the graphical components? Others would be quick to point out that networking or storage are not always considered part of an OS except at some basic level.
Cairo tended to take this as a challenge to incorporate more and more capabilities. New things that would come along would be quickly added to the list of potential features in the product. Worse, something that BillG might see conceptually related, like an application from a third party for searching across all the files on your hard disk, might become a competitive feature to Cairo. Or more commonly “Can’t Cairo just do this with a little extra work?” and then that little extra work was part of the revised product plans.
Along with this BillG reinforcing function of feature additions, there was the internal dynamic between the three major operating systems teams. Each team was navigating the external competitive landscape, the ongoing BillG input, and a desire internally to be seen as both the leading OS and the one that would ship first and ultimately “win.” The idea of being first to market turns out to be a compelling way to measure success. This was especially interesting in a world of fluid or even non-deterministic ship dates where there were few absolute dates for shipping but a plethora of relative milestones. Who had a beta first? Who had a preview before that? Which product would get sent to OEMs for review before that? When was the next PDC and what code would be distributed there?
This led to a rise in one of the more classic Microspeak expressions, as we called them, or jargon as it is called elsewhere. In our little Seattle area bubble, disconnected from most of the world and not yet connected by the internet, Microsoft developed a vocabulary that to this day dominates discussions between alumni. Cookie licking is when one group would lay claim to innovate in an area by simply pre-emptively announcing (via slides in some deck at some meeting) ownership of an initiative. Like so many expressions this one seemed rooted in something long lost, but the basic idea is that teams wanted to keep features to themselves by declaration or fiat, almost always independent of a schedule, resources, design, or any concrete steps. Cairo by its own efforts and, frankly, by Bill doing his share of pushing features to them, licked a lot of cookies. Even calling Cairo NT 4.0 out of the gate was cookie licking as a high art form. The team was hardly alone. Other parts of the OS landscape would take the grand ideas of Cairo and lay claim to much more pedestrian implementations and state they would deliver the innovation sooner and more practically, with the caveat there were future plans (slides) to deliver the rest of the vision.
I was often caught in the middle of these debates. Who was going to deliver what and when were the questions of the day for nearly everything that came up in every discussion about Microsoft’s next operating system. The larger-than-life leaders of these projects intimidated me, at least early on. I decided on a very practical approach: I just bought a lot of hardware, installed a lot of daily builds, and let the code speak. It was what JeffH had taught me about shipping and it was the easiest way to prove or disprove what was going on. Windows NT was by this point very solid and building out on the promises of the workstation and data center, with many developers running it as their primary work computers. Chicago was just starting to deliver builds and you could experience significant changes in the user interface – files, folders, long file names, and the earliest form of what would become the Start menu. Chicago followed a series of scheduled milestones M1, M2, M3, and then M4, which was the first build that made it to the outside world and was also usable on a daily basis for the incredibly brave (like me). I remember showing it to BillG when he commented on how “Chicago seems to be marching along like a British highway system, M1, M2, M3, M4”. I’m not sure why that comment stuck with me. Maybe he thought it was super funny.
Cairo was a different story.
Cairo was announced and demonstrated at the December 1993 PDC, but no code was provided. With that came internal tensions and angst that are almost impossible to describe. While there was always tension between OS/2 and Windows, the skunkworks nature of Windows and the outside forces of IBM proved ample outlets for frustration. With Cairo, everything going on internally was self-inflicted. At every level of the organization and across the product teams, there was a constant back-and-forth between Chicago, Cairo, and the next NT (NT did not lack for codenames at this point, going by the moniker Daytona, a nod to the efforts to improve speed and the affinity for fast cars among the leaders of the team). Pick any two and there was an ongoing knock-down, drag-out battle over schedules, performance, architecture, or user interface. The Apps group, third-party software developers, and the hardware ecosystem were all caught in the middle.
Chicago was a big team. NT was an even bigger team. Cairo quickly grew to be even larger. For those looking for reasons to see the potential for failure, ever-increasing team size was a good proxy. Frankly, the divergence of documentation and slides from the daily builds was an even bigger indicator. That was the factor I focused on. I often had to pull Bill back from reading about what was being developed to see what was actually in code and at what pace that was changing.
The only saving grace was the steadfast and relentless evangelism of the Windows API and the Win32 vision. That held the company together as a practical matter for the time and the next decade.
In my own small way, I had lived through a variant of this vicariously, through my lunchroom friends working on Word years earlier. After the debacle that was Windows Word 1.0 (if you can call winning a debacle), a project was started to build a new, more robust and refined, modern code base for word processing. The Pyramid project went on for a couple of years before the realization that the existing code could be made to work fine and that new code brought with it new problems. It was quietly and quickly shelved. The tension and confusion were real and ongoing.
IBM was famous for having competing projects and many in technology thought companies should build new products with multiple efforts, in some sort of coding Darwinism. Maybe it had worked before, but the human and customer costs seem out of proportion. It is one of those business school ideas that looks great on paper. I probably did not need more proof that I was living through a case study in the making.
If meetings and my TA efforts with Chicago were focused on the relatively narrow or mundane topics of performance and the number of bits in use in the kernel (whether Chicago should be 16 or 32 bits, and in which subsystems, was a major ongoing point of consternation), Cairo was expansive. Cairo, like Chicago, had a new shell (Microsoft’s favorite word for the user interface for launching programs and managing files) and a new file system, but the innovations were to be radically different. Where Chicago aimed to commercialize broadly the graphical operating system, a concept understood by most, the goal of Cairo was to commercialize two of the biggest buzzwords in computing: object-oriented and distributed networking.
Cairo aimed to advance personal computing with dramatic changes in how we thought of files—rather than single files and folders, Cairo intended for files to have the capabilities of a database. Everything on your PC was to be stored in a database to easily search, find, and show relationships between items: files, email, contacts, photos, documents and more. Advancing storage was a long arc of innovation Bill favored.
The graphical interface for manipulating these objects had elements of traditional files and folders but enabled more operations. A folder, for example, might not have anything in it until a user indicated the folder should contain “all objects from 1992.” The folder would be filled as though everything matching that description had been copied to that folder, but it did so without making copies of the files. It seemed slick at the time.
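To make the idea concrete, here is a minimal sketch of a folder defined by a rule rather than by its contents. It is only an illustration, not Cairo's actual design: the `Item` fields, the `QueryFolder` class, and the sample file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Item:
    """A stored object with a little searchable metadata (hypothetical schema)."""
    name: str
    kind: str
    year: int


@dataclass
class QueryFolder:
    """A folder described by a rule instead of a fixed list of files."""
    label: str
    matches: Callable[[Item], bool]

    def contents(self, store: List[Item]) -> Iterator[Item]:
        # Evaluate the rule against the store each time the folder is opened.
        # Nothing is copied: the folder yields the stored items themselves.
        return (item for item in store if self.matches(item))


# Hypothetical sample data standing in for everything stored on the PC.
store = [
    Item("budget.xls", "spreadsheet", 1992),
    Item("memo.doc", "document", 1993),
    Item("launch.ppt", "presentation", 1992),
]

from_1992 = QueryFolder("All objects from 1992", lambda item: item.year == 1992)
names_1992 = [item.name for item in from_1992.contents(store)]
print(names_1992)
```

Because the folder holds only the rule, anything added to the store later that matches the description shows up the next time the folder is opened, with no copying involved.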
The object-oriented nature of Cairo was not just dreamed up but paralleled several efforts across the industry (some even going as far back as Xerox PARC research work). Specifically, Apple was building a system called OpenDoc that promised to bring object-oriented files to Macintosh. It would never make it to market though. IBM had a project known as System Object Model (SOM), which aimed to bring objects to every size of IBM computer, from PCs to mainframes. It too would fail to materialize. All this object-oriented stuff was developing a pattern.
Object-oriented storage would have been impressive if it all happened on a single PC. The true magic of the promise of Cairo was that everything that took place on one PC could work across networked PCs. The notion of a network was still new, and the first web browser was just being released while we were busy building what some might call a web-like system. JimAll (leading Cairo) was a pioneer in distributed computing, inventing a system called Clouds for his PhD dissertation.
The Windows NT team was also steeped in distributed computing, having seen the Digital Equipment VMS operating system gain distributed capabilities a few years earlier. Similarly, the distributed capabilities of Sun computers running Unix, and of NeXT from Steve Jobs, were gaining popularity on Wall Street and in academia.
The biggest impact Cairo had on Microsoft’s technology direction was in the adoption of a technology called Component Object Model (COM). COM started in part of my original team, Apps Tools, as a way for productivity applications to talk to each other, with the earliest work invented by the PowerPoint team to make it easy for PowerPoint to include charts from Excel or pictures from other apps. The Cairo architecture used COM for every aspect of the system; it was object-oriented at every layer. In fairness, I am intentionally glossing over a good deal of complex technology history about COM and the technologies included under this umbrella, including Object Linking and Embedding (OLE), Automation, and DocFiles.
This seemed like a good bet, but then as the system started to mature it became clear that being so object-oriented had downsides when it came to performance and even managing all the code in the system. As it turned out, those building operating systems were equally susceptible to oopaholism.
At one point, things became so fragile that JimAll asked for a meeting with BillG to discuss the future use of COM, questioning even moving forward to perhaps reset and find a more robust approach.
I quickly pulled together all the background and made a list of pros and cons. My own history on C++ came in handy as we had gone through our own education and reform as oopaholics. I asked my old manager, ScottRa, to attend this small meeting to detail his experience with COM and objects. I was definitely on the side of abandoning COM, having seen the cost of being excessively object-oriented.
The meeting took place during the Software Development Conference when we were launching Visual C++. I booked a flight for a day trip, something not usually done, and arrived at the meeting just in time (Bill questioned my judgement of a one-day turnaround). The meeting was the first, and one of the few, times I saw a clear choice and decision being made. I mean this in a good way because most often meetings that claim to decide things really don’t. Bill was well aware of that and used meetings to arrive at consensus rather than force choices that would need to be revisited at some later date.
There was a good discussion for quite a long meeting, and ultimately JimAll decided to stick with COM, but there was a commitment to be sure to use it at higher levels of the system and to avoid oopaholism. In other words, the OS itself would limit using COM and objects while pushing the use of that technology to developers. This seemed practical but the “do as we say, not as we do” aspect of this proved to be problematic for a long time.
COM went on to become an architectural anchor, like on a boat, for nearly everything else Microsoft did for decades. I often think about this particular meeting—the stakes seemed so high, though had an alternate decision been made, things may not have ended up all that differently. COM was an anchor, that was true, but the value was so much higher level and in so many other parts of the system. COM had all the architecture, complexity, and proprietary elements that the company seemed to be craving at the time.
The number of companies with projects potentially competing with Cairo continued to increase, which only caused the scope of Cairo to broaden. At the same time, the value that Cairo provided to other teams made the effort worthwhile—pushing the Chicago shell to have a leading-edge design, encouraging the Database team to think more about storing different types of data, and creating the precursors to networking advances that drove the client-server computing revolution. Some would say the influence of Cairo is less reality and more putting a shine on the ultimate failure of the project. Perhaps that was the case at the time. Does it matter?
Ultimately, the human toll of Cairo was high in the sense that so many people spent so much time early in their careers working on a project that not only didn’t ship but was viewed as squandering resources at best and misguided at worst. It was a bit of a black eye for Microsoft among the press and analysts who believed Microsoft would deliver on the idea of object-oriented the way that NeXT had done but at scale. The magnitude of the project would leave many people with Cairo war stories for years to come. I wish I could say that the lessons learned would prevent another experience like this from happening, but that isn’t the case as we shall see.
The success Microsoft was having with Windows and the failure of competitors to do better in the market created an environment where even mistakes as significant as this seemed not to slow things down. Importantly, and I think this is a good thing, it did not cause Microsoft to back off audacious goals and big visions for technology. It is easy to see a world where a setback like this would force Microsoft to reconsider big bets and to aim for less lofty goals. I am glad that did not happen, as difficult as the next years would be.
The entire time I worked to amplify Bill’s efforts at steering Cairo, I felt caught in the middle between innovation and shipping, not that those were mutually exclusive. To the contrary, I naturally tilted toward shipping and believing that was innovation. The dichotomy between shipping groups and non-shipping groups was too often portrayed as a dichotomy between execution and innovation. That proved to be the root of the feeling that every group was screwed up: the innovative groups weren’t shipping enough, and the shipping groups weren’t innovating enough. Sometimes I felt that groups that were shipping were almost never given the benefit of the innovation moniker, and groups that were innovating seemed to be unburdened by shipping. Perhaps this was the Microsoft way of having competing groups.