086. The Memo (Part 2)

Agility = Execution + Impact — one of several one-liners I would employ in the process of articulating how the team could improve

Jun 19, 2022

The previous section detailed the raw observations on Windows and Services culture I saw after weeks of hearing about the situation from as many people as I could. I could not just put that out there without specifics of what I thought could improve. I had to put some structure on what I learned and to offer optimism and aspirations.

Reflecting on this moment of both optimism and fear, today I look at the candor I expressed with a bit of amazement. I wrote with detail and assertiveness yet seemed to forget that I was writing about one of the most successful businesses ever created. I was writing about hundreds of billions of dollars of market capitalization. I was writing about many friends. At the same time, there was so much that needed to be improved or more specifically to be repaired. I think what really motivated me were all those 1:1s I did and hearing all the different people expressing their pain and troubles, knowing things could be better. This was not a team that was dug in ready to resist change. It was a team waiting for change. It just needed to be the right kind of change.

That reality made this much easier. I felt if I could document what was going wrong and the broad population agreed then I was on a path to addressing challenges. If I could articulate reasonable aspirational goals, then what remained was to build a product plan on that rebuilt foundation of trust in management.

I was quite worried that both the problems described and the aspirations I would document would seem cliché. With BillG in particular, over the years he had shown little patience for the broad topic of management. His world view was always that the business would be best served by taking on the most difficult technical problems and developers would be anxious to tackle such challenges. That recipe propelled Microsoft for twenty years of Windows but was failing us now. SteveB was never one for patience and while he would be receptive to these management challenges, he was far more anxious about a plan and the timeline for the next product to address the concerns that were mounting about Vista—the company hung in the balance. KevinJo had just orchestrated a massive restructuring of the global sales force before taking over most of product development. He was deeply in sync with the idea of identifying organizational problems then directly addressing those.

The memo, Observations, Aspirations, and Directions for Windows and Windows Live, proposed three main areas to address: decision-making, agile execution, and discipline excellence. Each was presented in a section with both observations and aspirations.

These points will sound like random musings from any generic book on management, both at the time I wrote them and reflecting on them today. The lesson learned, using the phrase from the previous section, is to demonstrate that these are more than clichés by citing specific examples that resonate with the employees who are being asked to operate differently and specifics on exactly how we will achieve aspirations.

Decision-Making

Across all of Microsoft, “decision-making” had been a constant and nagging issue. We discussed it after every MS Poll (the yearly survey of employee attitude and feedback), and each year I was left puzzled. It had never been an issue for our team (in the MS Poll and other feedback channels). I didn’t understand what was so difficult about decisions. We made decisions all the time in Office, so many it wasn’t even clear to me what decisions were so difficult.

Then I arrived in the Windows hallway.

There, it was an endless discussion of who “owned” a decision or who was “accountable” and, worse, people were asking me what “model” I used to make decisions. This was a reference to classic models of business function (or more aptly dysfunction) that use tools known as a responsibility assignment matrix (RAM, one such tool) for decision-making. One labels participants as: Responsible, Accountable, Consulted, and Informed (or RACI). Another such tool, OARP, stands for Owner, Approval, Responsible, Participant. These tools consistently proved frustrating and there was little evidence that decisions were made with less effort or more importantly, more staying power or higher quality.

The use of these tools arose as a defense mechanism against executives and managers who were prone to swoop and poop, a metaphor I learned as I assimilated into the team. As with birds, many managers seemed to show up at inopportune times, issue a quick opinion or edict, and not stick around for the mess they left behind. How much of this was actually a mess or simply a reaction to executive authority or inability to influence decision-making would take time to untangle. The expectation for me as a new Windows executive was that I had a tendency to employ this technique, no matter my own personal history or approach.

What I knew already to be the case turned out to be a big part of the problem, and that was a culture of escalation. In all software projects at scale, it is always the case that one team depends on another team—to provide code, consume code, integrate things together, and more. And this extends to sales and marketing connections. In Windows, escalation seemed to be the way most situations between teams were handled. It was a culture in which nothing was decided until people got in front of a VP, resulting in a culture where most of the middle management layer was biding time. And did I mention there were 7 or 10 management layers in Windows?

Weaning the team off the culture of escalation proved to be one of the bigger cultural transformations I needed to make. It was also the root of challenges over many years of work between Windows and Office. In a culture of escalation, decisions made by people on the front lines, so to speak, rarely stick. In fact, escalation was done expressly to reverse lower-level decisions. If a Windows partner or collaborator didn’t like the situation then an escalation ensued, and rarely did things stay the same. Office loathed escalation. Decisions were pushed down and stayed down. When people tried to escalate, they were told to work it out. The result was that when Windows tried to escalate decisions in working with Office they rarely got overturned, which proved enormously frustrating to Windows. And when Office tried to make plans, they would often find them upended at executive escalation meetings with Windows.

In Office, escalations happened so infrequently I cannot even recall any specific instances. In fact, we in Office had a saying that “escalation is failure.”

The primary downside of escalation is the way it shifts accountability. The winning side in an escalation feels an accountability not to the decision, but to winning the process of escalation. The losing side (and yes, they feel like they lost) does not feel bought into the decision and if it does go wrong, they too will join in the chorus of pushing blame up the chain. This is the exact opposite of what you want to happen in times of making difficult decisions. The second-order effect is the obvious problem that as decisions are pushed up the chain there is less detail available on execution-related issues, and usually that is where the trouble starts. These downsides were especially acute in mid-2000s Microsoft, which was so much about improving accountability.

Given the crisis situation I was facing it would have been trivial to declare some sort of emergency and take control like a field general in a losing battle situation. Not only would this have been straightforward and arguably predictable, but it would also have fed right into the dysfunction that was already present. Sucking up all the difficulty would have been another management cliché, but one I could avoid.

Changing the culture surrounding escalation was going to be tricky. I decided to focus on consensus as a core tool for decision-making. I had to work hard to help people to understand that consensus was not the same as design by committee or groupthink—I’d always seen these as distinct when operating well. I also felt it was important to remove two tools that were all too frequently used to avoid either committing to consensus or collaborating, or to create dependencies between teams: Agree to disagree and Non-goals.

Agree to disagree gave each team the freedom to act how they would have acted prior to coming together to reach agreement, while also getting credit somehow for trying. The result was a product design/development conflict, which would fester through the development cycle and later be a customer problem.

Non-goals offer up a list of all the things a project won’t accomplish, which at first seems helpful. I never understood how such a list could be finite since there was an infinitely long list of features and ideas not in the plan. In practice, it became a way to kneecap executive input on potential collaborations or connections to other parts of the product by simply stating them as non-goals up front.

Executive presentations often began with a slide stating non-goals. Such presentations often ground to a halt debating non-goals as a result. Often the non-goals ended up looking as though the team would get nothing done at all. There’s a general rule to follow which is never offer negative goals up front. Early readers of Hardcore Software might recall the story from Office 97 when I spent weeks unraveling the damage done when the routine status report included a very long list of feature cuts, but no indication of what we were actually delivering. Bad idea.

My new team had an over-reliance on metrics and process as a mechanism to drive or force agreement on issues. This was exhibited by the constant drumbeat of red-yellow-green scorecard reviews in Windows or the KPI process in Services. These processes took an enormous amount of energy while also creating a sense of disempowerment in the organization.

The execution of these practices was fundamentally flawed. In both Services and Windows, the culture developed around having a policing team derive and measure the results, which only created an us versus them dynamic. As one example, in Services the small product planning staff organization actually believed it had the job of “determining the work that needs to get done by 800 FTEs” (a quote from a planning manager). Yet most of the debate and discussion took place around “are these the right metrics” or “are we measuring this correctly” as expected. And when the organization wanted to do something, but the metrics did not all point in the right direction the org still moved ahead. The result undermined the entire KPI process.

I offered a specific example that was going on in real time. The team was deciding whether to turn on a new HotMail user interface for all users. The KPIs established by the planning team clearly said not to do this, but the engineering team needed to for testing and scale. Thus began a discussion over “maybe it is OK to meet 2 of 5 KPIs” or “perhaps we should weight the KPIs.” All the while, when you think about it, this was an engineering organization, and if it was unable to determine whether the software was ready then that was some seriously deep trouble, and this was the largest flagship service.

These common techniques—decision-making frameworks, non-goals, agree to disagree, and metrics—were too often employed in forums for deciding and seemed to have the exact opposite result. These were, however, just obvious signs of poor decision-making. It was apparent we could improve everything about deciding if I personally modeled behavior and worked from the top-down to change the culture. With that in mind, the aspirations with respect to decision-making included the following (summarized from the original here):

Consensus among engineering peers. To avoid escalation, the team needed to arrive at a culture where the experts in the code (no matter who their least common manager is) can together reach a consensus on what to do. Once a decision about what to do in code (design or test) brings in general management we have reached a failure point.
Consensus among disciplines. A significant issue in decision-making was the failure of executive management to provide a framework for decisions. I saw too often that poor choices were the result of discipline silos or unsolvable situations. For example, executives would push on development for a certain date while pushing on program management for more features and not giving testing enough time. To counter this, I offered an aspiration of reaching consensus across engineering disciplines before escalating while also committing to providing frameworks that allow for the unsolvable problems (schedule versus features for example) to be solved.
Agreeing to disagree is failure. Too many decisions were actually never made. A key example I came across early was the big bet in Longhorn on Avalon (what became Windows Presentation Foundation). WPF was shelved for a future release, but development continued. Yet at the same time, to get Vista shipped, the use of WPF (or its precursor, managed code from the .NET framework) was specifically precluded from the product. In other words, on the one hand a big bet did not pay off and was effectively put on hold yet on the other hand it was banned from inclusion in the product. This was a prime example of the kind of non-decision that was made at a time when the team desperately wanted and needed clarity. It would be only a matter of time until the Avalon team would just assume they would be part of Windows again and yet the whole Windows team that was shipping was making sure to never use the technology. Agree to disagree was a huge failure point.

Aspiration

Having spent too long defining the problem space, the solution space is actually much simpler. In a sense if we can create a climate where people can make decisions because we have a clear framework and use that climate to enforce a sense of what discipline is responsible for what (program management owns the feature set selection and spec, development owns the code and architecture for that feature set, and test owns validating dev and pm) then I believe we can make progress growing the team. This is a long term problem and there is no overnight or quick fix. I worry that Google and Yahoo! are well ahead of us because they are growing engineering leaders the way we did in the mid 1990's and I am not sure those leaders are motivated by being PUMs. I also worry that Windows has a complex engineering task that is not viewed as attractive and thus hiring and retaining is going to be very difficult. The aspirations for our organization include the following:

We aspire to have a development, program management, and testing be the core resource unit in the organization.
We will be clear about how the organization is structured, run, and evaluated that this is where the work gets done and used this structure as a way to manage the work.
We will balance the perspectives of each discipline properly throughout the development process.
We will focus on consistency of promotion, depth of excellence, career ladders within these disciplines across the organization (starting by doing this in services separate from Windows).
We will articulate the role of shared disciplines such a product design, usability, business development, user assistance, operations, and product planning. These are not forgotten or less important. They have a seat at the table as defined by the discipline manager who determines the engagement for these disciplines- this is critical because we will not staff these so that every triad of dev, test and PM gets resources for each of these.
We will restructure the team significantly reducing the number of pure management roles that oversee multiple disciplines. This will be discussed later in this memo as removing this vestigial aspect of the structure is incredibly difficult.

— An excerpt from the original memo showing the detail of the aspirations. (Source: the book *One Strategy,* materials)

Agile Execution

No topic caused me more personal grief and angst than the phrase agile execution. The concept of agile execution—seemingly a religion with terms like scrum, sprints, and stand-ups, as well as development process approaches that put experimentation on customers above all other methods—was top of mind on the Services teams. The team believed that the only way of addressing the poor results they were seeing was to move faster and become more agile. And while they were focused on this new methodology, the Vista team was gummed up unable to do things but somewhat irrationally believed that their problem was not taking enough time to get things done. The key leaders in Windows believed the problem with Longhorn was that the team was not given enough time, enough time to complete WinFS, Avalon, or other key initiatives.

The Services view was consistently expressed as “delivering internet services is entirely different than releasing boxed software.” The use of “boxed” was always meant as an insult specifically aimed at me, even if occasionally said with a neutral tone. The implication was old, easy, and irrelevant. A key aspect I was informed about repeatedly was: “Services do not use the waterfall approach, but rather they must iterate in the market.” Waterfall was another code word for old and dumb.

The problem with describing Office (or Windows) as waterfall was (and is) that this presumed a development process of writing a specification and handing it off to development and then later to testing, a sequence of discrete steps known as a waterfall. Implied was that there was never any notion of reevaluating what was going on or of iteration, and that no work was done in parallel. Also implied was a perceived timeline of years. This was not how Office worked, but there was no chance I would change the minds of those arguing for agile. Whatever Office did, it was not agile, and the proof was that a product took 24 to 36 months. Still, Office iterated throughout the milestone process and also updated the product with hundreds of changes every month, and those were based on data of how the product was being used in the real world. But that was seen as only evidence of maintenance, not innovation, even though the vast majority of Services updates were simply to keep the services running and not new features, which was roughly equivalent in my book. There was little evidence of innovation in Services to counter this example.

A classic waterfall diagram as a series of boxes from the upper left to lower right with arrows between them: System requirements, software requirements, analysis, program design, coding, testing, operations. — The classic waterfall diagram from the original paper by Dr. Royce. (Source: Proceedings of IEEE WESCON (1970) *Managing the Development of Large Software Systems*)

As an aside, the concept of waterfall development has been misunderstood for generations. The first description of the process came from Dr. Winston Royce and appeared in the Proceedings of IEEE WESCON from 1970 in an article, “Managing the Development of Large Software Systems.” Royce diagrammed what became the canonical waterfall process of gathering requirements, analysis, program design, coding, testing, and so forth. One discrete step after another. Royce, unfortunately for us all, meant for that diagram to show what not to do. In the full text, he explained how critical it was to iterate at each step to be successful.1 Yet because of that diagram generations of engineers treated the process as stepwise and discrete, one step after another. Also, maybe IBM was to blame too. Nevertheless, many on the Services and Windows teams perceived the Office planning process, including a vision document and milestones, as a waterfall approach in the classic and incorrect manner.

What I had learned as I gathered information ahead of writing my memo was that our use of these new agile methods was causing multiple real execution challenges. One example was the Spaces service, which was poorly architected for scale while racing to put features into market to compete with MySpace. In fact, they were asking me (the new person) for budget approval for a lot more data center spend because the costs per user on the free service continued to rise significantly and non-linearly—that is, each new user being added to Spaces was costing more than previous users. Clearly, that was unsustainable.

The most shocking example of management by self-induced crisis was Internet Explorer. In many ways IE was the symbol of a rapidly developed product, participating in the creation of “Internet Time” as it competed with Netscape from 1995 to 1999. Once the original Longhorn plan started in 2000 to 2001, development on Internet Explorer was, for all practical purposes, shut down. In my first days on the job, I met with the recently reconstituted IE team, who had been given a “hurry up and get a release done for Vista” mission, a recognition that Vista needed a browser as part of the Longhorn Reset.

Internet Explorer 6 shipped with Windows XP in August of 2001. Here we were, almost five years later, and while there was a great deal of activity in terms of security fixes and supporting the myriad of Windows releases, nothing substantial in terms of product features was released. Ironically, perhaps, the intention in 2003 was to stop releasing standalone browsers to focus on integrating and synchronizing the browser with Windows. IE had effectively ceded the browser war to Firefox (Google’s Chrome was two years away, but there were rumors of its development). IE was riddled with security holes due to the use of Microsoft’s ActiveX technology and components, which were not architected for the modern security landscape, while also falling behind on performance and web standards. IE had become a pariah on the internet and attracted scorn from the developer community. Realizing this, the very people who had ended development created a crisis by essentially commanding an updated browser in time for Windows Vista. This amounted to a classic arsonist-firefighter dynamic within a culture that always seemed to love a good crisis.

The good news, if there was any, was that Dean Hachamovitch (DHach, former Word and Office PM of AutoCorrect fame) was leading the new team, reconstituted from across the company. In the first meeting I had with the team, all the managers fit into one small conference room. The team was woefully understaffed for the work it needed to do. This was nine months before the product was to be completed. Dean was already exhausted, but we found ourselves allies. In talking about IE with BillG, SteveB, and KevinJo, the irony was not lost—voluntarily ending work on browsing after a fairly well-known legal battle was an odd choice, to say the least, and one I did not spend time trying to understand.

There was also the idea that planning and being thoughtful were archaic and the modern way of building products was via lean methods, as they became known—get an experiment or something “minimally viable” to market and then iterate. Though by this time, the biggest successes of these agile methods had also mostly imploded as companies. On the other hand, most people were consistently surprised at how long even relatively thin-featured products gestated before becoming viable and then successful. Managing every product like it might be the next Google search made no more sense than managing every product like it was a NASA mission. There was a rational approach in between, especially for products that were mature or necessarily focused on enterprise customers.

This issue, as I would later discover, was one that none of my new management receiving this memo would quite know what to make of. The conversation would come back to needing a plan and me returning to the reality that the team hadn’t ever developed and executed on a plan. That meant there was more needed than a slide deck with a set of features and a schedule, all while needing to find a way to agree to the highest level of development methodology.

It also meant that the odd-even curse around Windows might have been due at least in part to a lack of patience, and that regardless of my plan the team might not have the patience to see it through.

While on a recruiting trip I caught up with Sarah Leary (SarahL), the product manager who represented Office at the Windows 95 launch event. Sarah invited me to attend her class at Harvard Business School in 1998. Professor Marco Iansiti taught his classic case study on the development of Microsoft Word2—the one where the Word team was called the worst development team at Microsoft by my future manager JeffH. Marco and I got talking and he invited me to spend the fall of 1998 helping to teach that very class with his colleague Stefan Thomke, which proved to be an incredible learning experience for me.

Marco later helped us to articulate our aspiration for agility as defined by Agility = Execution + Impact. This was a way to talk about three challenges all at once without having to define what agility really meant or even that it meant fast or, worse, faster. By focusing on execution, I was able to make clear the issues of simply failing to get things done, like MSN Messenger working correctly for people with more than one PC or big features of Vista that were cut. With the addition of impact, I would discuss the issues of the Services team spinning their wheels while not making any strategic progress. This definition also helped me to avoid picking a specific development methodology, which was about as appealing as choosing to become ISO 9000 certified (that’s a joke a few people will get). Instead, we would focus on planning, plans, and timelines using my favorite methodology of “promise and deliver.”

To put a time scale on agility in my memo, I pointed back to ChrisP’s Shipping Software mantra (Chris was my manager in Office and had joined Microsoft to work on Mouse 1.0, Windows 1.0, Word 1.0, and also led Excel engineering, then Word, before leading Office). I said we would aspire to a milestone-driven process, with more than one milestone, and a process to plan, execute, reevaluate, and iterate. I had a great deal of difficulty bridging my experience in product cycles lasting years with the perception that cycles needed to last days, no matter how much we talked about how processes can scale and how the absence of a process is still a process, known as chaos.

The Windows and Windows Live Planning process. It is a more modern version of the waterfall diagram attempting to incorporate iteration. — The response to “waterfall” and “boxed software” was a diagram trying to show all the iterative steps in the process. Still people looked at one-way arrows and boxes and thought “this takes years.” I was never quite able to capture all the iteration at each step, especially within the milestones. A later version of this had a lot of cycle arrows around the milestones and an explicit “this could be weeks or months.” (Source: used many places including *One Strategy*)

Across all teams there existed a cacophony of agile development that was defined as a cultural high-order bit. In many years of working with teams as they moved into Office or were aligned with Office, my experience was that there was some degree of correlation between teams executing poorly and a very specific development and engineering process that the team was overly proud of developing. Such a process was one the team pioneered and was deeply committed to, even to the exclusion of success. This wasn’t a causal relationship, but rather a correlation often seen. Certainly, teams with a unique methodology also executed super well, but that probably wasn’t causal though they believed it was.

The challenge, or properly my baggage (Office worked differently than Windows), was much more acute when speaking with the Windows teams. Windows felt they were not moving fast enough, but after six years of Vista the more general view was that they were not given enough time. This was rooted in the way that Windows NT was developed, with architects and a lot of upfront design (in practice, much closer to a historic waterfall approach). The project, which started approximately in 1989, was not ready for mainstream usage until almost 2000. And to many on the team, a decade was the expected amount of time needed to build robust platforms at scale.

The Windows team had a belief that Office shipped releases mostly on time by “cheating” because it cut features from the product before it shipped. PaulMa (pioneering manager of Windows, including NT, and former CEO of VMware) often told me, “You just can’t cut things from Windows like you can in Office.” In any discussions about Office processes, I always felt a bit of OS snobbery directed at me. While this could have been my own inferiority complex, there always seemed to be that unsaid feeling that Office was the simpler product.

For what it’s worth, back in the day, Office people always thought that Windows was a perfectly good product to have in order to launch Excel and Word, but not much else. This was reinforced externally because Office on the Mac operating system was equally loved. We had our own expression of snobbery in Office.

The two gardens continued to exist.

There was another truth that emerged as I researched and tried to sell my plan. There was an overall perception of Office, and by consequence me: aside from cheating by cutting features, I was confronted with the perception that I was a tyrant, literally. The reason Office shipped on time and was so structured was because of how I ruled the team, by terror or by some sort of iron-clad grip on process. The more I talked to people the more I learned of what I thought were crazy stories about how the teams worked, and how I worked. Hearing them was like learning about some exotic culture across a vast ocean, not just another part of Microsoft. Not my Microsoft.

It was the first time I had to face the perceptions people had of me personally but also had to reconcile how those perceptions could be so opposite of my reality. At times the disconnect between perception and self-awareness had me questioning my sanity. I understood how I could be intimidating just as any executive could be, but at the same time I felt the team knew I was fully supporting them and worked hard to avoid the trappings that contributed to that perspective.

It was, after all, Windows where the manager punched his fist through a wall. It was Windows where people (including me) were regularly chastised in front of big war-room meetings. It was Windows where managers often found out about changes to their schedule or plans via rumors or indirectly. I did none of those things. I didn’t yell. I didn’t skip over managers. I didn’t escalate decisions (or tolerate escalation). Generally, my biggest offense was writing lengthy emails late at night with too many points in them, and sure, an occasional barb, though I rarely replied-all and did my very best to focus on ideas and products, not individuals, in emails. That, and refusing to go to endless meetings, especially early in the morning or when they were scheduled at the last minute, were where I regularly messed up. Besides, how could anyone hold a crisis meeting for a strategic discussion?

Whatever my flaws as a manager, what I thought was going on was that the Windows team was looking at how they worked and assuming that to achieve the results that Office achieved it must be doing what Windows did but just more, or better. So, more escalation, more big meetings, more VP decisions along with better PowerPoints and shorter lists of non-goals (or is that longer?)

That wasn’t reality. But it was the reality I had to deal with, as out of body as it felt.

In hindsight I began to realize that the two gardens were not styles but deeply held beliefs. Each of Windows and Office operated the way they operated, and loved it. Each had achieved tremendous success in the market. Where I thought Windows achieved success in spite of how they operated, they saw their success as because of it, and vice versa. It just so happened that the most visible cultural differences were also almost the opposites: planning versus crisis, consensus versus singular leader, cult of team versus cult of leader, promise and deliver versus over-promise and deliver, and on and on. Even today, it can be challenging to describe the gardens without sounding judgmental one way or another.

Top of mind during this transition were some of the more legendary efforts at cross-pollination between Windows and Office. Several extremely talented and senior people in Office had taken roles in Windows only to quickly return to Office sharing tales of their adventures. And while there were stories in the other direction, it often felt like we had more success with Windows people moving to Office. From the outset, I was deeply worried about that sort of rejection knowing I had nowhere to go back to.

Our aspiration, Agility = Execution + Impact.

Aspiration The aspiration for agile development is one that is easy to get carried away with. For example a big mistake would be for us to announce now that we are committed to yearly releases of Windows with bi-yearly significant updates, or quarterly updates of Services. We have no experience to indicate we could do this and given the amount of pre-booked work for Windows and bringing Live to real 1.0 quality we should not embark on this futile promise. The team would react negatively. Rather we need to focus on creating milestones we can successfully meet as a team and delivering on those commitments. And those milestones should be defined in customer facing terms that measure progress, not development activity. We will strive to be an organization that defines agility as Agility = Execution + Impact. We are going to move the needle when we release software. We will not just look to our performance reviews or a few bloggers to define success. — An excerpt from the original memo showing the detail of the aspirations. (Source: the book *One Strategy,* materials)

Discipline Excellence

Despite having thousands of engineers with more seniority (as defined by salary level) than any other engineering team in the company, the team did not have the depth or breadth of talent (human capital) to build products of the scale being attempted. Sharing this observation was scary. It was both counterintuitive in thinking and felt like the height of arrogance. To BillG who valued IQ above all and prided himself personally on the IQ assembled to build Windows, to express this was, for lack of a better word, insulting.

I used data to explain it.

Something PeteH once explained to me: “You can’t build a billion-dollar business out of 10 products each doing $100 million dollars.” What he meant was that the characteristics of a billion-dollar business are different than a collection of smaller businesses (he was referring to the struggles Microsoft was having in the Microsoft Home division). That pertained to my challenge at hand: we couldn’t create a product team at scale for billions in revenue with 100 teams of 25 people each. A 2,500-person product team operating in unison was qualitatively different than all those small teams added together, even if headcount was the same. Even worse, it was almost always the case that the bulk of the value delivered was due to a small number of those teams anyway, leaving most of the teams’ work essentially unaccountable or even squandered.

Windows was sold and experienced as one product, but it was organized as though 100 small teams came together to create that product while operating essentially independently. What was supposed to make Windows be Windows was how all the pieces fit together. There was no organization, however, to build that product. Simply put, the whole was not greater than the sum of the parts.

The driving force behind all the small teams was to empower people to work outside the complexities of the bigger team. The team had found itself caught in a negative reinforcing cycle. It was too difficult to get things done because processes were failing, which caused management to assign senior leaders to work out of band or off the books to get truly important things done (translation: make it a crisis), which made it harder to integrate those into the product and amplified the overall difficulty of shipping the whole of Windows. The empowerment led to poorly integrated and architected products such as Media Center and Tablet PC, as well as disconnected core architectures such as DirectX graphics, Networking, and Security. The success of early Internet Explorer working this way reinforced this as a methodology, but all that came to a standstill once the goal of the product was to integrate it with a whole.

That would be challenge enough, but accelerating this cycle was the existing approach to managing. In order to conjure up these small, agile teams, management pulled people from the ranks and gave them responsibility for managing a team of developers, testers, and program managers—creating product unit managers, PUMs, or multidisciplinary managers, MDMs (the HR expression). PUMs were a direct manifestation of the old list of people and problems, formalized to an org structure.

For a culture that loved a good crisis, the heroics of being a PUM managing a crisis became an aspirational job. As I was making the rounds talking with middle managers before writing the memo, a frequent topic raised was the desire to become a PUM, and my view of the career path to become a PUM in my new world. When speaking with PUMs, I heard time and time again, “I work best overseeing a small, multidisciplinary team.” The problem was the lack of supporting evidence proving that point. For nearly every mid-level engineer, the career goal was becoming a PUM, not becoming a great engineer.

A direct result of pulling people from the ranks and promoting them to manage multidisciplinary teams was to cut off the pipeline of talented engineers and, more specifically, program managers. The very people who would be called upon to scale and manage larger teams of engineering leaders were robbed of the depth of discipline expertise that would train them to do so.

As if this wasn’t enough, these new leaders were then responsible for hiring, mentoring, and growing the next generation of leaders in job functions they had not even done at any level of seniority or tenure. As a result, most of these teams had a management structure where the PUM was also filling the role as development manager or group program manager (the titles for the role of leading the job function). This further stunted the development of new leaders.

This is a table of numbers that I can't represent in the alt text effectively. I've described it in the caption. — The detail of headcount for the core engineering disciplines by both level and management. Yes/No divides the table into managers or not. Observations: There are only 8 senior (partner) development managers for almost 1000 engineers, 18 if counting level 67 and only 2 or 5 program managers and 0 test. While there are a large number of senior individuals in development, most of them are concentrated in a few groups (not shown). At the same time, 20 of the senior Level 68+ are managers of managers. (Source: personal, *One Strategy*)

To illustrate this point, I compiled the statistics of the approximately 40 product units in the Windows and Services group (not including COSD, but the numbers matched almost identically). It revealed that half of the product units were being led by people who would not have been senior enough to be discipline leaders (dev managers, test managers, or group program managers) in Office.

The lack of seniority was immediately recognizable in program management, arguably the most crucial role for achieving synergy in product design and features across a single product. Overall though, Windows had more senior employees than Office, but they were allocated to pure management roles, PUMs.

The quest for PUMs and autonomy had pushed all the relatively senior talent to be managers of managers (or their managers). That was a shocking realization. This was also a generational problem because the presence of PUMs robbed the junior engineers of opportunity. The Windows team had been robbed of a generation of talent development.

Perhaps nothing was more shocking than the Software Test discipline, where, once again, I was up against a long-held belief, by BillG and SteveB in particular, that having testers was not a sign of success but somehow represented a failure of tools and processes in engineering. For many years, I tried to have this debate or discussion but simply ran out of ways to sound anything but defensive. But in truth: there was no engineering or manufacturing, in any field, without the role of quality assurance, and the more complex a product the more testing it needed. Software projects brought with them two unique characteristics not seen in hardware or manufacturing. First, Windows for the most part provided programming interfaces to developers who would do all sorts of things, some expected but most not. Testers came to work and found ways to exercise APIs by writing adversarial code against them. Second, every release of Windows shipped supporting every previous release and previous capability on all the hardware that existed and all the hardware that would exist. Of course, Windows had enormous libraries of automated tests and more being added all the time, but all they could do was tell you that you had not broken something that already worked the way you thought it should work. There is much more to testing. I understood that start-ups and smaller projects could do without testing, as Microsoft had in the early days, but complexity, extensibility, and backward compatibility caught up to every product.

Later, when I made my case after sending the memo, I experienced a lot of friction on the topic of testing because SteveB as CEO had been pushing teams to reduce headcount as a cost-saving measure. Both Services and Windows had reduced headcount by reducing testing and moving responsibility offshore or to vendors. The Services team, where there would normally be one tester for every software developer, had half as many. As we learned in operating internet services in Office, testing wasn’t reduced but rather shifted to and shared with operations, which was also understaffed.

Tactically, our plan was to aim for two important structural changes. First, we would dramatically reduce the height or depth of the organization. This was something that SteveB would get excited about as he had been trying to get people to understand Jack Welch’s General Electric approach to org span of control and depth (at this time, everything Jack Welch said was undisputed business canon). SteveB had run up against PUMs and the depth and minimal span of control that model imposed on an organization. This would dramatically alter the jobs and career paths of dozens of the most senior people on the team. It would be a very expensive change to undergo.

Second, I proposed reducing the number of pure managers, those with no line of responsibility but who had management oversight. They did not write code, specs, or tests but focused on the process. Some were needed, but the organization had too many, which contrasted with Office where even the most senior discipline leaders were managing people and writing code and/or fixing bugs.

At the strategic level I used this memo to begin what I knew would be the most important management journey of my career: restructuring the Windows and Services team into a functional or discipline-led organization.

A reality I could count on was that it seemed nothing could have messed up the Windows business, and hence all of Microsoft’s revenue and profit, at that point. In hindsight, what Windows had was the greatest product-market fit in the 20th century, except maybe for oil. That stability enabled the company to thrive during macro issues of recessions and wars. It thrived throughout the largest antitrust trial in our lifetimes. It thrived through successive changes in leadership and company reorganizations. It thrived through a restructuring of the PC manufacturing industry. More than anything, it thrived despite products receiving lukewarm reviews at best and a lot of releases being broadly panned, and nearly every single product being released to market years later than planned with notable quality issues.

Windows had no trouble surviving the odd-even nature of flawed products and changing leaders. To date, there had been no credible competitors or alternatives.

As I write this today, I realize just how wild that sounds. It was, however, true.

In one exercise, my colleague Adrianna Burrows at our communications firm WaggEd researched key product reviews for all the Windows releases going back to Windows 3.0. Surprisingly, out of that selection, while some were glowing (Windows 3.0 and Windows 95) most were lukewarm to good (Windows 3.1/3.11 and Windows XP), and many were quite painful to read (Windows 98 and Windows Me). Windows Vista was shaping up for reviews akin to the latter. Looking back on the reviews solidified my opinion that there was much more of a Windows challenge than a Vista-only challenge. The business model and momentum were sustaining the product, not the march of consistently improving products and increasing customer satisfaction. At one point, I even suggested to SteveB that Microsoft would have been fine not shipping several of the Windows releases. Heresy. To be fair, in Hardcore Software I have pointed out that absent the contractual obligations and staggered adoption of Office, it is not entirely clear the same would not be said of Office.

While I did not have the vocabulary of product-market fit, I knew that I had the luxury of being patient and deliberate. SteveB showed restraint, even though every bone in his body wanted something fast. I was not going to rush. I was not going to have a short-term tactical plan to show we were awake or listening—something that had been suggested many times by those more senior than me and in subsidiaries. I knew we would spend a lot of time in push-pull conversations, but ultimately, I believed I had the support to do what I thought needed to get done.

The goal was to have the whole organization collectively, including COSD, deliver one Windows product to customers, OEMs, and enterprise/business customers. The cardinal rule of having everyone finish at the same time was to have everyone start at the same time. But with the Windows team still finishing and also about to undergo a major organization change, I needed to develop a hybrid approach. This would also remove some of the pressure at the company level to show progress or, worse, to make sure we did not look like a few thousand engineers were going into hiding.

In this transition memo, which I sent to BillG, SteveB, and KevinJo when Vista was still six months from shipping, I proposed an entirely new organization and a rationale for why we were going to operate together as one.

On to 087. Reorg! Why Are We Together, Exactly?

Royce, Winston (1970), "Managing the Development of Large Software Systems" (PDF), Proceedings of IEEE WESCON, 26 (August): 1–9 http://www-scf.usc.edu/~csci201/lectures/Lecture11/royce1970.pdf

https://hbsp.harvard.edu/product/691033-PDF-ENG

Hardcore Software by Steven Sinofsky
