Paths to Power: What Can Architects Learn from the Sources of America’s Global Dominance?

In discussing American exceptionalism and influence in the modern world, political scientists and academics talk about various sources of American power, for example US military might or its economic dominance.  Walter Russell Mead, an academic and foreign policy analyst, defines four kinds of American power: the sharp power of the US military, which provides the foundation; the soft power of American pop culture and the sticky power of an American-led economic world order, which build atop that foundation; and finally, at the very top, the hegemonic power of the US, the prerogative and privilege to lend credence and confidence to all things worldly, whether they be economic coalitions or other global pursuits.  Organizational theorists and researchers have studied the power and influence of individuals and groups within organizations extensively; the political scientists’ lens, however, provides an interesting framework for understanding how power should be built and exercised.  One such application is looking at how the architecture function builds and exercises power within an IT organization.

Architects typically exercise their power and exert influence through formal governance processes and mechanisms – the infamous “architecture review board” is one of the various architecture forums that have given many a sleepless night to IT program managers.  Architecture governance can be characterized as the architects’ sharp power, exercised by poking and prodding IT teams to behave in a certain manner.  Just as a nation’s sharp military power serves as an important foundation for its power structure, so too does the architecture governance framework for the architecture group.  But a foundation is not enough to create and sustain power.  The US’s overwhelming military power is a powerful deterrent for many, but it is far from enough to ensure American dominance in world affairs.  This has been demonstrated time and again in many IT organizations where, in spite of strong governance processes, the architecture team fails to have appropriate influence in decision-making.  What is needed to complement the sharp power of governance are the architecture team’s sticky and soft powers.

America’s sticky power rests on two foundations: a dollar-based global monetary system and free trade.  A dollar-based system not only allows America to reap direct economic benefits, e.g., seigniorage from issuing bank notes to lenders all over the world, but also allows it to exert the influence of its monetary and fiscal policies far and wide.  Globalization and free trade, championed by America-based multinationals, have created an economic framework in which benefits accrue to all countries that participate.  There are strong incentives for a country to participate in such an economic system, and strong disincentives to leave.  Architects have an opportunity to wield sticky power by creating a system in which their work creates benefits for their key stakeholders – architects should reorient their work to facilitate the outcomes and key decisions IT management would like to have but cannot, due to political, organizational or technical reasons.  Be it brokering talks, striking alliances, or negotiating trade-offs, architects in this role will need to function as first-class deal makers.  In this role, architects will be seen as belonging to an elite group that knows how to get things done.  In an ideal state, IT managers will have no choice but to engage with the architects, because of the assurance such an engagement provides for achieving the outcomes they desire.  Initial success will create a self-reinforcing system in which stakeholders become increasingly willing to work with the architecture team and increasingly hesitant to work against it.  Sticky power, however, needs an inducing agent, something that will draw others in.  That is where soft power comes in.

American soft power emanates from America’s ability to successfully export its culture, values, ideals, and ideas to the rest of the world.  The architecture team’s ability to successfully project its image across the entire organization is similarly critical to garnering soft power.  The typical perception of architecture work as merely technical is a result of the architecture team’s failure to project an image commensurate with the stature of true architecture work.  Architects need to build a culture – a set of ideas, principles and work habits – that is unique and suited to the problem-solving, deal-making, and relationship and influence building that architecture work demands.  It starts with hiring the best and the brightest – not just the technically savvy, but true leaders, strategic thinkers and doers.  But creating such a culture is not enough – it needs to be packaged and “exported” to the rest of the organization, with the goal of creating an environment where others actually want to join the architecture team because it has a certain cool factor.  On this count, the typical architecture team fails miserably – one has yet to come across an IT organization where the architecture team is considered one of the cool teams to join.

Hegemonic power, the holy grail on the road to building power and influence, is power that derives from all the other kinds – it is the cherry on top that allows America to set the agenda and rules for world affairs, to be the “gyroscope of the world”.  The architecture team is in a prime position to be the gyroscope of the IT organization.  By combining the power of governance with the ability to create real value for stakeholders and to attract the best talent, the architecture team can influence decision-making at the highest levels – it can set the agenda to facilitate its own goals and outcomes and thus perpetuate its power and influence.  The nature of architecture work is such that sustaining power and influence is crucial to the long-term success of architects.  Maintaining power on an ongoing basis, however, takes effort, wise decision-making, and moving with the times – just witness the case of Britain, which only around a hundred years ago was by far the world’s leading power, but which was gradually supplanted by America as it clung to stubborn positions and made fatal policy mistakes.

Testing in a Continuous Delivery Environment

In his book Out of the Crisis, W. Edwards Deming cautioned: “Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.” Ever since, ‘Building Quality In’ has become one of the central tenets of quality-focused lean initiatives, including lean software development. The act of testing in software development is an example of inspection: inspection to find bugs and faults in the developed software; static code analysis is another example. Quality matters in software development because bugs cost both users and software providers dearly: a study conducted on behalf of the US National Institute of Standards and Technology estimated the cost of software bugs to the US economy at around $60 billion. Perhaps the extent of this scourge is not surprising, since in many organizations software testing is not effective: testing/QA teams run “quality gates” as an afterthought, and even then testing does not necessarily translate into quality. When Agile came around, practitioners came up with new approaches to testing, aptly described under the banner of “Agile Testing”, that provided some improvement by driving more collaboration across teams and moving testing earlier in the development cycle. Now, with the advent of DevOps, testing has taken on a new level of significance, since continuous delivery is not just about delivering software rapidly, but about delivering software that works as well. A few have even coined a term for this new discipline: continuous testing. All that is well, but what does testing mean in a continuous integration/delivery environment?


In a continuous delivery (CD) environment, quality becomes everyone’s responsibility. This does not mean that the QA and testing teams have no role to play in a CD environment. On the contrary, the QA and testing function moves into a strategic role, providing oversight, direction and leadership for driving overall quality. For example, instead of spending countless hours running manual tests, QA teams will invest resources and time to develop and implement a comprehensive test automation strategy, or they will put in place governance processes, metrics and incentives to drive quality at every step. An example of how quality becomes everybody’s responsibility is what the development staff do in such an environment. Development teams in a CD environment are empowered to take on quite a bit of testing themselves. In addition to a ‘test first’ approach, developers may also be required to run pre-commit testing that exercises a suite of unit, component and integration tests. Indeed, many CI servers provide the capability for ‘private builds’, which allow an individual developer to see if their code changes can be integrated into the main trunk for a successful build. Pre-commit testing should enable developers to conduct a quick ‘smoke test’ to ensure that their work will not break the code in the main trunk; it may therefore contain a selection of integration and acceptance tests. Once the developer checks the code in to the CI server after pre-commit testing, the CI server runs the commit stage tests, which include any static code analysis required, component and integration testing, followed by system testing. Commit stage test results are immediately fed back to the development team so that any errors or bugs can be addressed. Successful commit stage testing increases confidence that the build is a candidate for acceptance testing. Builds failing commit stage testing do not progress to the next stage: acceptance testing.

Acceptance testing is the domain of business analysts and business representatives assigned to the project team. However, this does not mean that development staff have no involvement in acceptance testing. Successful testing in a CD environment gives developers more ownership in driving quality by allowing them to conduct automated acceptance tests in their development environments. Common obstacles to enabling this, such as insufficient licenses and/or manual deployment and setup processes, need to be removed. Acceptance testing is a critical step in the deployment pipeline: a release is deemed acceptable for deployment only if it passes the acceptance test stage. The entire team should focus on fixing acceptance testing issues for a given release. A CD environment requires acceptance testing to be automated as much as possible: a fully automated acceptance testing suite enables the tests to be run for a build as and when needed – this speeds up the development process and also enables the creation of a powerful suite of regression tests that can be run over and over again. Some tools even offer capabilities to encode acceptance test criteria and to programmatically drive the creation of acceptance tests from those criteria: thus testing, and hence ultimately the delivered software, can never be out of sync with evolving acceptance criteria and requirements.
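The idea of driving tests directly from encoded acceptance criteria can be sketched as follows. The discount rule and the criteria table are invented for illustration; real tools (e.g., BDD frameworks) express the criteria in business-readable form, but the principle is the same: one generated test per criterion, so tests cannot drift from the criteria.

```python
# Sketch: acceptance criteria encoded as data, with tests generated from
# them. The "discount" behavior is a made-up example system under test.

ACCEPTANCE_CRITERIA = [
    # (description, order_total, expected_discount)
    ("no discount below 100", 50.0, 0.0),
    ("10% discount at exactly 100", 100.0, 10.0),
    ("10% discount above 100", 250.0, 25.0),
]

def compute_discount(order_total):
    """System under test: 10% discount on orders of 100 or more."""
    return order_total * 0.10 if order_total >= 100 else 0.0

def run_acceptance_stage():
    """Run one generated test per criterion; an empty result means 'accepted'."""
    return [desc for desc, total, expected in ACCEPTANCE_CRITERIA
            if abs(compute_discount(total) - expected) > 1e-9]
```

When a business analyst adds a new criterion to the table, a new acceptance test exists automatically, with no separate test-writing step to fall out of sync.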

If the system under development is a high-performance system, some capacity and performance testing may become part of acceptance testing as well. Usually, however, capacity testing and testing for other ‘non-functional requirements’ is a separate stage in a CD deployment pipeline. Although a CD environment requires such tests to be as automated as possible, e.g. through the use of Recorded Interaction Templates and other devices, the success criteria for such tests are somewhat subjective – so although a release may technically fail automated capacity testing, it may still be greenlighted based on human judgment. Finally, as the release completes the non-functional testing stage gate, it may be put through more traditional manual testing. This is where human testers can excel and apply their expertise in UI testing, exploratory testing, and in creating unique testing conditions that automated testing may not have covered. Manual testing is thus one of the last stages in the testing pipeline in a CD environment.

If testing is indeed to become ‘continuous’ in nature, several critical factors need to be in place. Perhaps the most critical is test automation, which is often criticized by some practitioners as difficult to do or non-value-added. Whatever the reservations, testing in a CD environment cannot possibly be efficient and effective without automation – especially since tests are run in large numbers and run quite often. Automation is just one of the various test design and execution strategies needed to make testing execute efficiently, and thus succeed, in a CD environment. For example, CD practitioners recommend a commit testing stage lasting no more than 10 minutes – a hurdle that can be met only by adopting such strategies. Automation also applies to the provisioning of, and deployment to, environments. ‘Push button’ deployment and provisioning of test environments is critical if developers are to conduct quick smoke and acceptance tests of their work. Similarly, test data needs to be managed effectively. Test design and test isolation need to be such that data requirements for testing purposes are fit for purpose and parsimonious: wholesale replication of production data is neither feasible nor recommended in a CD environment. Data management, like environment management, needs to be fully automated with configurable design and push-button techniques.
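The “fit for purpose and parsimonious” test data point above is often implemented with small data builders rather than production copies. A minimal sketch, with invented field names, might look like this:

```python
# Sketch of parsimonious test data: each test builds the minimal valid
# record it needs and overrides only the fields it cares about, instead
# of replicating production data. Field names are illustrative.
import itertools

_ids = itertools.count(1)

def make_customer(**overrides):
    """Return a minimal valid customer record with sensible defaults."""
    record = {
        "id": next(_ids),        # unique per test run, keeping tests isolated
        "name": "Test Customer",
        "country": "US",
        "balance": 0.0,
    }
    record.update(overrides)
    return record

# A test of overdraft handling specifies only the balance it cares about:
overdrawn = make_customer(balance=-50.0)
```

Because each record is generated on demand, tests stay isolated from one another and the data set stays small enough to create and tear down on every run.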

Testing thus has the opportunity to move from being a reactive, purely inspection-driven function to being a proactive, quality-focused initiative. Achieving this requires making a number of tough decisions related to processes, division of responsibilities and organization of the testing effort. In a traditional environment, making these tough decisions is often optional. Moving to a CD environment, however, mandates them – which should be reason enough for organizations to start examining today how they can evolve and improve their testing efforts toward that ultimate model.

Building in a Continuous Integration Environment

Continuous Integration (CI), which spans the practices, processes and tools that drive continuous development and integration of software code, is a key building block of an organization’s DevOps methodology.  An important CI component is the software build process.  The software build process has traditionally been a second-class citizen of the software development world, relegated to the background as organizations spend limited resources on customer-facing and project management functions.  Software development and delivery is inherently fragile, but one of the most fragile parts is the software build process, because development managers have traditionally lacked clear visibility and control of it.  Too often, software builds break easily, are difficult to change, and are resource-intensive to troubleshoot.  With the increasing pace of business change and higher delivery pressures, however, every link in the software development and delivery chain will need to be streamlined, and one of the key areas organizations will need to focus on as part of their CI journey is the software build process.

Building is the process of compiling raw software code, assembling and linking various program components, loading external/third-party libraries, testing to ensure that the build has executed successfully, and packaging the code into a deployable package.  While this may seem simple and straightforward, building in a big program is complex enough that an overall build architecture usually needs to be defined first, along with a dedicated team and infrastructure for ongoing management.  Building usually happens at multiple levels for different consumers: development builds focused on single-component development and testing, integration builds across multiple components, system builds for all the system components, and release builds focused on building customer releases.  Build architecture deals with making specific choices around what to build, how often to build, and how to build it.  Inefficient builds that take too long to finish, challenges with isolating shipped-product bugs and replicating them in the development environment, or challenges integrating multiple code streams into release builds efficiently are all symptoms of an inefficient build architecture.  Architecting the build process starts with identifying what to build in a software configuration management system.  Code branching and merging, dependency management, management of environment configuration and property files, and versioning of third-party libraries are all critical components of software configuration management closely tied to identifying what software components are needed in a build.  “How often to build” involves defining the build schedules for development, integration, system and release builds, depending upon a number of factors such as the number of development work streams, the frequency of releases required, and the capacity of the project team, to name a few.
The build schedule clearly identifies how builds are promoted through the development pipeline, through successive stages of development, testing, quality assurance and deployment.  Last but not least, having appropriate build process tooling and infrastructure allows building to be declaratively controlled, automated, and efficiently executed.  Build script automation, build parallelization, and tracking and reporting of build metrics are all usually managed by a modern build management platform.
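One of the “what to build” choices above – deriving a correct build order from declared component dependencies – can be sketched as a topological sort. The component names here are invented for illustration; real build tools perform essentially this computation over their dependency graphs.

```python
# Sketch: a declarative component description and a dependency-ordered
# build derived from it. Component names are illustrative.

def build_order(components):
    """Topologically sort so each component builds after its dependencies."""
    order, done = [], set()

    def visit(name, stack=()):
        if name in done:
            return
        if name in stack:                       # e.g. a -> b -> a
            raise ValueError(f"dependency cycle at {name}")
        for dep in components[name]:
            visit(dep, stack + (name,))
        done.add(name)
        order.append(name)

    for name in components:
        visit(name)
    return order

# "app" links "core" and "ui"; "ui" itself depends on "core".
COMPONENTS = {"core": [], "ui": ["core"], "app": ["core", "ui"]}
order = build_order(COMPONENTS)   # "core" first, "app" last
```

Declaring the graph rather than hand-maintaining a build sequence is what lets the process be “declaratively controlled”: adding a component or a dependency updates the build order automatically.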

With so many moving parts to the build process, it is easy to see where and how things can go haywire.  While there may be multiple factors contributing to inefficient and slow software development and delivery, ineffective building is almost always one of them.  Traditional build management suffers from a number of usual-suspect issues, and most of them are process issues, not tool issues.  One of the most common approaches has been the big-bang style of executing integration builds, where independent development work streams bring their codebases together as part of an infrequent big-bang event.  Compounding the problem is the tendency of development teams to throw code over the wall to the build team, which is then painstakingly tasked with assembling sources, data, and other inputs to begin the building process.  Big-bang integration brings a big bang of integration issues and broken builds – “integration hell”, in industry parlance.  Cultural and management issues play into this as well: build teams are not empowered to exercise and implement build discipline with development teams, which in turn do not always act on feedback from the build team on broken builds in a timely manner, lengthening build cycle times.  Lack of exhaustive pre-commit testing in the development phase, either because development teams are not incentivized to unit test or because effective unit testing harnesses are not available in the first place, leads to “bad commits” and downstream integration issues, putting pressure on the build team.  Many build issues can be traced to just poor build process management.  For example, complex source code branching and componentization schemes complicate build tasks and make the build process error-prone.
Management of dependencies and build configuration frequently lacks sophistication, which leads to challenges in implementing incremental builds and, in turn, to build issues.  Inadequate build automation and infrastructure can lead to a host of issues as well.  For example, manual environment setup and data management complicate the build tasks, making building error-prone and lengthening cycle times.  Build infrastructure is often inadequate for executing complex from-scratch builds, which can take hours to complete, further lengthening build cycle times.
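The incremental builds mentioned above rest on a simple decision rule: rebuild a target only when it is missing or one of its inputs is newer. A minimal sketch of that check (real tools such as Make or Gradle extend it across transitive dependencies and content hashes):

```python
# Sketch of the incremental-build decision: skip a target whose inputs
# have not changed since it was last built. Timestamps are illustrative.

def needs_rebuild(target_mtime, input_mtimes):
    """True when the target was never built, or any input is newer than it."""
    if target_mtime is None:          # never built before
        return True
    return any(m > target_mtime for m in input_mtimes)

# Target built at t=100; sources last touched at t=90 and t=80: skip.
# If one source is later touched at t=120, the target must rebuild.
```

When dependency information is missing or wrong, a tool either rebuilds everything (slow from-scratch builds) or skips rebuilds it should have done (stale, broken outputs) – which is exactly why unsophisticated dependency management leads to the build issues described.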

Build management as part of CI aims to get around these challenges to streamline and turbocharge the build process, and ultimately improve the development process overall.  It begins by fundamentally changing the traditional development and delivery mindset.  Whereas the usual approach involves “software craftsmen” working independently to create perfect, fully functional modules that are then integrated over time, building in a CI environment espouses a much more agile approach in which team members come together to develop base product functionality as quickly as possible, incrementally building to deliver the full product over time.  Cultural change to drive close integration between development, testing/QA and build teams is key: in a continuous building environment, the development team works hand in hand with the build and testing teams, and a “buildmeister” has the authority to direct development teams to ensure successful build outcomes.  Successful continuous building starts with overhauling the development effort, enacting practices such as test-driven development and other precepts of Extreme Programming, such as feedback loops that encourage the development team to take ownership of successful testing and building.  Development teams check in and build often, sometimes several times a day.  And they follow strict practices around check-in, bug fixing and addressing broken builds.  A continuous building environment is characterized by the presence of a “build pipeline” – a conceptual structure that holds a series of builds, spanning the life cycle of the build development process, beginning with a developer’s private build all the way to a fully tested release build, each ready to be pulled by any team as needed.  To enable this, a CI server is used to automate and manage the build management process.
The CI server is a daemon process that continually monitors the source code repository for any updates and automatically processes builds to keep the build pipeline going.  The CI server allows builds to be pulled out of the pipeline by individual teams and for builds to be advanced through the pipeline as build promotion steps are successfully completed.  With each promotion step, the build becomes more robust and complete, and thus moves closer to being shipped to the customer.
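One polling cycle of such a CI server can be sketched as follows. The dictionary standing in for the repository and its commit stage is hypothetical; a real server would poll a source-control system (or receive webhooks) and shell out to the actual build and test scripts.

```python
# Sketch of the CI server's monitor-and-build loop: detect a new revision,
# run the commit stage, and on success promote the build into the pipeline.
# The repo dict is a stand-in for a real source-control interface.

def ci_cycle(repo, pipeline, last_seen):
    """One polling cycle; returns (new_last_seen, status)."""
    head = repo["head"]
    if head == last_seen:
        return last_seen, "idle"        # no new commits to build
    if repo["commit_stage"](head):      # compile + commit-stage tests
        pipeline.append(head)           # now a candidate for acceptance testing
        return head, "promoted"
    return head, "broken"               # feed back to developers immediately

pipeline = []
repo = {"head": "r2", "commit_stage": lambda rev: True}
state, status = ci_cycle(repo, pipeline, last_seen="r1")
```

Each successful promotion moves the build one stage along the pipeline; a broken commit stage stops it there, which is what keeps bad builds from ever reaching the acceptance and release stages.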

To achieve success with continuous building, practitioners recommend a set of best practices across build management, processes and tools.  A few key ones relate to software configuration and environment management: specifically, that development be managed from one “global” source code branch (or with minimal branching) and that all build components, including property and configuration files, be managed in a versioning system.  Then there are process-related best practices, which deal with development teams following pre-commit testing and acting on feedback for broken builds on a priority basis.  Automation is a key aspect of continuous building best practices as well: a CI server that manages builds and the build pipeline is a key component of the build automation infrastructure and is central to implementing continuous building.

Build management is a key part of CI, and one where a number of issues with traditional software development and delivery methodologies lie.  Achieving full CI, however, requires change to other key parts as well, for example testing and deployment management.  In future DevOps-related posts, we will look at these other aspects of CI.

The DevOps Movement

The DevOps movement has been resurgent in the past few years as companies look to improve their delivery capabilities to meet rapidly shifting market needs and business priorities.  Many have been preaching that companies should become not just robust and agile, but in fact “anti-fragile”, with the ability to expect failures and adapt to them.  The likes of Google, Amazon and Netflix embody this agile and anti-fragile philosophy, and traditional businesses facing increasingly uncertain and competitive markets want to take a page from their book and become agile and anti-fragile as well – and DevOps is high on their list as a means to achieve that.


DevOps is a loose constellation of philosophies, approaches, work practices, technologies and tactics to enable anti-fragility in the development and delivery of software and business systems.  In the DevOps world, traditional software development and delivery, with its craft and cottage-industry approaches, is turned on its head.  Software development is fraught with inherent risks and challenges, which DevOps confronts and embraces.  The concept seems exciting, a lot of companies are talking about it, some claim to do it, but nobody really understands how to do it!

Much of the available literature on DevOps talks about everything being continuous in the DevOps world: Continuous Integration, Continuous Delivery and Continuous Feedback.  Not only does this literature fail to address how the concept translates into reality, it also takes an overly simplistic view of the change involved: use Chef to automate your deployment, or use the Jenkins continuous integration server to do “continuous integration”.  To be fair, the concept of DevOps is still evolving.  However, much can be done to educate the common folk on the conceptual underpinnings of DevOps before jumping to the more mundane and mechanistic aspects.

DevOps is much more a methodology, process and cultural change than anything else.  The concept borrows heavily from existing manufacturing methodologies and practices such as Lean and Kanban, and extends existing thinking around lean software development to the enterprise.  Whereas the traditional software development approach is based on a “push” model, DevOps focuses on building a continuous delivery pipeline in which things are “pulled” actively by different teams as required, to keep the pipeline going at all times.  It takes agile development and delivery methodologies such as Scrum and XP and extends them into operations, enabling not just agile development but agile delivery as well.  And it attempts to transform the frequently cantankerous relationship between the traditionally separated groups of development and operations into a synergistic, mutually supportive one.  Even within the development sphere, DevOps aims to bring the various players – development, testing & QA, and build management – together by encouraging teams to take on responsibilities beyond their immediate role (e.g., development taking on more of testing) and by empowering traditionally relegated roles to positions of influence (e.g., the build manager taking developers to task for fixing broken builds).

We are still in the early days of the DevOps movement, and until we witness real-life references and case studies of how DevOps has been implemented end-to-end, learning about DevOps will be a bit of an academic exercise.  Having said that, some literature does come close to articulating what it means to put into practice such concepts as Continuous Delivery and Continuous Integration.  To the curious, I would recommend the Martin Fowler Signature Series of books on the two topics.  Although agonizingly technical, the two books do a good job of getting down to brass tacks.  My future posts on DevOps will be an attempt to synthesize some of the teachings from those books into management summaries.