A developer really just wants to live in the pull request world: a world where you can deploy your entire service from one environment, where everything is automated, works all the time, and you push every commit. If it fails in production, you can roll it all back. Imagine a world where you never have to leave GitHub. It’s where you do your change control, kick off all your workflows, have discussions, and store any other config that you need for downstream processes. After you commit to the main branch, this stuff’s just off on a conveyor belt, never to be seen again unless it’s broken.
- AWS Lambda
- AWS AMI
A DevOps transformation for the whole product
As part of Autodesk’s DevOps transformation, we wanted to make sure we were thinking about the developer experience, and treating continuous integration and continuous delivery (CI/CD) as building for the whole product. How do I coordinate the development and deployment of a feature that touches source code, my desktop application, several backend cloud services, and my mobile application? How do I also ensure the product documentation, API documentation, and other artifacts deploy with the software? How do I ensure our international customers get access to localized content at the same time as the English-speaking customers?
Everything has to go out together. The ever-changing market also forces everything to deploy at an ever-increasing frequency.
This was the transition that we embarked on five years ago.
- Start with a collaborative environment.
- Give teams shared platforms and tooling.
- Default to innersource and open access.
- Embrace open source tooling and best practices.
- Balance standardization and customization.
- Define, measure, and track DevOps success metrics.
Step one: better collaboration with unified tools
“Thou shalt have a shared environment that's collaborative, discoverable, reusable.” That’s what they told us at the beginning.
It’s been a five-year journey for us to get to the environment we have now. We were very siloed when we started. Imagine 200 teams and 4,000 engineers, with everyone doing their own thing, their own way, in the name of expediency. That is not so different from most enterprises out there; in fact, this scenario is all too common.
We started with a move to GitHub to get the collaboration we desired within engineering. Our guiding principle was that any employee had full access to all but a few of our 20,000 repositories. They were all visible, searchable, and reusable. We quickly needed to add Artifactory to provide similar capabilities for all our open source, third-party, and internally shared binary packages.
- 5,000 engineers across 200+ teams
- 20,000 repos and over 3M Git commands per day
- 5.5M artifacts with over 3M downloads per day
- 10,000 builds a day
We wanted to deploy more frequently, and wanted to reduce duplication of efforts where everyone was building their own CI/CD pipelines and had their own CI/CD tools. To keep collaboration and unification at the forefront, we decided to build a unified set of CI/CD pipelines on top of shared tools.
Shared CI/CD for mobile, desktop, and the cloud
When we started looking for a replacement CI/CD orchestration tool, Jenkins was the obvious answer: it was open source, had lots of plugins, and worked with GitHub. If there had been something like GitHub Actions four years ago, it might’ve been a different story. We wanted to have “pipelines-as-code,” something that was flexible and could be written, stored, and automatically kicked off in GitHub.
Our CI solution is used by just about everybody in engineering whether that be desktop, mobile, or cloud. We use Jenkins for orchestration, run security and functional static analysis as part of the unit test stage, then package and publish the deployable image to Artifactory. Now we’re ready for handoff to the CD portion.
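As a rough illustration of the stage order described above, here is a minimal Python sketch of a pipeline defined as code. It is not our Jenkins configuration: the `Stage` and `run_pipeline` names, the separate `build` stage, and the placeholder stage bodies are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One named pipeline step; `run` returns True on success."""
    name: str
    run: Callable[[], bool]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            break  # later stages never run after a failure
        completed.append(stage.name)
    return completed


# Hypothetical stage bodies: a real pipeline would invoke build tools,
# scanners, and an artifact repository client instead of returning True.
ci_stages = [
    Stage("build", lambda: True),
    Stage("unit-test-and-static-analysis", lambda: True),
    Stage("package", lambda: True),
    Stage("publish-to-artifact-repository", lambda: True),
]

if __name__ == "__main__":
    print(run_pipeline(ci_stages))
```

Because the stage list is ordinary data stored alongside the source, a change to the pipeline goes through the same pull request review as a change to the code.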
- Deployment frequency
- Lead time
- Mean time to recover
- Change failure rate
But as native orchestration solutions from Azure, AWS, and Google Cloud become more powerful, we expect CI orchestration to diversify too. We’re thinking through a few solutions to this now; again, GitHub Actions is one way to reunite CI pipelines while letting the underlying technologies diverge.
When considering CD, the infrastructures and technology stacks for deploying in desktop, cloud, and mobile are, by necessity, quite different, but our goal is to hide those differences under similar-looking CD pipelines. Our focus has been on CD for cloud, and we now have a fully automated reference CD pipeline that was originally based on Jenkins and has now moved to Spinnaker. It works seamlessly with the CI pipeline and through GitOps. Desktop and mobile CD pipelines are destined to go in the same direction but are currently works in progress.
Shared CI/CD for all product artifacts
Where we’ve diverged for CI/CD is documentation. Previously, documentation was written in the DITA format using a third-party tool that had zero connection to the source code in GitHub. Now teams write their documentation in Markdown and store it in GitHub. Any commit kicks off a fully automated CI/CD pipeline that converts the .md file to HTML and posts the content to Autodesk’s external documentation portal.
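To make the documentation flow concrete, the sketch below shows the two core steps in Python: pick the Markdown files out of a commit and convert each to HTML. It is a toy stand-in rather than our actual converter. It handles only `#` headings and plain paragraphs, and the function names and the `.md` to `.html` path mapping are assumptions for illustration.

```python
import html
from pathlib import PurePosixPath
from typing import List


def markdown_to_html(text: str) -> str:
    """Convert a tiny Markdown subset to HTML.

    Lines starting with '#' become headings; other non-blank lines are
    joined into paragraphs separated by blank lines.
    """
    blocks: List[str] = []
    paragraph: List[str] = []

    def flush() -> None:
        # Emit the paragraph collected so far, if any.
        if paragraph:
            blocks.append("<p>" + html.escape(" ".join(paragraph)) + "</p>")
            paragraph.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()
        elif stripped.startswith("#"):
            flush()
            hashes = len(stripped) - len(stripped.lstrip("#"))
            level = min(hashes, 6)  # HTML defines h1 through h6 only
            title = stripped.lstrip("#").strip()
            blocks.append(f"<h{level}>{html.escape(title)}</h{level}>")
        else:
            paragraph.append(stripped)
    flush()
    return "\n".join(blocks)


def changed_markdown(changed_files: List[str]) -> List[str]:
    """Keep only the Markdown files from a commit's changed-file list."""
    return [path for path in changed_files if path.endswith(".md")]


def output_path(source: str) -> str:
    """Map a source path to its published path (hypothetical layout)."""
    return str(PurePosixPath(source).with_suffix(".html"))
```

In a real pipeline, the changed-file list would come from the commit that triggered the run, and the generated HTML would be uploaded to the documentation portal rather than returned as a string.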
The writers now get all the benefits that software engineers take for granted when using GitHub. Plus, documentation is now more of a shared team experience since everyone on our team knows how to write Markdown and process a pull request.
We’re currently adding a similar pipeline to automate CI/CD for API documentation. We embed the documentation into the source code using reStructuredText (RST) and Swagger and publish the API documentation out to an external portal. In this way, the API docs can be automatically updated at the same time as the software.
The way that we handle localization as part of this “whole product” CI/CD solution is markedly different from what other enterprise companies do. Our business is global, and all of our products are translated into more than 12 languages. We have a very sophisticated localization platform and pipeline: any time you change your documentation or the localizable resources in your source code, we automatically translate that content into the required languages and push the localized content back into GitHub so that it can be built. This allows us to release new functionality simultaneously across all the languages needed. We still use localization service providers to ensure the translations are of a professional standard, but as machine translation improves, we become less dependent on these providers.
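One piece of such a pipeline is working out which strings need translating after a change. The Python sketch below shows one simple way to do that by comparing the old and new source-language resources. It is an assumption-laden illustration, not our localization platform: the function names, the key-to-text dictionary format, and the language codes are hypothetical.

```python
from typing import Dict, Iterable, List


def strings_needing_translation(
    old_source: Dict[str, str],
    new_source: Dict[str, str],
) -> List[str]:
    """Return the keys whose source text is new or changed, sorted."""
    return sorted(
        key for key, text in new_source.items() if old_source.get(key) != text
    )


def translation_jobs(
    changed_keys: List[str],
    languages: Iterable[str],
) -> Dict[str, List[str]]:
    """Queue the same set of changed keys for every target language."""
    return {language: list(changed_keys) for language in languages}


if __name__ == "__main__":
    before = {"open": "Open", "save": "Save"}
    after = {"open": "Open file", "save": "Save", "export": "Export"}
    changed = strings_needing_translation(before, after)
    print(translation_jobs(changed, ["de", "ja"]))
```

Translating only the changed keys keeps each run small, which matters when every commit can trigger the pipeline.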
Putting it together as a ‘whole product’ pipeline
This is CI/CD for the whole product.
Here, everything kicks off at once. Engineers, writers, and localization work in sync, within the same pipeline. What started as a shift towards sharing code and best practices is now a unified, automated workflow for the whole product.
Defining DevOps success and developer happiness
We’re working to improve the developer experience in two ways: we’re helping teams become elite DevOps performers, and giving them the best coding environment possible.
Today, we measure our DevOps performance using DORA performance metrics. We focus on the DORA “core four”: deployment frequency, lead time, mean time to recovery, and change failure rate.
We’ve built out a multi-step process to reach these goals. First, we measure and publish DORA metrics for over 200 of our cloud services. Second, we define best practices to help teams deploy faster, from once per sprint up to multiple times per day.
Lastly, we collect data from Jira, GitHub, Jenkins, Spinnaker, and our CI/CD workflows to create a set of pipeline insights. Teams can then use these insights to identify problem areas in their workflows. After they adjust their workflows, teams should see deployment frequency, lead time, and other metrics also improve.
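For readers who want to see how the four metrics fall out of raw data, here is a minimal Python sketch that computes them from a list of deployment records. It is one plausible set of definitions, not our internal tooling: the `Deployment` record shape, the choice of a median for lead time, and the function name are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    commit_time: datetime                      # when the change was committed
    deploy_time: datetime                      # when it reached production
    failed: bool = False                       # caused a production failure?
    restored_time: Optional[datetime] = None   # when service was restored


def dora_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    if period_days <= 0:
        raise ValueError("period_days must be positive")

    lead_times = [d.deploy_time - d.commit_time for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Only failures that have been resolved contribute to recovery time.
    recoveries = [
        d.restored_time - d.deploy_time
        for d in failures
        if d.restored_time is not None
    ]
    mean_recovery = (
        sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
    )
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recover": mean_recovery,
    }
```

In practice the records would be assembled from the tools named above, for example commit timestamps from GitHub and deployment timestamps from the CD system, before being passed to a function like this.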
We know we don’t live in a perfect developer world yet. There's complexity and there are problems, and most companies are not there either. Getting there is only possible with a GitOps approach and fully automated CI/CD pipelines that handle all the artifacts in your product. It also requires ways to use the data collected from tools and automation to identify what breaks, what takes you away from coding, and what’s getting in the way of future DevOps performance.
Between now and then, you need insights to figure out where and what needs to be optimized. Figure out what needs to be changed in your culture, processes, and tools so you can get to a world where you can forget about automation, and it all just works.