In which Etsy transforms its app release process by aligning it with its philosophy for web deploys
Deploying code should be easy. It should happen often, and it should involve its engineers. For Etsyweb, this looks like continuous deployment.
A group of engineers (which we call a push train) and a designated driver all shepherd their changes to a staging environment, and then to production. At each checkpoint along that journey, the members of the push train are responsible for testing their changes, sharing that they’re ready to ship, and making sure nothing broke. Everyone in that train must work together for the safe completion of their deployment. And this happens very frequently: up to 50 times a day.
TOPIC: clear
mittens> .join
TOPIC: mittens
sasha> .join with mittens
TOPIC: mittens + sasha
pushbot> mittens, sasha: You're up
TOPIC: mittens + sasha
sasha> .good
TOPIC: mittens + sasha*
mittens> .good
TOPIC: mittens* + sasha*
pushbot> mittens, sasha: Everyone is ready
TOPIC: mittens* + sasha*
nassim> .join
TOPIC: mittens* + sasha* | nassim
mittens> .at preprod
TOPIC: <preprod> mittens + sasha | nassim
mittens> .good
TOPIC: <preprod> mittens* + sasha | nassim
sasha> .good
TOPIC: <preprod> mittens* + sasha* | nassim
pushbot> mittens, sasha: Everyone is ready
TOPIC: <preprod> mittens* + sasha* | nassim
mittens> .at prod
TOPIC: <prod> mittens + sasha | nassim
mittens> .good
TOPIC: <prod> mittens* + sasha | nassim
asm> .join
TOPIC: <prod> mittens* + sasha | nassim + asm
sasha> .good
TOPIC: <prod> mittens* + sasha* | nassim + asm
asm> .nm
TOPIC: <prod> mittens* + sasha* | nassim
pushbot> mittens, sasha: Everyone is ready
TOPIC: <prod> mittens* + sasha* | nassim
mittens> .done
TOPIC: nassim
pushbot> nassim: You're up
TOPIC: nassim
lily> .join
TOPIC: nassim | lily
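To make the topic line in that transcript easier to read, here is a minimal sketch of a push train in Python. It is not pushbot's actual code: the `PushTrain` class, its method names, and the rule that a plain `.join` shares the last waiting car are assumptions inferred only from the transcript above.

```python
from dataclasses import dataclass, field


@dataclass
class PushTrain:
    """Toy model of the push queue; cars[0] is the group currently deploying."""
    cars: list = field(default_factory=list)  # each car is a list of member names
    ready: set = field(default_factory=set)   # members who have said .good
    stage: str = ""                           # "", "preprod", or "prod"

    def join(self, name, with_member=None):
        if with_member is not None:
            # ".join with X": ride in the same car as X.
            car = next(c for c in self.cars if with_member in c)
        elif len(self.cars) > 1:
            # Assumption: a plain .join shares the last waiting car.
            car = self.cars[-1]
        else:
            # Nobody is waiting yet, so start a new car at the back.
            car = []
            self.cars.append(car)
        car.append(name)

    def good(self, name):
        self.ready.add(name)

    def at(self, stage):
        # Moving to a new stage clears everyone's ready mark.
        self.stage = stage
        self.ready.clear()

    def leave(self, name):
        # ".nm": drop out of the queue and discard any car left empty.
        for car in self.cars:
            if name in car:
                car.remove(name)
        self.cars = [c for c in self.cars if c]
        self.ready.discard(name)

    def done(self):
        # The head car has finished; the next car is up.
        self.cars.pop(0)
        self.stage = ""
        self.ready.clear()

    def everyone_ready(self):
        return bool(self.cars) and all(m in self.ready for m in self.cars[0])

    def topic(self):
        if not self.cars:
            return "clear"
        groups = [
            " + ".join(m + "*" if m in self.ready else m for m in car)
            for car in self.cars
        ]
        prefix = f"<{self.stage}> " if self.stage else ""
        return prefix + " | ".join(groups)
```

Replaying the commands from the transcript against this model reproduces each topic line, which is a quick way to check that the reading of the notation (a star for ready, a pipe between cars, a stage in angle brackets) is right.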
This strategy has been successful for a lot of reasons, but especially because each deploy is handled by the people most familiar with the changes that are shipping. Those who wrote the code are in the best position to recognize it breaking, and then fix it. Because of that, developers should be empowered to deploy code as needed, and remain close to its rollout.
App releases are a different beast. They don’t easily adapt to that philosophy of deploying code. For one, they have versions and need to be compiled. And since they’re distributed via app stores, those versions can take time to reach end users. Traditionally, these traits have led to strategies involving release branches and release managers. Our app releases started out this way, but we learned quickly that they didn’t feel very Etsy. And so we set out to change them.
Jen and Sasha
We were the release managers. Jen managed the Sell on Etsy apps, and I managed the Etsy apps. We were responsible for all release stage transitions, maintaining the schedule, and managing all the communications around releases. We were also responsible for resolving conflicts and coordinating cross-team resources in cases of bugs and urgent blockers to release.
Ready to Ship
A key part of our job was making sure everyone knew what they were supposed to do and when they were supposed to do it. The biggest such checkpoint is when a release branches: this is when we create a dedicated branch for the release off master, and master becomes the next release. This is scheduled and determines what changes make it into production for a given release. It’s very important to make sure that those changes are expected, and that they have been tested.
For Jen and me, it would’ve been impossible to keep track of the many changes in a release ourselves, and so it was our job to coordinate with the engineers that made the actual changes and make sure those changes were expected and tested. In practice, this meant sending emails or messaging folks when approaching certain checkpoints like branching. And likewise, if there were any storm warnings (such as show-stopping bugs), it was our responsibility to raise the flag to notify others.
Then Jen left Etsy for another opportunity, and I became a single-point-of-failure and a gatekeeper. Every release decision was funneled through me, and I was the only person able to make and execute those decisions.
I was overwhelmed. Frustrated. I was worried I’d be stuck navigating iTunes Connect and Google Play, and sending emails. And frankly, I didn’t want to be doing those things. I wanted those things to be automated. Give me a button to upload to iTunes Connect, and another to begin staged rollout on Google Play. Thinking about the ease of deploying on web just filled me with envy.
This time wasn’t easy for engineers either. Even back when we had two release managers, from an engineer’s perspective, this period of app releases wasn’t transparent. It was difficult to know what phase of release we were in. A large number of emails was sent, but few of them were targeted to those that actually needed them. We would generically send emails to one big list that included all four of our apps. And all kinds of emails would get sent there. Things that were FYI-only, and also things that required urgent attention. We were on the path to alert-fatigue.
All of this meant that engineers felt more like they were in the cargo hold, rather than in the cockpit. But that just didn’t fit with how we do things for web. It didn’t fit with our philosophy for deployment. We didn’t like it. We wanted something better, something that placed engineers in front of the tiller.
So we built Ship, a vessel that coordinates the status, schedule, communications, and deploy tools for app releases. Here’s how Ship helps:
- Keeps track of who committed changes to a release
- Sends Slack messages and emails to the right people about the relevant events
- Manages the state and schedule of all releases
It’s hard to imagine all of that abstractly, so here’s an example:
- Alicia makes her first commit to the iOS app for v4.64.0.
- Ship gets notified of this and sends Alicia an email welcoming her to v4.64.0.
- A cron moves the release into “Testing” and generates a testing build of v4.64.0.
- Ship is notified of this and sends an email to Alicia with the build.
- Alicia installs the build, verifies her changes, and tells Ship she’s ready.
- Everyone has tested their changes and reported themselves as ready.
- A cron branches the release and creates a release candidate.
- Ship is notified and sends an email to coordinate final testing of the release.
- The final testing finds no show-stopping issues.
- A cron submits v4.64.0 to iTunes Connect for review.
- A cron checks iTunes Connect for the review status of this release, and updates Ship that it’s been approved.
- Ship emails Alicia and others letting them know the release is approved.
- A cron releases v4.64.0.
(Had Alicia committed to our Android app, a cron would instead begin staged rollout on Google Play.)
- Ship emails Alicia and others letting them know the release is out in production.
- Ship emails a report of top crashes to all the engineers in the release (including Alicia).
Before Ship, every step above would have been performed manually. But you’ll notice that release managers are missing from the above script; have we replaced release managers with all the automations in Ship?
Partially. Ship has a feature where each release is assigned a driver.
This driver is responsible for a bunch of things that we couldn’t or shouldn’t automate. Here’s what they’re responsible for:
- Schedule changes
- Shepherding ‘ready to ships’ from other engineers
- Investigating showstopping bugs before release
Everything else? That’s automated. Branching, release candidate generation, submission to iTunes Connect, even staged rollout on Google Play! But we’ve learned from automation going awry before. By default, some things are set to manual. There are others for which Ship explicitly does not allow automation, such as continuing staged rollout on Google Play. Things like this should involve and require human interaction. For everything else that is automated, we added a failsafe: at any time, a driver can disable all the crons and take over driving from autopilot.
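As a rough illustration of that failsafe, the sketch below shows the kind of check a scheduled job could make before acting. The `Release` fields, the action names, and the `cron_may_run` function are all hypothetical and are not taken from Ship.

```python
from dataclasses import dataclass, field

# Actions this sketch never runs from a cron, mirroring the staged-rollout example.
NEVER_AUTOMATED = {"continue_staged_rollout"}


@dataclass
class Release:
    version: str
    autopilot: bool = True                           # a driver can switch this off at any time
    manual_actions: set = field(default_factory=set) # actions left manual for this release


def cron_may_run(release: Release, action: str) -> bool:
    """Return True only if a scheduled job is allowed to perform the action."""
    if action in NEVER_AUTOMATED:
        return False
    if action in release.manual_actions:
        return False
    return release.autopilot
```

Keeping the decision in one function means every cron asks the same question, so turning autopilot off stops all of them at once.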
When a driver wants to do something manually, they don’t need access to iTunes Connect or Google Play, as each of these things is made accessible as a button. A really nice side effect of this is that we don’t have to worry about provisioning folks for either app store, and we have a clear log of every release-related action taken by drivers.
Drivers are assigned once a release moves onto master, and are semi-randomly selected based on previous drivers and engineers that have committed to previous releases. Once assigned, we send them an onboarding email letting them know what their responsibilities are.
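The selection rule is only described loosely, so the following is one possible reading rather than Ship's real logic: pick at random from engineers who committed to earlier releases, skipping anyone who drove recently. The `pick_driver` function and its parameters are hypothetical.

```python
import random


def pick_driver(past_committers, recent_drivers, rng=random):
    """Pick a driver from past committers, skipping anyone who drove recently."""
    candidates = sorted(set(past_committers) - set(recent_drivers))
    if not candidates:
        # Everyone has driven recently, so fall back to the whole pool.
        candidates = sorted(set(past_committers))
    return rng.choice(candidates)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is handy when testing.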
Ready to Ship Again
The driver can remain mostly dormant until the day of branching. A couple hours before we branch, it’s the driver’s responsibility to make sure that all the impacting engineers are ready to ship, and to orchestrate efforts when they’re not. After we’re ready, the driver’s responsibility is to remain available as a point-of-contact while final testing takes place. If an issue comes up, the driver may be consulted for steps to resolve.
And then, assuming all goes well, comes release day. The driver can opt to manually release, or let the cron do this for them — they’ll get notified if something goes wrong, either way. Then a day after we release, the driver looks at all of our dashboards, logs, and graphs to confirm the health of the release.
But not all releases are planned. Things fail, and that’s expected. It’s naïve to assume some serious bug won’t ship with an app release. There are plenty of things that can and will be the subject of a post-mortem. When one of those things happens, any engineer can spawn a bugfix release off the most-recently-released mainline release.
The engineer that requests this bugfix gets assigned as the driver for that release. Once they branch the release, they make the necessary bugfixes (others can join in to add bugfixes too, if they coordinate with the driver) in the release’s branch, build a release candidate, test it, and get it ready for production. The driver can then release it at will.
Releases are actually quite complicated.
A release starts off as an abstract thing that will occur in the future. It then becomes a concrete thing, actively collecting changes via commits on master in git. After this period of collecting commits, the release is considered complete and moves into its own dedicated branch. The release candidate is then built from this dedicated branch, thoroughly tested, and moved into production. The release itself then concludes as an unmerged branch.
Once a release branches, the next future release moves onto master. Each release is its own state machine, where the development and branching states overlap between successive releases.
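A small sketch can make the overlapping state machines concrete. The state names below are hypothetical labels for the phases described in this post (they follow the iOS walkthrough, which includes a review step), and the real set of states in Ship may differ.

```python
from enum import Enum


class State(Enum):
    PLANNED = "planned"          # an abstract release in the future
    DEVELOPMENT = "development"  # collecting commits on master
    TESTING = "testing"          # testing builds are generated from master
    BRANCHED = "branched"        # release candidate built from its own branch
    IN_REVIEW = "in_review"      # submitted to the app store
    APPROVED = "approved"
    RELEASED = "released"


ORDER = list(State)  # Enum members iterate in definition order


def advance(state: State) -> State:
    """Move one release to its next state."""
    if state is State.RELEASED:
        raise ValueError("a released version has no next state")
    return ORDER[ORDER.index(state) + 1]


def branch(current: State, upcoming: State):
    """Branch the current release and move the next one onto master."""
    if current is not State.TESTING:
        raise ValueError("only a release in testing can branch")
    if upcoming is not State.PLANNED:
        raise ValueError("the next release must still be planned")
    return State.BRANCHED, State.DEVELOPMENT
```

The `branch` function is where the overlap shows up: a single event changes the state of two releases at once.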
Notifications: Slack and Email
Plugged into the output of Ship are notifications. Because there are so many points of interest en route to production, it’s really important that the right people are notified at the right times. So we use the state machine of Ship to send out notifications to engineers (and other subscribers) based on how much they asked to know, and how they impacted the release. We also allow anyone to sign up for notifications around a release. This is used by product managers, designers, support teams, engineering managers, and more. Our communications are very targeted to those that need or want them.
In terms of what they asked to know, we made it very simple to get detailed emails about state changes to a release.
In terms of how they impacted the release, we need to get that data from somewhere else.
We mentioned data Ship receives from outside sources. At Etsy, we use GitHub for our source control. Our apps have repos per-platform (Android and iOS). In order to keep Ship’s knowledge of releases up-to-date, we set up GitHub Webhooks to notify Ship whenever changes are pushed to the repo. We listen for two changes in particular: pushes to master, and pushes to any release branch.
When Ship gets notified, it iterates through the commits and uses the author, changed paths, and commit message to determine which app (buyer or seller) the commit affects, and which release we should attribute this change to. Ship then takes all of that and combines it into a state that represents every engineer’s impact on a given release. Is that engineer “user-impacting” or “dark” (our term for changes that aren’t live)? Ship then uses this state to determine who is a member of what release, and who should get notified about what events.
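The attribution step might be sketched as follows. The payload fields used here (`ref`, `commits`, and each commit's `author`, `message`, `added`, `modified`, `removed`) are part of GitHub's push webhook payload. Everything else is an assumption made for illustration: the path prefixes, the `release/` branch naming, the `[dark]` commit-message marker, and the `attribute_push` function itself.

```python
# Hypothetical path prefixes; the real repo layout is not described here.
APP_PATHS = {"buyer": "apps/buyer/", "seller": "apps/seller/"}
RELEASE_PREFIX = "release/"  # hypothetical release-branch naming
DARK_MARKER = "[dark]"       # hypothetical commit-message convention


def attribute_push(payload, master_version):
    """Turn a GitHub push payload into per-commit attribution records."""
    prefix = "refs/heads/"
    ref = payload["ref"]
    if not ref.startswith(prefix):
        return []  # tags and other refs are ignored
    branch = ref[len(prefix):]
    if branch == "master":
        version = master_version
    elif branch.startswith(RELEASE_PREFIX):
        version = branch[len(RELEASE_PREFIX):]
    else:
        return []  # only master and release branches matter

    records = []
    for commit in payload["commits"]:
        paths = commit["added"] + commit["modified"] + commit["removed"]
        # A commit touches an app if any changed path sits under its prefix.
        apps = sorted(
            app for app, path_prefix in APP_PATHS.items()
            if any(p.startswith(path_prefix) for p in paths)
        )
        impact = "dark" if DARK_MARKER in commit["message"] else "user-impacting"
        records.append({
            "engineer": commit["author"]["email"],
            "version": version,
            "apps": apps,
            "impact": impact,
        })
    return records
```

Folding these records per engineer and per release would give the membership state described above.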
Additionally, at any point during a release, an engineer can change their status. They may want to do this if they want to receive more information about a release, or if Ship misunderstood one of their commits as being impacting to the release.
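Putting the two inputs together (what someone asked to know and how they impacted the release), notification routing could look something like this. The event names and the rules for each event are assumptions chosen for illustration, not Ship's actual policy.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    impact: str = "none"    # "user-impacting", "dark", or "none" for pure subscribers
    detailed: bool = False  # opted in to an email for every state change


def recipients(members, event):
    """Return the names to notify for a hypothetical event category."""
    if event == "ready_to_ship_reminder":
        # Only people with user-facing changes need to confirm readiness.
        return [m.name for m in members if m.impact == "user-impacting"]
    if event == "state_change":
        return [m.name for m in members if m.detailed]
    if event == "released":
        # Everyone attached to the release hears that it shipped.
        return [m.name for m in members]
    raise ValueError(f"unknown event: {event}")
```

An engineer changing their own status amounts to editing `impact` or `detailed` on their record, after which the same routing applies.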
Everything up until now has explained how Ship keeps track of things. But there’s been no explanation for how some of the automated actions affecting the app repo or things outside Etsy occur.
We have a home-grown tool for managing deploys called Deployinator, and we added app support to it. It can now perform mutating interactions with the app repos, as well as all the deploy actions related to Google Play and iTunes Connect. This is where we build the testing candidates and the release candidate, branch the release, submit to iTunes Connect, and much more.
We opted to use Deployinator for a number of reasons:
- Etsy engineers are already familiar with it
- It’s our go-to environment for wrapping up a build process into a button
- Good for things that need individual run logs, and clear failures
In our custom stack, we have crons. This is how we branch on Tuesday evening (assuming everyone is ready). This is where we interface with Google Play and iTunes Connect. We make use of Google Play’s official API in a custom Python module we wrote, and for iTunes Connect we use Spaceship to interface with the unofficial API.
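The check behind the Tuesday-evening branch cron could be as small as the sketch below. The cutoff hour and the shape of the readiness data are assumptions; only the "Tuesday evening, if everyone is ready" rule comes from the text.

```python
from datetime import datetime

BRANCH_WEEKDAY = 1  # datetime.weekday() counts Monday as 0, so 1 is Tuesday
BRANCH_HOUR = 18    # assumed meaning of "evening"


def should_branch(now: datetime, ready_by_engineer: dict) -> bool:
    """Branch only on schedule and only when every impacting engineer is ready."""
    on_schedule = now.weekday() == BRANCH_WEEKDAY and now.hour >= BRANCH_HOUR
    return on_schedule and all(ready_by_engineer.values())
```

If the check fails because someone is not ready, nothing happens automatically, which leaves the decision with the driver.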
The end result of Ship is that we’ve distributed release management. Etsy no longer has any dedicated release managers. But it does have an engineer who used to be one — and I even get to drive a release every now and then.
People cannot be fully automated away. That applies to our web deploys, and is equally true for app releases. Our new process works within that reality. It’s unique because it pushes the limit of what we thought could be automated. Yet, at the same time, it empowers our app engineers more than ever before. Engineers control when a release goes to prod. Engineers decide if we’re ready to branch. Engineers hit the buttons.
And that’s what Ship is really about. It empowers our engineers to deliver the best apps for our users. Ship puts engineers at the helm.