DevOpsWest is run concurrently with AgileWest and had a good mix of international and domestic attendees, who wanted to get better at software. As one attendee from Florida said “I’m not working at Google or Facebook because I want to be in Florida, but I still want to be as sharp and nimble as them.” And the fact that it allowed people from London and India to visit Las Vegas seemed to be a selling point too. Continue reading “DevOps West – DevOps goes international!” »
The evolution of feature toggles to feature flags and what it holds for the future of software development.
History of the Toggle
Jez Humble and Martin Fowler are best known for promoting the separation of feature rollout from code deployment. Within the context of continuous delivery, they provided the foundation for a framework that would allow developers to release software faster and with less risk.
Fowler is well-known for championing the notion of feature toggles, which are ways to wrap features in conditionals that allow you to toggle features on and off for users. This would enable developers to take full control of their feature rollouts, dark launch, and roll back poorly performing features.
When it comes to releasing new features, there is nothing worse than deploying a feature that cripples your application, degrades performance, and turns away customers.
With the rise of continuous delivery, software teams are embracing faster, more iterative feature releases. It’s now imperative for teams to ensure their features will be well-received by customers and maintain their application’s performance.
This is why companies like Google, Facebook, and Amazon have embraced dark launching to ensure the efficacy of their feature releases and the stability of their app infrastructure.
Facebook’s engineering is legendary for its speed and execution. You too can be as quick and smart as Facebook, if you know their hacker engineering secret. Originally they lived by “Move Fast and Break Things”, which has now evolved with wisdom to “Move Fast With Stable Infra.” Speed is important, as is stability and providing a good experience to users.Facebook’s engineering Kent Beck wrote a great Facebook Note on how Facebook embraces reversibility to scale up. I highly recommend you read his entire post.
Facebook has a secret sauce: an in-house system called Gatekeeper that allows them to get quick feature feedback and quickly iterate based on feedback. Engineering changes are wrapped with a feature flag and pushed live to production. However, the features are live but off, then turned on via Gatekeeper to different users . Facebook’s seemingly simple system of separating deployment from rollout unlocks many powerful ways to move faster with more stability. All items in italics below are quotes from Kent Beck, followed by my analysis of how Facebook uses Gatekeeper.
Internal usage. Engineers can make a change, get feedback from thousands of employees using the change, and roll it back in an hour.
Initially, the engineer uses Gatekeeper to turn the feature on to internal users (only) . Interestingly, I’ve heard that Facebook is too large for changes to be effectively communicated EXCEPT by actually making the change. Instead of flurries of emails or blasts in chat rooms notifying other groups, Facebook engineers makes the code change and waits for impacted parties to notify them that something is broken, or fix their own dependencies. Separating changes from bigger releases with feature flags mean that any change can be rolled back at any time.
Staged rollout. We can begin deploying a change to a billion people and, if the metrics tank, take it back before problems affect most people using Facebook.
Staged rollout depends on feature flags to encapsulate a change and a feature flagging system (like Gatekeeper) to take it back.
Dynamic configuration. If an engineer has planned for it in the code, we can turn off an offending feature in production in seconds. Alternatively, we can dial features up and down in tiny increments (i.e. only 0.1% of people see the feature) to discover and avoid non-linear effects.
The key to turning features off in seconds (rather than hours or in best case, minutes) is “if the engineer has planned for it in the code”. By using feature flags to separate code deployment from functionality, Facebook can quickly kill malignant features. Without feature flags and Gatekeeper, Facebook would have to do a full redeployment.
Right hand side units. We can add a little bit of functionality to the website and turn it on and off in seconds, all without interfering with people’s primary interaction with NewsFeed.
Facebook smartly uses micro services and avoids monolithic code. Small changes in functionality, wrapped in feature flags, can quickly be toggled on and off using Gatekeeper.
Shadow production. We can experiment with new services under real load, from a tiny trickle to the whole flood, without affecting production.
Facebook pioneered dark launches, the ability to expose features to load without exposing them to users. I’ve heard that it’s impossible to simulate Facebook’s production load as it’s so large. Gatekeeper allows Facebook to control via feature flags load testing from user visibility.
Data-informed decisions. Data-informed decisions are inherently reversible. “We expect this feature to affect this metric. If it doesn’t, it’s gone.”
By wrapping a feature with a flag, it’s possible to isolate its effect on the system. Data-informed decision , tying an individual feature to metrics, is made possible by Gatekeeper and feature flags. Without feature flags, it’s impossible to see the impact of a change – if you release five features and twenty bug fixes at once, and engagement drops by 5%, what feature is to blame? Could one of the bug fixes actually have caused a 10% drop and one of the features a 15% gain? Only by separating out each change can true causation (not just correlation) be seen. Yammer also follows data-informed decision in its product development. Again, it’s necessary to have encapsulation of the feature to both have measurement as well as enable the rollback.
Advance countries. We can roll a feature out to a whole country, generate accurate feedback, and roll it back without affecting most of the people using Facebook.
Gatekeeper and feature flags, are enabling canary launches – using an entire country as “canary in a coal mine” to see if there are issues with a release. Rather than having a world-wide failure, Facebook can iterate quickly and rollback.
Soft launches. When we roll out a feature or application with a minimum of fanfare it can be pulled back with a minimum of public attention.
Facebook, after many misfires like Facebook Beacon, now follows Eric Ries (Don’t launch – separate out a marketing launch from a product launch). With feature flags, Facebook can get feedback from their own users, and control the story. Facebook has avoided the flameouts of Google, which has had epic failures with Google Wave, Google Buzz, and most recently Google Plus – all expensively launched, then expensively decommissioned. With feature flags and Gatekeeper, Facebook is always in control of who sees what when.
Want to be as smart as Facebook for developing software? Want to integrate reversibility, dark launches, data-informed decisions into your own development cycle? The smartest companies like Facebook, Medium, DropBox, and LinkedIn have in-house feature-flagging systems custom built for them. You can build your own system, or simply use LaunchDarkly, “Gatekeeper for everyone else”.
LAUNCHDARKLY HELPS YOU BUILD BETTER SOFTWARE FASTER WITH FEATURE FLAGS AS A SERVICE. START YOUR FREE TRIAL NOW.
We hosted the first Dark Launching meetup in May with a surprisingly large turnout. We’d originally planned the meetup to be a user group of our current LaunchDarkly users sharing how they were using LaunchDarkly for dark launches. We were very pleasantly surprised by how many people joined our meetup as they wanted to learn more about dark launching itself! Dark launching is a best practice used by Facebook to launch new features “dark”(off), then slowly light up (turn on) features for different users. A key component of dark launches is the ability to easily turn a feature off again if issues are found.
Why dark launch? Shouldn’t all releases always be throughly tested and production ready? No matter how good or thorough your QA and performance testing is, you will never find all issues before production. Dark launches are a recognition that the real world exists. Rather than exposing your release into the full light of day to get blasted, shouldn’t you control access yourself?
I started the Dark Launch meetup by asking for a show of hands from everyone who’d had a bad release. Every hand went up, way up. I then asked if anyone wanted to share their story of a bad release. Everyone put their hands back down. I think there’s too much shame attached to bad releases, when we should 1) admit that bad releases happen to good teams 2) learn from bad releases 3) put process like dark launches into place to help mitigate bad releases.
So I’ll share a tale of two releases – one where I used dark launching, and one where I should have used dark launching. At TripIt, our users emailed us their email travel confirmations. We’d parse the emails, extract the useful information like flight and hotel, and make them a beautiful trip itinerary. Our users loved us, we were the number #1 Travel app in the App Store. I was the product manager on a daring new feature – users would connect their gmail accounts directly to TripIt, skipping the step of manually forwarding itineraries. It was considered extremely risky to have people authorize us to scan their email inbox. So I found a group of frequent travelers who were willing to give us feedback. We pushed the feature “live” to production, but only granted the opted-in users access, as a dark launch.
Internally, we carefully monitored our speed scanning real world inboxes. I also followed up on all users on whether we were importing the right (or wrong) email items. Some early mistakes were being too aggressive with the word “ticket” and importing a Tiffany order, or a Turkish Airlines promotion. When we’d gotten enough real world feedback, we then truly launched, with a TechCrunch article. I was pretty excited both that we’d launched something to help our users as well as my picture making it into TechCrunch!
We didn’t dark launch another email feature at Tripit, with very bad results. A persistent user complaint to TripIt is we would import ALL travel confirmations, whether or not you were the actual traveler. We called this “Other People’s Travel” (OPT). For example, if my sister Margaret Harbaugh emailed me her flight information – bam! her trip would appear in my TripIt account. Our system recognized that there was a travel plan, but there was no way for our system to tell the difference between Margaret and Edith. Or was there? Our engineers implemented some logic to compare names on the account with names on the itinerary and if there wasn’t a close enough match, to not import the trip. Sounds easy – “Margaret” is clearly different than “Edith”, so let’s ship it, right? However, we quickly had a flood of complaints. We had over a million users using auto-import (the largest gmail authorized company worldwide) . It turns out that email itineraries often had very odd concatenations like MsEdith or EdithMs, etc, which TripIt was rejecting as not the same as “Edith”. We were skipping too many emails and our users were very unhappy. They’d relied on us to “automagically” work, and they had tolerated some false positive imports. Now we were ignoring may emails they expected us to import. We had to do an emergency patch and quickly revert the change. If we’d done a dark launch, we could have tested the change with a smaller batch of users and monitor their reaction. It was a lesson learned to me that dark launching is a powerful tool to ensure user satisfaction.
We’re looking forward to our next dark launching meetup to share more stories and lessons learned. You can join the Dark Launch meetup group here, and we hope to see you soon!
LAUNCHDARKLY HELPS YOU BUILD BETTER SOFTWARE FASTER WITH FEATURE FLAGS AS A SERVICE. START YOUR FREE TRIAL NOW.