19 Apr

Why Leading Companies Dark Launch

LaunchDarkly Dark Launch

When it comes to releasing new features, there is nothing worse than deploying a feature that cripples your application, degrades performance, and turns away customers.

With the rise of continuous delivery, software teams are embracing faster, more iterative feature releases.  It’s now imperative for teams to ensure their features will be well-received by customers and maintain their application’s performance.

This is why companies like Google, Facebook, and Amazon have embraced dark launching to ensure the efficacy of their feature releases and the stability of their app infrastructure.


07 Nov

Feature Flag-Driven Development

This article provides a broad and comprehensive overview of feature flag driven development, from gradual rollouts to A/B testing.

A Typical Afternoon

It’s an uneventful Friday afternoon.  You’re ready to head home, hit the gym, and spend time with the family.  The last thing you’d want to do is deploy some new features to your users, right?  That would be crazy talk.  “Let’s just wait for Monday… it’s too risky… what if things break?”  Deploying on Friday would inevitably mean working the weekend, but you’re brave, you’re the Zeus of the programming world, and you deploy it anyway.

In fact, you’re smiling and cranking up the music as you confidently stroll out of the office to the bus stop.  “Naive!” some would say, but you keep up the pace with a hop in your step.

A few hours later, you’re lounging on the couch, proudly sunk into the cushy leather.  “Bzz Bzz” goes your phone as it rattles in your pocket.  You ignore it, but the buzzing won’t stop.  With an irked look, you begrudgingly grab your phone and see a slew of error messages from your tracker.  Oh no!  This is it.  The deploy failed.  The gamble has turned into a nightmare… Right?  Nope.

Without much thought and a timid roll of the eyes, you pull up LaunchDarkly and flip a switch.  The errors stop and your only annoyance is the 30 seconds missed of your show.

This is the power of feature flags.  On that Friday afternoon, you used LaunchDarkly to deploy a feature to 1% of your users, warmly wrapped in a feature flag.  You wanted to test to see how it performed.  If something bad happened, no problem.  You knew you could just flip the switch and the feature would be rolled back with only 1% of your users experiencing a few seconds of inconvenience.  This is just one of hundreds of use cases showcasing the power of feature flags.

Feature Flags Explained

Feature flags/toggles/controls are basically ways to control the full lifecycle of your features.  They allow you to manage components and compartmentalize risk.  You can do pretty cool things like roll out features to certain users, exclude groups from seeing a feature, A/B test, and much more.  Check out this video on canary launching to see the benefits of dark launching features.

How They Work

When a user loads a page, your application will use that user’s attributes to determine what features to show.  For example, if I am a BETA user and I log in to myexamplesite.com, I will see the brand new BETA feature.  However, all non-BETA users will just see the old feature.   The reason I see the BETA feature is that my user is grouped as BETA.
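As a rough illustration of the idea above, here is a minimal Python sketch of a group-based flag check. The names `User`, `FLAG_GROUPS`, `is_enabled`, and `render_page`, and the flag key `new-dashboard`, are all hypothetical; this is not the LaunchDarkly SDK, just one way the check could look.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    key: str
    groups: frozenset = field(default_factory=frozenset)


# Hypothetical flag configuration: flag name -> groups that see the feature.
FLAG_GROUPS = {"new-dashboard": {"BETA"}}


def is_enabled(flag: str, user: User) -> bool:
    """Return True if the user belongs to any group targeted by the flag."""
    allowed = FLAG_GROUPS.get(flag, set())
    return bool(allowed & user.groups)


def render_page(user: User) -> str:
    # The flag wraps the new code path; everyone else keeps the old one.
    if is_enabled("new-dashboard", user):
        return "BETA feature"
    return "old feature"
```

A user created with `User("me", frozenset({"BETA"}))` gets the new code path, while a user with no groups falls through to the old one.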


While there are many open source solutions, we can dig deeper into LaunchDarkly’s SaaS platform for feature flags.

In this example below, you will see an explanation of multivariate feature flags.  Multivariate means that you can serve multiple variations of a feature to different user segments.  For example, let’s say I have a purple, orange, and green landing page.  I can select which individual users or groups of users I would like to see each variation.

LaunchDarkly Multivariate Feature Flags

On the LaunchDarkly side of things, you would do the following:

  1. Create a feature flag called “Landing Page”
  2. Name three variations “Purple” “Orange” “Green”
  3. Select which users you want to receive each variation (you can also serve to percentages: ‘60% of all users get the purple variation’)

On your side of things, you would do the following:

  1. Add the LaunchDarkly SDK to your platform
  2. Wrap your feature in a feature flag
  3. Call the SDK method to receive the value for that flag

It’s as simple as that.   You can check out the full documentation here.
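To make the steps above concrete, here is a small in-memory sketch of a multivariate flag in Python. It does not use the LaunchDarkly SDK; the class `MultivariateFlag`, its methods, and the user keys are assumptions made up for illustration, and the rule that individual targets beat group targets is a design choice of this sketch.

```python
from typing import Dict, List, Optional


class MultivariateFlag:
    """A flag that serves one of several named variations."""

    def __init__(self, name: str, variations: List[str], default: str) -> None:
        if default not in variations:
            raise ValueError(f"unknown default variation: {default}")
        self.name = name
        self.variations = list(variations)
        self.default = default
        self._by_user: Dict[str, str] = {}
        self._by_group: Dict[str, str] = {}

    def _check(self, variation: str) -> None:
        if variation not in self.variations:
            raise ValueError(f"unknown variation: {variation}")

    def target_user(self, user_key: str, variation: str) -> None:
        self._check(variation)
        self._by_user[user_key] = variation

    def target_group(self, group: str, variation: str) -> None:
        self._check(variation)
        self._by_group[group] = variation

    def variation(self, user_key: str, group: Optional[str] = None) -> str:
        """Individual targets win over group targets, then the default."""
        if user_key in self._by_user:
            return self._by_user[user_key]
        if group is not None and group in self._by_group:
            return self._by_group[group]
        return self.default


# The "Landing Page" flag from the steps above, with hypothetical targets.
landing_page = MultivariateFlag(
    "Landing Page", ["Purple", "Orange", "Green"], default="Purple"
)
landing_page.target_user("alice", "Green")
landing_page.target_group("beta", "Orange")
```

Application code would then branch on `landing_page.variation(user_key, group)` to pick which landing page to render.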

Targeting Users

Above, we briefly covered how to target individual users and groups.  Let’s take a deeper look into why this is important.  Targeting gives you the power to personalize a user’s experience.

Imagine the ability to create a customized and rewarding experience for every user.  Here are a few notable use cases:

  • Plan Management (normal vs. premium) –  You can launch targeted features to users on different plans.  Want to add a new feature for a premium user?  Sure!  Just wrap the new feature in a flag and turn it on for premium users.  Want to extend that feature to normal users eventually?  No problem!  Just add normal users when you’re ready.
  • Early Access – Allow only opt-in or power users to experience the latest features.
  • Block Users – Exclude users or groups who you do not want to see a new feature.
  • And many more.
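The use cases above can be sketched as an ordered list of targeting rules. This Python example is illustrative only: the `User` fields, the `BLOCKED_KEYS` and `ENABLED_PLANS` sets, and the rule priority (blocks first, then early access, then plan) are assumptions, not how any particular product evaluates rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    key: str
    plan: str = "normal"         # "normal" or "premium"
    early_access: bool = False   # user opted in to early features


# Hypothetical targeting configuration for a single flag.
BLOCKED_KEYS = {"user-666"}
ENABLED_PLANS = {"premium"}


def sees_feature(user: User) -> bool:
    """Evaluate targeting rules in priority order; the first match wins."""
    if user.key in BLOCKED_KEYS:   # Block Users: exclusion beats everything
        return False
    if user.early_access:          # Early Access: opted-in users
        return True
    return user.plan in ENABLED_PLANS  # Plan Management
```

Extending the feature to normal users later is a configuration change (`ENABLED_PLANS.add("normal")`), not a code change.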

Managing Rollouts

If you’re deploying a brand new set of features, launching them to 100% of your users at once is a risky business.  In fact, testing things by giving all of your users access isn’t really a test.  A test should be the process of receiving incremental feedback from your users, making improvements, and gradually expanding your release to everyone.  This is where LaunchDarkly’s rollouts come in.  If you want to launch a new feature, you can start by rolling it out to 1% of your users, then 5%, then 20%, then 50%, then 100%.  If something goes wrong at the 1% rollout, you can instantly roll it back, make the improvements, and test it again.


This is the process of canary launching, whereby you test the efficacy of a new set of features before releasing it to everyone.  It also allows you to test how your features behave at different levels of traffic and incrementally refine your infrastructure to support the deployment.
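One common way to implement a percentage rollout is to hash each user into a stable bucket, so the same user always gets the same answer and raising the percentage only ever adds users. The sketch below assumes that approach; it is not a description of LaunchDarkly’s actual algorithm, and the function names `bucket` and `in_rollout` are hypothetical.

```python
import hashlib


def bucket(flag_key: str, user_key: str) -> float:
    """Map a (flag, user) pair to a stable value in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash, reduced to hundredths of a percent.
    return (int(digest[:8], 16) % 10_000) / 100.0


def in_rollout(flag_key: str, user_key: str, percentage: float) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_key) < percentage
```

Because the bucket depends only on the flag and user keys, everyone who saw the feature at 1% still sees it at 5%, and setting the percentage to 0 rolls the feature back for all users.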

Flag Driven Development

Feature flags/toggles/controls bring a test-and-iterate loop, similar in spirit to test-driven development (TDD), to the release process.  You release and iterate on features quickly, test those features, and make improvements.  Think of it as Lean UX methodology.  You release light features to receive market feedback.  You iterate on that feedback, make improvements, and redeploy.

Think of feature flag-driven development as a way to receive iterative market feedback on your product, rather than solely depend on isolated customer feedback.  It’s a way to test how your features perform in the real world and not just in an artificial test environment.

Feature Flag Driven Development - Waterfall Agile

In the world of waterfall development, you will typically see one continuous build that culminates in a single deploy.  After this deploy, you’ll receive feedback and fix some bugs, but you will likely need to restart the process for any major feature releases.

Agile is a bit more forgiving.  You can iteratively test small releases to your users, but this is best performed in a staging environment.  You typically will not release your product to market throughout the agile development process, as most of your testing will be internal and controlled.

Finally, lean UX codifies the process of releasing features to market throughout the development process.  These releases will likely be smaller in scale, but you’ll receive immediate market feedback. When you introduce feature flags into the equation, the process becomes even more efficient.

Continuous Delivery via Feature Flag Driven Development

Feature flags allow you to substantially mitigate the risk of releasing immature functionality.  If a release is having a negative impact, roll it back.  If it’s doing well, keep rolling it out.   This is like having a persistent undo button and a means to recalibrate and improve functionality.

More importantly, you can institutionalize this process within your development cycle.  Your team will develop a cadence for lean releases, where all new components and functionality are wrapped in feature flags.  You can easily test features, cultivate creativity, and reward bold advances – all without compromising the integrity of your platform.


This new development methodology also allows your marketing, design, and engineering teams to collaborate more frequently and more effectively.  With an agile approach, you will typically have one large planning cycle that will launch you into development.  You will then test your iterations on local groups or in a local environment that tries to simulate production.  However, you cannot substitute real market feedback.

Feature flag driven development allows you to quickly release iterations of your features to market, receive feedback, improve, and redeploy.  It allows you to roll out features to small segments of your users in order to mitigate risk all while receiving valuable feedback.  More importantly, your team will converge and collaborate based on real market feedback and make the necessary improvements to drive the product forward.

Feature Flag Driven A/B Testing

A/B testing is the practice of comparing different versions of a page to see which one performs better.  Traditionally, A/B testing has been used mainly for cosmetic changes.  These include layouts, element position, colors, and copy.   Typically, A/B testing is tied to a goal.  For example, you want to increase sign-up conversions, so you use tools like Optimizely, Visual Website Optimizer, and Apptimize to test different layouts, buttons, and call-to-action language.  These tools work great, but what if you wanted to test backend-level functionality, completely new features, and sign-up flows?

This is where feature flag driven A/B testing comes into play.

LaunchDarkly Feature Flag A/B Testing

Because feature flags are implemented at the code level, you can control deep functional features and then target user segments.   For example, let’s say I want to test a new sign-up flow and welcome tutorial (see above). I can flag the new functional components so that only certain users will receive the new flow.

With a suite like LaunchDarkly, you can then analyze these feature tests using your Optimizely or New Relic goals.  This will allow you to see, for example, which sign-up flow is generating better conversions or which checkout flow is generating more revenue.
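The bookkeeping behind such a comparison can be sketched in a few lines of Python. This is a simplified, hypothetical `Experiment` class that counts exposures and conversions per variation; it does not call Optimizely, New Relic, or LaunchDarkly, and it ignores statistical significance entirely.

```python
from collections import defaultdict


class Experiment:
    """Count exposures and conversions for each variation of a flag."""

    def __init__(self) -> None:
        self.exposures = defaultdict(int)
        self.conversions = defaultdict(int)

    def record_exposure(self, variation: str) -> None:
        # Called when a user is served a given sign-up flow.
        self.exposures[variation] += 1

    def record_conversion(self, variation: str) -> None:
        # Called when that user completes the goal (for example, signs up).
        self.conversions[variation] += 1

    def conversion_rate(self, variation: str) -> float:
        """Conversions divided by exposures, or 0.0 with no exposures."""
        shown = self.exposures.get(variation, 0)
        if shown == 0:
            return 0.0
        return self.conversions.get(variation, 0) / shown
```

Comparing `conversion_rate("new-flow")` with `conversion_rate("old-flow")` gives a first signal; a real analysis would also check sample size before declaring a winner.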

All in all, feature flag driven A/B testing enables companies to test robust functionality instead of just cosmetic changes.


11 Aug

Secret to Facebook’s Hacker Engineering Culture

Facebook’s engineering is legendary for its speed and execution. You too can be as quick and smart as Facebook, if you know their hacker engineering secret. Originally they lived by “Move Fast and Break Things”, which has now evolved with wisdom to “Move Fast With Stable Infra.” Speed is important, as are stability and providing a good experience to users. Facebook engineer Kent Beck wrote a great Facebook Note on how Facebook embraces reversibility to scale up. I highly recommend you read his entire post.

Facebook has a secret sauce: an in-house system called Gatekeeper that allows them to get quick feature feedback and quickly iterate based on that feedback. Engineering changes are wrapped with a feature flag and pushed live to production. The features are deployed but switched off, then turned on via Gatekeeper for different users. Facebook’s seemingly simple system of separating deployment from rollout unlocks many powerful ways to move faster with more stability. All items in italics below are quotes from Kent Beck, followed by my analysis of how Facebook uses Gatekeeper.

Internal usage. Engineers can make a change, get feedback from thousands of employees using the change, and roll it back in an hour.

Initially, the engineer uses Gatekeeper to turn the feature on for internal users only. Interestingly, I’ve heard that Facebook is too large for changes to be effectively communicated EXCEPT by actually making the change. Instead of flurries of emails or blasts in chat rooms notifying other groups, Facebook engineers make the code change and wait for impacted parties to notify them that something is broken, or to fix their own dependencies. Separating changes from bigger releases with feature flags means that any change can be rolled back at any time.

Staged rollout. We can begin deploying a change to a billion people and, if the metrics tank, take it back before problems affect most people using Facebook.

Staged rollout depends on feature flags to encapsulate a change and a feature flagging system (like Gatekeeper) to take it back.

Dynamic configuration. If an engineer has planned for it in the code, we can turn off an offending feature in production in seconds. Alternatively, we can dial features up and down in tiny increments (i.e. only 0.1% of people see the feature) to discover and avoid non-linear effects.

The key to turning features off in seconds (rather than hours or, in the best case, minutes) is “if an engineer has planned for it in the code”. By using feature flags to separate code deployment from functionality, Facebook can quickly kill malignant features. Without feature flags and Gatekeeper, Facebook would have to do a full redeployment.
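“Planned for it in the code” can be as simple as reading a flag at request time instead of baking the decision into the deploy. The Python sketch below uses a hypothetical in-process `FlagStore`; it is not Gatekeeper, whose internals are not described here, and a real system would sync the store from a central service.

```python
import threading


class FlagStore:
    """A tiny thread-safe, in-memory store of on/off flags."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._flags = {}

    def set(self, name: str, enabled: bool) -> None:
        with self._lock:
            self._flags[name] = enabled

    def is_on(self, name: str, default: bool = False) -> bool:
        # Unknown flags fall back to the default, so new code ships "off".
        with self._lock:
            return self._flags.get(name, default)


def right_hand_unit(store: FlagStore) -> str:
    # The flag is checked on every call, so flipping it takes effect
    # immediately, with no redeployment.
    if store.is_on("new-sidebar-unit"):
        return "new unit"
    return "no unit"
```

Calling `store.set("new-sidebar-unit", False)` acts as the kill switch: the next request takes the old path.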

Right hand side units. We can add a little bit of functionality to the website and turn it on and off in seconds, all without interfering with people’s primary interaction with NewsFeed.

Facebook smartly uses microservices and avoids monolithic code. Small changes in functionality, wrapped in feature flags, can quickly be toggled on and off using Gatekeeper.

Shadow production. We can experiment with new services under real load, from a tiny trickle to the whole flood, without affecting production.

Facebook pioneered dark launches, the ability to expose features to load without exposing them to users. I’ve heard that it’s impossible to simulate Facebook’s production load as it’s so large. Gatekeeper allows Facebook to use feature flags to separate load testing from user visibility.
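A dark launch can be sketched as a request handler that always answers from the existing service and, when a flag is on, also sends the same request to the new service and throws the result away. Everything in this Python example (`handle_request`, the `stats` counters, the service callables) is hypothetical and only illustrates the pattern; it says nothing about how Facebook implements it.

```python
from typing import Callable, Dict


def handle_request(
    query: str,
    old_service: Callable[[str], str],
    new_service: Callable[[str], str],
    shadow_enabled: bool,
    stats: Dict[str, int],
) -> str:
    """Serve from the old service; optionally exercise the new one in the dark."""
    response = old_service(query)
    if shadow_enabled:
        try:
            new_service(query)  # result is discarded; users never see it
            stats["shadow_ok"] = stats.get("shadow_ok", 0) + 1
        except Exception:
            # A failing shadow call must never affect the user-facing response.
            stats["shadow_errors"] = stats.get("shadow_errors", 0) + 1
    return response
```

In production the shadow call would usually run asynchronously so it cannot add latency; it is kept synchronous here to keep the sketch short.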

Data-informed decisions. Data-informed decisions are inherently reversible. “We expect this feature to affect this metric. If it doesn’t, it’s gone.”

By wrapping a feature with a flag, it’s possible to isolate its effect on the system. Data-informed decision making, which ties an individual feature to metrics, is made possible by Gatekeeper and feature flags. Without feature flags, it’s impossible to see the impact of a change: if you release five features and twenty bug fixes at once, and engagement drops by 5%, which feature is to blame? Could one of the bug fixes actually have caused a 10% drop and one of the features a 15% gain? Only by separating out each change can true causation (not just correlation) be seen. Yammer also follows data-informed decision making in its product development. Again, the feature must be encapsulated both to measure it and to enable the rollback.

Advance countries. We can roll a feature out to a whole country, generate accurate feedback, and roll it back without affecting most of the people using Facebook.

Gatekeeper and feature flags enable canary launches – using an entire country as a “canary in a coal mine” to see if there are issues with a release. Rather than having a worldwide failure, Facebook can iterate quickly and roll back.

Soft launches. When we roll out a feature or application with a minimum of fanfare it can be pulled back with a minimum of public attention.

Facebook, after many misfires like Facebook Beacon, now follows Eric Ries (Don’t launch – separate out a marketing launch from a product launch). With feature flags, Facebook can get feedback from their own users, and control the story. Facebook has avoided the flameouts of Google, which has had epic failures with Google Wave, Google Buzz, and most recently Google Plus – all expensively launched, then expensively decommissioned. With feature flags and Gatekeeper, Facebook is always in control of who sees what when.

Want to be as smart as Facebook at developing software? Want to integrate reversibility, dark launches, and data-informed decisions into your own development cycle? The smartest companies like Facebook, Medium, Dropbox, and LinkedIn have in-house feature-flagging systems custom built for them. You can build your own system, or simply use LaunchDarkly, “Gatekeeper for everyone else”.



09 Jul

Canary release is the new beta

Are canary releases the new beta? What does beta even mean? Sean Murphy recently tweeted me:

Wow. Was Sean right? When I was an Engineering Manager at Vignette, I’d run beta programs for our new releases. The beta programs had a dual purpose. First, we wanted to get feedback on the stability and validity of our features. But the beta also fed marketing with happy reference customers for our launch announcement. Customers liked being part of a beta because it gave them early access to features they had been waiting for, as well as an opportunity to influence product direction.

What had changed? The word beta has been overloaded to mean “we’re not entirely ready for prime time, so please be patient”. Gmail was in beta for five years! At TripIt, we had a beta tag for multiple years.

Canary release – exposing features to some subset of users (whether by opt-in, random rollout, or specific segments) – is now used to describe what was once a beta.

  • Microsoft: In development of Windows 10, Microsoft used “canary” releases to test with internal users within Microsoft. Gabe Aul, who leads the Data & Fundamentals Team in the Operating Systems Group (OSG), said “our Canary ring probably sees 2X-3X as many builds as OSG because we catch problems in Canary and don’t push to OSG.”
  • Instagram: “Using ‘canary’ releases, updates go out to a subset of users at first, limiting the ability of buggy software to do damage.” Mike Krieger, Instagram co-founder and CTO, said he uses canary releases because “If stuff blows up it affects a very small percentage of people”.
  • Google: For Chrome, Google offers Chrome Canary, which it labels with “Get on the bleeding edge of the web, Google Chrome Canary has the newest of the new Chrome features. Be forewarned: it’s designed for developers and early adopters, and can sometimes break down completely.”

So yes, canary is the new beta.