02 Dec

In 2016, can DevOps keep pace with consumer expectations?

Consumer Expectations

Every morning, I roll out of bed at 8:01am and lazily reach for my phone to check my Apple News feed.   Sometimes, the app has the audacity to make me wait 2 full seconds before it refreshes my feed with 100 articles from dozens of sources around the web.

Unacceptable!  I want my news feed instantly.  I want it now.  I can feel myself get frustrated and I can feel my patience evaporate in a flash.  Then I take a step back: since when do I take personal offense at an application that doesn’t load instantly?  Why do I get flustered as if I had been rear-ended by a car?

This is just a microcosm of current consumer expectations.   An iOS bug becomes a TechCrunch headline, a bad Facebook feature makes CNN’s front page, and an app that drains 0.1% more battery life becomes a human rights violation.

As a developer, imagine trying to adapt to these expectations.  Consumers want things to work instantly, smoothly, and intelligently.  They want everything to be perfect and work seamlessly or else they get frustrated, offended, and vocally upset.  “Ugh, why do I have to click TWO buttons to see this?”


Moving into 2016, what can developers do to meet or even exceed these expectations?  Simply releasing software faster will not necessarily mitigate risk, nor will it necessarily lead to a better product.  The key question to address is “how can I adapt my development process to exceed consumer expectations?”

It doesn’t matter how wonderful your new feature is if it degrades your application’s performance.  Product-market fit no longer means that your product merely serves a market, but that it must meet that market’s expectations for performance.

How you release a feature becomes as important as the feature itself.  For DevOps, 2016 should be the year of the incremental rollout, in which assessing your application’s response to a new feature becomes a prerequisite for a launch.  I am not referring only to local or staged testing, but to actual testing in production.

I recently published a piece on feature flag driven development, and I feel that this practice of compartmentalized release is essential for managing risk and meeting consumer expectations.  By flagging a feature (i.e., wrapping it in a condition), you can deploy it turned off and then incrementally turn it on for particular users, assessing performance feedback along the way.
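To make "wrapping it in a condition" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FLAGS` dictionary, the `new-news-feed` key, and the `is_enabled` and `render_feed` helpers are hypothetical names, not the API of LaunchDarkly or any other product.

```python
# Hypothetical in-memory flag store. A real system would load this from a
# flag service or config so it can change without a redeploy.
FLAGS = {
    "new-news-feed": {"enabled_for": set()},  # deployed "off" for everyone
}


def is_enabled(flag_key: str, user_id: str) -> bool:
    """Return True if the flag is on for this particular user."""
    flag = FLAGS.get(flag_key)
    return flag is not None and user_id in flag["enabled_for"]


def render_feed(user_id: str) -> str:
    # The new code path ships to production but stays dark until flagged on.
    if is_enabled("new-news-feed", user_id):
        return "new feed"
    return "old feed"
```

Adding a user ID to `enabled_for` turns the feature on for that user only; everyone else keeps the old path, and removing the ID turns it back off.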

Managing Scalability with Rollouts

Let’s look at an example.  You are a developer launching a new feature that will require you to process hundreds of additional requests per second.  A few hundred more?  That’s no problem: you’ve built the infrastructure to scale and handle that load.  But what if all your users fall in love with the new feature and bombard you with thousands of requests?  How do you manage this?

Imagine that you were able to roll out your feature live to 10% of your users, then 20%, then 30%, and so on.  Each step becomes a testing benchmark where you assess performance feedback and scale accordingly.

More importantly, you mitigate unanticipated performance degradation and meet the consumer expectation of seamless application performance.
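One common way to get a stable 10%, 20%, 30% progression is to hash each user into a fixed bucket and compare the bucket to the current percentage. The sketch below assumes that approach; the `bucket` and `in_rollout` names are hypothetical, and this is not a description of how any particular product implements rollouts.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(flag_key: str, user_id: str, percent: int) -> bool:
    """True if the user falls inside the first `percent` buckets."""
    return bucket(flag_key, user_id) < percent
```

Because a user's bucket never changes, anyone who saw the feature at 10% still sees it at 20%, so each step only adds users. Including the flag key in the hash means different features get different slices of the user base.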

Of course, feature flagged rollouts are not the saving grace for every feature launch or app update, but they’re fast becoming essential for DevOps as consumer expectations continue to raise the bar for performance.

Managing risk does not need to be a transformative operational process.  It can be easily achieved by flagging your features and gradually releasing them to your users.  After all, there’s no better way to get genuine feedback than testing in a real environment.

11 Aug

Secret to Facebook’s Hacker Engineering Culture

Facebook’s engineering is legendary for its speed and execution. You too can be as quick and smart as Facebook, if you know their hacker engineering secret. Originally they lived by “Move Fast and Break Things”, which has now evolved with wisdom to “Move Fast With Stable Infra.” Speed is important, as are stability and a good experience for users. Facebook engineer Kent Beck wrote a great Facebook Note on how Facebook embraces reversibility to scale up. I highly recommend you read his entire post.

Facebook has a secret sauce: an in-house system called Gatekeeper that allows them to get quick feature feedback and iterate quickly on it. Engineering changes are wrapped with a feature flag and pushed live to production. The features are live but off, and are then turned on via Gatekeeper for different users. Facebook’s seemingly simple system of separating deployment from rollout unlocks many powerful ways to move faster with more stability. All items in italics below are quotes from Kent Beck, followed by my analysis of how Facebook uses Gatekeeper.

Internal usage. Engineers can make a change, get feedback from thousands of employees using the change, and roll it back in an hour.

Initially, the engineer uses Gatekeeper to turn the feature on for internal users only. Interestingly, I’ve heard that Facebook is too large for changes to be effectively communicated EXCEPT by actually making the change. Instead of sending flurries of emails or blasts in chat rooms to notify other groups, Facebook engineers make the code change and wait for impacted parties to notify them that something is broken, or to fix their own dependencies. Separating changes from bigger releases with feature flags means that any change can be rolled back at any time.

Staged rollout. We can begin deploying a change to a billion people and, if the metrics tank, take it back before problems affect most people using Facebook.

Staged rollout depends on feature flags to encapsulate a change and a feature flagging system (like Gatekeeper) to take it back.
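As a rough illustration of "if the metrics tank, take it back", a staged rollout can be driven by a small rule that either advances to the next stage or drops to zero. The stage percentages, the 1% error threshold, and the `next_percent` function below are all assumptions made up for this sketch; they are not Gatekeeper’s actual behavior.

```python
STAGES = [1, 5, 25, 50, 100]  # assumed rollout percentages


def next_percent(current: int, error_rate: float,
                 max_error_rate: float = 0.01) -> int:
    """Advance to the next stage if healthy; otherwise roll back to 0."""
    if error_rate > max_error_rate:
        return 0  # metrics tanked: take the change back
    for stage in STAGES:
        if stage > current:
            return stage
    return current  # already fully rolled out
```

For example, `next_percent(5, 0.001)` moves the rollout from 5% to 25%, while `next_percent(25, 0.05)` returns 0 and turns the feature off for everyone.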

Dynamic configuration. If an engineer has planned for it in the code, we can turn off an offending feature in production in seconds. Alternatively, we can dial features up and down in tiny increments (i.e. only 0.1% of people see the feature) to discover and avoid non-linear effects.

The key to turning features off in seconds (rather than hours or, in the best case, minutes) is “if the engineer has planned for it in the code”. By using feature flags to separate code deployment from functionality, Facebook can quickly kill malignant features. Without feature flags and Gatekeeper, Facebook would have to do a full redeployment.
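"Planned for it in the code" can be as simple as checking a runtime flag on every request and having a safe fallback ready. Here is a minimal sketch under that assumption; the `FlagStore` class, the `recommendations` flag, and the function of the same name are hypothetical.

```python
class FlagStore:
    """Hypothetical runtime flag store; values can change while the app runs."""

    def __init__(self):
        self._flags = {}

    def set(self, key, on):
        self._flags[key] = on

    def is_on(self, key, default=False):
        return self._flags.get(key, default)


flags = FlagStore()


def recommendations(user_id):
    # Checked on every call, so a flip takes effect on the next request.
    if not flags.is_on("recommendations"):
        return []  # safe fallback when the feature is killed
    return [f"suggested-for-{user_id}"]
```

Calling `flags.set("recommendations", False)` kills the feature without a redeploy; the code for it is still deployed, it just stops running.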

Right hand side units. We can add a little bit of functionality to the website and turn it on and off in seconds, all without interfering with people’s primary interaction with NewsFeed.

Facebook smartly uses microservices and avoids monolithic code. Small changes in functionality, wrapped in feature flags, can quickly be toggled on and off using Gatekeeper.

Shadow production. We can experiment with new services under real load, from a tiny trickle to the whole flood, without affecting production.

Facebook pioneered dark launches, the ability to expose features to load without exposing them to users. I’ve heard that it’s impossible to simulate Facebook’s production load as it’s so large. Gatekeeper allows Facebook to use feature flags to separate load testing from user visibility.
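A dark launch can be sketched as running the new code path on real requests while always returning the old result. The example below is only an illustration: `legacy_ranker`, `new_ranker`, `rank`, and `shadow_log` are made-up names, and the rankers are stand-ins for real services.

```python
import random

shadow_log = []  # records whether the new path agreed with the old one


def legacy_ranker(items):
    return sorted(items)


def new_ranker(items):
    return sorted(items, key=str.lower)


def rank(items, shadow_percent, rng=random.random):
    """Serve the legacy result; run the new ranker in the shadow for a
    configurable share of requests and log whether it agreed."""
    result = legacy_ranker(items)
    if rng() * 100 < shadow_percent:
        try:
            shadow_log.append(new_ranker(items) == result)
        except Exception:
            shadow_log.append(False)  # a shadow failure never reaches users
    return result
```

Dialing `shadow_percent` from a trickle up to 100 puts the new code under real load, while users only ever see the legacy output.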

Data-informed decisions. Data-informed decisions are inherently reversible. “We expect this feature to affect this metric. If it doesn’t, it’s gone.”

By wrapping a feature with a flag, it’s possible to isolate its effect on the system. Data-informed decision making, tying an individual feature to metrics, is made possible by Gatekeeper and feature flags. Without feature flags, it’s impossible to see the impact of a change: if you release five features and twenty bug fixes at once, and engagement drops by 5%, which feature is to blame? Could one of the bug fixes actually have caused a 10% drop and one of the features a 15% gain? Only by separating out each change can true causation (not just correlation) be seen. Yammer also follows data-informed decision making in its product development. Again, the feature must be encapsulated both to measure it and to enable the rollback.
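Once each change sits behind its own flag, you can compare a metric for users who had the flag on against users who had it off. This sketch assumes a hypothetical event format, a set of active flags plus a metric value per user, and a made-up `flag_effect` helper; a difference in means is only a starting point, not a significance test.

```python
from statistics import mean


def flag_effect(events, flag_key):
    """events: iterable of (flags_on, metric) pairs, one per user.

    Returns mean(metric | flag on) - mean(metric | flag off).
    """
    on = [m for flags_on, m in events if flag_key in flags_on]
    off = [m for flags_on, m in events if flag_key not in flags_on]
    if not on or not off:
        raise ValueError("need users in both groups to compare")
    return mean(on) - mean(off)
```

With five features behind five flags, calling this once per flag key gives a per-feature estimate instead of one blended number for the whole release.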

Advance countries. We can roll a feature out to a whole country, generate accurate feedback, and roll it back without affecting most of the people using Facebook.

Gatekeeper and feature flags enable canary launches: using an entire country as a “canary in a coal mine” to see if there are issues with a release. Rather than having a worldwide failure, Facebook can iterate quickly and roll back.
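A country-level canary boils down to a targeting rule on a user attribute. In this sketch the `CANARY_COUNTRIES` table, the `new-profile-page` flag, the choice of "NZ", and the helper names are all assumptions for illustration.

```python
# Hypothetical targeting table: flag key -> countries where the flag is on.
CANARY_COUNTRIES = {"new-profile-page": {"NZ"}}


def is_enabled(flag_key, user):
    """True if the user's country is in the canary set for this flag."""
    return user.get("country") in CANARY_COUNTRIES.get(flag_key, set())


def roll_back(flag_key):
    """Turn the flag off everywhere by emptying its canary set."""
    CANARY_COUNTRIES[flag_key] = set()
```

Users outside the canary country never hit the new code path, and `roll_back` removes it for the canary country too.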

Soft launches. When we roll out a feature or application with a minimum of fanfare it can be pulled back with a minimum of public attention.

Facebook, after many misfires like Facebook Beacon, now follows Eric Ries (Don’t launch – separate out a marketing launch from a product launch). With feature flags, Facebook can get feedback from their own users, and control the story. Facebook has avoided the flameouts of Google, which has had epic failures with Google Wave, Google Buzz, and most recently Google Plus – all expensively launched, then expensively decommissioned. With feature flags and Gatekeeper, Facebook is always in control of who sees what when.

Want to be as smart as Facebook at developing software? Want to integrate reversibility, dark launches, and data-informed decisions into your own development cycle? The smartest companies, like Facebook, Medium, Dropbox, and LinkedIn, have in-house feature-flagging systems custom built for them. You can build your own system, or simply use LaunchDarkly, “Gatekeeper for everyone else”.



09 Jul

Canary release is the new beta

Are canary releases the new beta? What does beta even mean? Sean Murphy recently tweeted me:

Wow. Was Sean right? When I was an Engineering Manager at Vignette, I’d run beta programs for our new releases. The beta programs had a dual purpose. First, we wanted to get feedback on the stability and validity of our features. But the beta also fed marketing with happy reference customers for our launch announcement. Customers liked being part of a beta because it gave them early access to features they had been waiting for, as well as an opportunity to influence product direction.

What had changed? The word beta has been overloaded to mean “we’re not entirely ready for prime time, so please be patient”. Gmail was in beta for five years! At TripIt, we had a beta tag for multiple years.

Canary release, meaning exposing features to some subset of users (whether by opt-in, random rollout, or specific segments), is now the term used to describe what was once a beta.

  • Microsoft: In development of Windows 10, Microsoft used “canary” releases to test with internal users within Microsoft. Gabe Aul, who leads the Data & Fundamentals Team in the Operating Systems Group (OSG), said “our Canary ring probably sees 2X-3X as many builds as OSG because we catch problems in Canary and don’t push to OSG.”
  • Instagram: “Using ‘canary’ releases, updates go out to a subset of users at first, limiting the ability of buggy software to do damage.” Mike Krieger, Instagram co-founder and CTO, said he uses canary releases because “If stuff blows up it affects a very small percentage of people”.
  • Google: For Chrome, Google offers Chrome Canary, which it labels with “Get on the bleeding edge of the web, Google Chrome Canary has the newest of the new Chrome features. Be forewarned: it’s designed for developers and early adopters, and can sometimes break down completely.”

So yes, canary is the new beta.