29 Nov

Toggle Talk with Damian Brady

I sat down with Damian Brady, Solution Architect at Octopus Deploy for a conversation about his experience with feature toggles.  He shared with me his tips for best practices, philosophies on when to flag and what he thinks the future of feature flagging will look like. 

  • How long have you been feature flagging?

I had to think about this one a bit – about 8 years ago but I probably didn’t know what it was called at the time.

“It’s definitely the case that people are doing this without knowing the name “feature flag” or even giving it a name. They’re just saying it’s a configuration switch or a toggle and but not giving it a more proper name, they’re not identifying it as a first-class citizen really.”   

  • What do you prefer to call it and why?

Now I call it feature flagging or occasionally feature toggles. I think toggles makes a bit more sense as analogy for non-technical people.  

  • When do you think feature flagging is most useful?

There’s a couple – but the one I think it’s most useful for is to use a feature flag when you have a feature that is nearly complete or complete from your point of view. Either way, you are ready to get verification from someone with real data.

“You can test as much as you want with your pretend fake data, or even a dump from production which is being obfuscated, but until it gets used in the wild you’re never really sure that the feature is doing exactly what it needs to do.”  

So hiding that behind a feature flag, and then clicking it on for somebody who is using the product for real in any way gives you that last little test that is ultimately the most useful.  At that point you still have the opportunity to back out. If something was corrupt or your expectations were wrong, it’s really useful for that last-minute check.  

At Octopus, we’ve started using feature flags for big features that a lot of people don’t want to see. So a while ago we introduced the idea of a multi-tenant deployments. And probably most of our users don’t need that feature because it adds a lot of complexity to the UI.  We have a configuration section where you can toggle an “on” and “off” switch, so if you don’t need that feature you can just leave it off.     

Are there any cases where feature flagging is not a good idea?

I think there are two extremes where feature flagging is not a good idea. On one hand, flagging really small changes can be more trouble than it’s worth. It’s introducing an extra level of complexity that maybe for a small change is not critical.  

On the other side, using feature toggles around the architectural changes in the core of your application – that’s kind of hard to test. Do you have a feature flag that when you turn it “on” it completely redirects the way the entire application will run? In that case you bite the bullet and decide that this is a big change and you’re just going to have to test it very thoroughly and not give yourself a way out.

That being said, there are some cases where you still need to give yourself a way out by using a flag. For example, you might deploy some new feature thinking it’s correct, but subsequently learn from a customer or user that it doesn’t really meet their needs. Rather than the user living with a bad feature, you might want to turn the flag off and go back to the drawing board.

If it’s an architectural change, you may only find out that there’s a bug when you use it in production. Test data may not surface the issue properly.

Ultimately, doing core architecture changes in a way where you can back out later can be an extra huge amount of work. It’s probably at that point you know you aren’t going to do it (revert back) anyway.   

  • Best use of a feature flag – a personal story?

When I first started using feature flags, around the 8 years ago timeframe, I was working on a web application that was internal and a big line of business.  And we had just added a new third party provider for providing SMS.  And with this new provider, it meant we had to write a lot of new code.  It was internet banking software so it was a one-time password we were sending out – and it was really really important that it work.

We tested everything rigorously but wanted more insurance.  So we put the new service behind a feature flag. We had a bunch of agents that ran this type of SMS. We enabled a flag for one of the agents and monitored it to make sure it was actually doing the right thing and not failing. And then we started trying other ones. It failed a couple of times because of differences in the sandbox environment between the third-party provider and the real one.

“We thought everything was okay, but when we put it live we turned it on slowly, and it didn’t do what we expected.”

So when that happened we turned it back off again…and went back to the drawing board.

So without the feature flag, we would have dropped every person using the service at that critical point. That client would have not been able to receive SMS’s until we were able to rollback.

  • What do you think is the number one mistake that’s made around feature flagging?

There is one that I keep seeing – when you wrap a new feature you believe to be finished in a flag, the biggest mistake with this is not testing that change with the flag “on” and then “off”. For instance, when you turn it “on” it snaps into new database tables or starts changing the way data is saved. But when you turn it “off” again, you’ve lost that data or data is corrupt. For this you need to test it “on” and “off”.  

“If you have more than one feature flag running at the same time, test the combinations of them being both ‘on’ and ‘off’.

If they’re likely to interact with each other you need to test “one on, one off,” “both on,” “both off” and all possible combinations like this.  

  • How do you think feature flags play into the DevOps movement? What about Continuous Delivery?

I think feature flags play in both continuous delivery and continuous deployment. I think they’re most useful to continuous deployment. You have all of your features pushed out to production as soon as they compile essentially – but they are behind a feature flag so you don’t break anything. That’s the way Facebook does it. They know that any new code they write might end up in production so it’s going to be safe behind a feature flag.  

“The design of the DevOps movement, the aim of it really, is to get real features and real value in the hands of users as quickly as possible.”

So if you have to wait until this “half done piece of work” is actually safe to deploy then that slows you down. So having it behind a feature flag so that it doesn’t get touched until you are ready to test it can be really powerful for increasing velocity and getting things out to production much faster.     

So even for marketing teams, it means they don’t have to tell the developers “hey we worked out the result of this a/b test and we want option b.” If the marketing department can just just flip that switch and say “no, option b is working better so just leave it there” without a new deployment or contacting the developers to remove the old stuff and redirect to the new stuff,  that increases that team effort of getting value to customers which is the whole purpose of DevOps.

  • Can you share any tips for better flagging?

If you’re feature flagging a big change, pair the feature flag with a branch by abstraction pattern. See the clip from my talk from NDC Sydney for more details.

There’s also the concept of transitional deployments – again refer to my video clip here for more. It’s useful for things like database schema changes where you have a midpoint for both the new and old applications that will work with the schema that’s currently there. So you can turn that feature off if you need to.

  • Are you seeing feature flagging evolving? If so how?  And how do you expect it to change in the future?

It’s been around for a long time…but I think it’s becoming much more visible – and partly it’s LaunchDarkly helping with that. I think more people will start using feature flags in their continuous delivery pipeline. And the more continuous delivery becomes mainstream, the more mainstream developers will need feature flags.  

“I think feature flagging is starting to be something that you have to add your deployment cycle because you know it needs to be fast and you know you feature needs to get to production as quickly as possible – and feature flags are the way to do that.”  

So as it becomes more mainstream I think there will be more tools, more frameworks, more awareness of it (feature flags) as it hits more and more companies. I think there will be things coming out like feature flag-aware testing tools – so testing tools that know that they need to test with this flag on and off.  

The summary – more tools around best practices around this thing which is becoming more mainstream.  With DevOps becoming more popular, more people are thinking “yes we need to get to production quicker, we need that cycle time to reduce” so it’s a natural extension I think to start solving some of those problems with feature flags.  

“I think it’s just starting to become more mainstream frankly because it’s a solution to a problem that is starting to become more mainstream.”   

31 Aug

Secrets of Netflix’s Engineering Culture

Netflix is not just known for the cultural phenomena of “Netflix and chill”, but for its legendary engineering team that releases hundreds of times a day in a data-driven culture. Netflix is the undisputed winner in the video wars, having driven Blockbuster into the “return” bin of history. Netflix won by iterating quickly and innovating with numerous micro-deployments. Could what worked for Netflix work for you?

Netflix had a virtuous cycle of product innovation. Every change made in the product is with the goal of getting new users to become subscribers. Netflix has a constant flow of new users every month, so they always have new users to test on. Also, they have a vast store of past data to optimize on. Did someone who liked “Princess Bride” also like “Monty Python and the Holy Grail”? When is the right time to prompt for a subscription? Interesting tests that Netflix can run include whether TV ads drive Netflix signups, or whether requiring Facebook to create an account drives enough social activity to counteract the drop in subscriptions from people who don’t have Facebook. If a change increased new user subscriptions, it went into the product. If it didn’t increase new user subscriptions, it didn’t make it in – hypothesis driven development.

However, what if you’re not Netflix? What if you’re a steady SaaS business with 1,000 business customers, on boarding 30 new customers a month? This is a healthy business, doubling in size annually. However what if you wanted to test whether you get more subscriptions with a one step or two step process to add a credit card. With a sample set of 30 a month & 90% current success rate, it will take you three months to determine success. Not everything can be tested at small scale. Tomasz Tunguz talks more about the perils of testing early here.

The other “gotcha” to watch out for with Netflix style development is obsessive focus on one metric can degrade other metrics. For example, focusing on optimizing new user signup might mean degrading experience for old users. Let’s say that 10,000 customers could be served with “good” speed, or 2,000 with “superfast” speed and 8,000 with “not good speed”. Or 1,000 with lightning fast and 9,000 with terrible speed. You might make the 1,000 new customers very happy, but piss off the 9,000 existing customers and have them quit. A good counterweight is to always have a contra-metric to keep an eye on. It’s okay if it dips slightly if the main metric rises. However, if the other metric tanks, re-consider whether the overall gains are worth it.

So what lessons can you take from Netflix to help your own business?

One, have a clear idea of why you’re making changes, even if it’s not something that you can a/b test. Is it to increase stability in your system? Make it quicker for someone to onboard? Know what your success criteria are, even if there’s not a statistically significant “winner”.

Two, break down projects into easily quantifiable chunks of value. Velocity can be as important (if not more important) than always being right. For example if you try 20 small changes, and half are right, you’ll end up 50% better. If you try one big change, and it’s not accretive, you’ll end up with a zero percent gain. Or, as Adrian Cockcroft, Netflix Architect says “If you’re doing quarterly releases and your competitor is doing daily releases you will fall so far behind”.

Three, don’t underestimate the importance of your own domain expertise. If you’re constantly testing ideas, even without having enough data, you’re quicker to get into the right path. Let your competitors copy your past mistakes, while you move forward. As Kris Gale, co-founder and CTO of Clover Health said, “You will always make better decisions with more information, and you will always have more information in the future.” But the way to get more information is to iterate.

LAUNCHDARKLY HELPS YOU BUILD BETTER SOFTWARE FASTER BY HELPING MANAGE FEATURE FLAGS AT SCALE. START YOUR FREE TRIAL NOW.

 

26 Aug

3 Ways to Avoid Technical Debt when Feature Flagging

Feature flags are a valuable technique of separating out release (deployment) from visibility. Feature flags allow a software organization to turn features on and off at a high level, as well as segment out their base to allow different users different levels of access. However, feature flags have an (ill-deserved) reputation of “Technical Debt”. Used incorrectly, feature flags can accumulate, add complexity, and even break your system. Used correctly, feature flags can help you move faster. Here’s three easy ways you can avoid technical debt when using feature flags.

  1. Create a central repository for feature flags

Using config files for feature flags is “the junk drawer” of technical debt. If you have seven config files with different flags for different parts of the system, it’s hard to know what flags exist, or how they interact. Have one place where you manage all of your feature flags.

  1. Avoid ambiguously named flags

Give your flags easy to understand, intuitive names. Assume that someone other than you and your flag could potentially be using this flag days, months, and years into the future. Don’t have a name that could cause someone to turn it on when they mean off, or vice versa. For example “FilterUser”, when it’s off – does this mean users are filtered? or not?

  1. Have a plan for flag removal

Some flags are meant for permanent control, for example for an entitlements system. Other flags are temporary, meant for the purpose of a release only. If a flag can be removed (because it’s serving 100% or 0% of traffic), it should be removed, as quickly as possible. To enforce this rigor, when you write the flag, also write the pull request to remove it. That way, when it’s time to remove the flag, it’s a two second task.

02 Aug

Feature Flag-Driven Releases

Feature flag strategy meeting

Using feature flags to incorporate a release strategy into your development process

 

The Old Way

The “old way” of software releases is characterized by an explicit waterfall hand-off between teams – from Product (functional requirements) to Engineering (build and deploy). This old way did not explicitly promote release planning in the development process. The full release burden was shifted from one team to another without plans for a feedback loop or integrated release controls.  Hence, it was difficult for teams to continuously deliver software, gather feedback metrics, and take full control over software rollouts and rollbacks.

With the rise of continuous delivery, teams are integrating feature release and rollout controls into the product development process.  This is ushering in a new era of DevOps where teams collaborate on release strategies during the initial design phase in order to manage a release throughout the development cycle, from local, QA, staging, and to production.

How will the feature be released?  Who will be getting this feature first?  Will it be gradually rolled out? Who will alpha/beta test it?  What if something doesn’t go right?  All of these considerations (and more) are typically not discussed in initial planning meetings. By codifying a release strategy from the start, you can ensure that your software release matriculates smoothly from build to release to sunset.

Release Strategy

Release strategy is a DevOps concept where teams integrate feature release planning into the development process.  Instead of pushing a new feature to production and being done with it, developers integrate feature flags into the development lifecycle to control their releases.  Broadly speaking, feature flagging is a way to control the visibility and on/off state of a particular feature by wrapping the code in a conditional.  The process of flagging encourages developers to plan for the initial feature rollout, feature improvements based on user feedback, and overall feature adoption.  It basically compels teams to incorporate a release strategy into the development process.

Josh Chu, Director of Engineering at Upserve, explains how his team incorporates release strategy:

“A lot of times people would say ‘let’s roll this out’ and not think about how to actually get adoption. The fact that you have a feature flag story raises visibility. Engineers start thinking about how their feature is going to get rolled out rather than just ‘push it to production, it’s done’. The product team is encouraged to take a more hands-on approach to pre-release feedback, which directly feeds into obtaining post-rollout traction.”

 

LaunchDarkly Rollout and Release Strategy Feature Flags and Feature Toggles

Strategic Rollouts Using Feature Flags

In this section, we illustrate an example of how a release strategy can be incorporated into a traditional design, build, test, and release process.

Stage 1 – Feature Design

  • Design the feature’s functionality, examine the target audience and use case, and develop a timeline for implementation.

Stage 2 – Release Strategy

  • Collaborate on rollout and release strategy.  Should the feature flag be a front-end flag, back-end flag, or both? (Front-end flags are mainly used to control UI visibility, while back-end flags can control APIs, configurations, and even routing).  What should be behind the flag?  Is this a permanent or temporary flag? (ex. permanent = ‘maintenance mode’, temporary = ‘new signup form’) Who has rights to change the flag?  Is the flag intended to control a percentage rollout? How will we incorporate user feedback? How will we measure adoption and traction?

Stage 3 – Build

  • Develop and integrate, using the feature flag to manage the feature’s progression through multiple development environments.

Stage 4 – Test

  • Test the feature in QA and Staging, using the feature flag to control rollout and user targeting.

Stage 5 – Release

  • Deploy the feature as ‘off’ in production and then implement your release and rollout strategy.  This could be an incremental percentage rollout, individual user targeting, or targeting groups of users.

LaunchDarkly Rollout and Release Strategy Using a Feedback Loop Feature Flags and Feature Toggles

Feedback Loop

By adopting a release strategy, Engineering, Product, and Design teams can continuously collaborate on release and rollout plans, assessing user and performance feedback along the way.  Bryan Jowers, Product Manager at AppDirect, comments on the benefits of controlled rollouts, “It allows us to bring product to market faster, test, get data, and iterate… Engineering, Product, and leadership are all looking to deliver product with low risk… and we do that with tightly controlled rollouts.”

Instead of pushing a future to production and being done with it, teams should be designing release strategies that incorporate targeted/incremental rollouts with the intention of iterating based on feedback.  This creates a continuous feedback loop because teams can synthesize performance metrics and transform those metrics into better, faster iterations.

Take-Aways

Release strategy incorporates release and rollout planning into the development process.  It forces teams to plan for feature rollouts and develop methods for aggregating user feedback, analyzing metrics, and assessing traction.  One of the proven ways to do this, as used by teams at Upserve and AppDirect, is to use feature flags to launch, control, and measure features from development to release.

 

LAUNCHDARKLY HELPS YOU BUILD BETTER SOFTWARE FASTER BY HELPING MANAGE FEATURE FLAGS AT SCALE. START YOUR FREE TRIAL NOW.

 

08 Jul

Feature Branching Using Feature Flags

Feature Branching Using Feature Flags Github LaunchDarkly

Feature branching allows developers to effectively collaborate around a central code base by keeping all changes to a specific feature in its own branch.  With the addition of feature flags, feature branching becomes even more powerful and faster by separating feature release management from code deployment.

The Pains of Long-Lived Branching

Traditionally, new development has been done via branching.  These branches typically live for months at a time, with silos of developers attempting to work in parallel.  Long-lived branches often become the “I owe you”  of Gitflow: “I promise to address these merge conflicts… later.”  The pain points that Git was supposed to address — merge conflicts, missing dependencies, and duplicate code — gradually emerge the longer a branch lasts.  We then concentrate all the pain of reworks and conflict remedies until the end, which arguably makes the pain much worse.  To avoid these long-lived branching issues, branching with feature flags enables devs to effectively manage short-lived feature branches and continuously deliver better software.

Feature Branching

To help coordinate development, engineering teams have adopted distributed version control systems (like GitHub and BitBucket).  These systems allow developers to collaborate within a central codebase and use branches to make updates or develop new features.  Feature branching, therefore, has become a staple of modern development cycles because they allow developers to function without encroaching on each other’s progress.

One of the shortcomings of feature branching is that feature release management is still tied to code deployments.  In isolation, feature branching forces engineers to manage software delivery within the confines of their version control system.  Non-technical users can’t (and shouldn’t) manage feature visibility and distribution for customers from GitHub.  It is currently not possible for developers to turn features on or off in real-time, as a way to manage risk or gradually introduce a new feature.

Feature Branching Without Feature Flags LaunchDarkly

This diagram illustrates how developers create feature branches to manage feature development, ensuring that the current build is always in a deployable state.  While every company’s branching structure will be different, the basic premise is that you branch from a repository to make a new feature.

Feature Branching Using Feature Flags

This is where the introduction of feature flags makes feature branching exceptionally powerful.  Feature flagging allows developers to take full control of their feature lifecycles independent of code deployments.  Application features can be enabled or disabled without requiring code to be reverted or redeployed.

This process is called feature flag driven development, where continuous delivery teams manage feature rollouts separate from code deployments.

Feature Branching With Feature Flags LaunchDarkly

Benefits

Feature flagging allows developers to take full control of their feature lifecycles without depending on code deployments.  When you merge a feature branch into master (production), it is already wrapped in a feature flag.  This allows you to deploy the feature “off” and then gradually roll it out to users.  It also allows you to quickly “kill” the feature if it is not working well, without having to redeploy.  Other benefits include:

  • Better team communication
      • Developers continuously collaborate on new releases, coordinate efforts, and ensure that everyone is moving forward in tandem.
  • Logical deployments to prevent blocking
      • Deploy code in logical chunks without resorting to long-lived branching.  Even if a feature is not fully ready, it can be flagged ‘off’, but still deployed.
  • Exposed dependencies
      • Short-lived branching allows you to manage dependencies by removing the ones that are unnecessary.
  • Faster development
      • Spend less time addressing merge conflicts and refactoring old code.  Spend more time delivering features that users want.
  • Risk mitigation
      • A feature can be flagged throughout the entire development process, from local, QA, staging, and production.  This means that it can be toggled on or off in isolation, without impacting other development environments.
  • Manageable code reviews
    • Because you are merging more often, code reviews become less tedious and merge conflicts less onerous.

Better, Together

Modern DVCS combines the benefits of short-lived branches and feature flags.  Tools like GitHub and Bitbucket enable continuous delivery and short-lived branching, while feature flags help make branching faster, less risky, and more manageable.
Therefore, feature flagging does not replace branching, it is complementary and makes it more powerful. Feature branching provides the flexibility to decide when and what to release, while feature flagging lets you take full control of the release itself.  Together, branching and flagging help you maintain and integrate short-lived branches faster and with less risk.

26 Jun

Staging Servers are Dead! Long Live a Staging Server

Earlier this year, I wrote about why staging servers should die – that they actually increase risk and time and decrease quality. I’ve been very pleased at the thoughtful comments and feedback I’ve gotten about why effective continuous delivery and DevOps means no staging server. The one that made me the happiest was “Dreams exist to become reality. Here is one I’d like to achieve at work.”

Here I’ll address the feedback I received, and what’s standing in the way of achieving this dream.

 

First, there was a large agreement that “waterfall deployment” with a staging server added time and risk to the launch process. Rapportive founder Sam Stokes agreed, saying “Staging environments are an evolutionary dead end. They get out of sync, slow you down, just get neglected.” https://twitter.com/samstokes/status/690689667306369024 Christan Deger, Autoscout Software Architect  said “staging environment, even in a completely automated cloud setup, requires constant effort and produces costs.” Dave Nolan gave a talk at Pipeline London https://vimeo.com/162633462 on #nostaging.

 

Here are the questions and concerns about the practicality and validity of the “#noStaging” Dream.

 

“Don’t we need to test major infrastructure changes in Staging?”

I’m not advocating pushing untested code willy nilly into production. I am advocating for through and complete automation, continuous integration and feature flagging, with the goal to get as quickly as possible to production. The reason to kill staging servers was said very well by Joseph Rustcio, Librato CTO & co-founder  “You think of everything in advance of how a user will use a feature, and you still miss half of them.” The purpose of DevOps is to move as fast as possible from code on a developers box to customer.

 

One great case study is Librato. Librato uses feature flags to wrap features, deploying their code and then ramping volume up and down, controlling risk. The allure of a staging server is that all systems can be thoroughly tested there, “guaranteeing” a painless final deployment. However, the moment where you push code to the real world is the scariest, because the real world NEVER matches staging. By using feature flagging, Librato could de-risk a new infrastructure project. They used “Branch-by-abstraction” to ramp on and off a new infrastructure. The old way would have been to push the entire new infrastructure at a slow time like 3 am, then when something went wrong, scramble to fix, precisely when memories and tempers are short due to sleep deprivation and key people might be asleep. With feature flags, Librato could ramp up volume in the real world, with real data. Ruscio says of their rollout with feature flags, “We never got paged at night for an issue with the new system, as we were only introducing it during the day when we were available. So all issues could be addressed while we were available. With feature flags, we could even dial back risk during lunch hour.”

 

“But we need a staging server for contractual reasons”

Having a staging server for contractual reasons is arguing for an artificial artifact. As Dave Farley, co-author of Continuous Delivery” says “it’s not done until it’s delivered to users”. Lamont Lucas, FastRobot co-founder, says “media and advertising companies want final approval before a feature goes live, generally from non-technical approvers. Flagging all users coming from a gateway lets you fulfill the contractual obligation (of showing a feature to them for approval) while preventing you from having an entire legacy/pointless staging setup.”

 

“What about performance testing?”

Sean Byrnes, CEO/Founder of Outlier and the Founder of Flurry, often had to “test that new features work in a production-like environment and don’t compromise the integrity of our systems”, as Flurry had millions of users worldwide.  However, Byrnes  continued, “You don’t have to load test EVERY feature to failure”.

The failure of most features it’s lack of load as there’s no customer interest. However for feature where you do really need to know the failure rate, an alternate server other than production can be spun up ephemerally for that reason. Just don’t call it “the staging server” – it’s a production-like load system spun up for a specific reason – to test load – and then shut down after it’s purpose is complete.

 

“Feature flags are only good for shallow UE changes, not for Microservices”.

Actually, micro services make using staging servers even more painful as many versions of different microservices might be running across staging and production. Deger says,  “ In a microservices architecture with lot of independent deployments going on, using a single staging environment would inadvertently test integrations with services in different versions, than what is currently in production. This gives a false sense of security. So we decided to not have a staging environment at all and find different ways to release features with confidence while only integrating in production.

We are constantly learning new techniques to deploy into production without a staging environment. There are many different ways of integrations that we need to care of, but overall it makes us faster and the testing and release experience is better.””

 

“Harbaugh is biased”

I advise that instead of a confusing, time consuming, expensive, redundant and ultimately unsuccessful staging server; development teams should use feature flags to move features to production and do their feature validation as quickly as possible on production. I’m not unbiased – I’m CEO of a feature flagging management platform, LaunchDarkly. However, I founded LaunchDarkly the same reason I publish articles, podcast , and give talks on DevOps: because I care deeply about the power of feature flagging to create better quality software and reduce risk. You don’t need LaunchDarkly to feature flag – it’s easy to get started with a simple config file or open source library. Deger, is the contributor to an open source framework for feature flagging and huge proponent of no staging servers.

 

“Let’s talk when you want to do serious software development”.

Just as waterfall development once seemed the only way to guarantee success, so did waterfall deployment until recently. The most innovative and fast moving companies like Netflix, Google, and Amazon have found that they can move faster if they’re more agile. And serious? I think of software development as fun, as exciting, as liberating, transforming – with software people are connected worldwide, interacting in incredible ways, and their lives are better. I hope software never becomes too serious for me.