26 Jan 2017

Launched: Flag Tagging Management

LaunchDarkly Feature Flag / Feature Toggle Targeting and Management

For better feature flag management, LaunchDarkly allows you to create tags for organizing and grouping your feature flags.  Adding tags (like “Front-End”, “Ops”, “Marketing”, “Restricted”) helps you categorize flags and manage custom permissions.

Here, we have added the tags “mobile”, “marketing”, and “unrestricted” on the Settings tab of our feature flag:

After adding tags, you can filter by tag on the dashboard and link to filters for better feature flag management.

Here, we have clicked on the “marketing” tag, which has created a filter that shows all feature flags tagged “marketing.”

LaunchDarkly Feature Flag Tagging and Management Dashboard

Creating a filter also generates a URL that you can bookmark and share with your teammates.  For example, this URL will show all feature flags tagged “marketing”: https://app.launchdarkly.com/default/production/features?tag=marketing

In the future, we will add more advanced filtering and sorting for even better flag management.  If you have any suggestions or questions, please feel free to contact us at support@launchdarkly.com.

04 Jan 2017

How Feature Flagging Helps Usability Tests

Usability testing in a real-world environment (aka production environment) gives us insight into how users actually use our product in their day-to-day lives. It is one thing to run a test in a lab setting, but it is another to have users try features while they are walking, running to the airport, stressed, and sleepy.

But, no matter how hard we try, it is very difficult to truly simulate how our apps behave in our production environment. We can run focus groups, beta tests on a beta environment, and test things internally, but how can we truly mimic the real world in an artificial context? In other words, how do we test a feature while also simulating the environment it’s meant to be used in?

LaunchDarkly Usability Testing In Production - Feature Flags and Feature Toggles - Context

A good real-world usability test analyzes features efficiently and accurately collects user feedback to improve the user experience. But, when we test anything in a non-production environment, we are inherently biasing our tests. In a lab-based usability test:

  1. Users are overly cognizant of the feature they are testing.
  2. Users are unnaturally focused on testing that new feature.
  3. Users are using fake data or incomplete data, and don’t sufficiently utilize actual use cases.
  4. It is very hard to test discoverability (i.e., can a user find the feature on their own?).
  5. Users tend to ignore distractions, like external notifications (Facebook, Skype, texts) and just focus on the task at hand.
  6. Users are in an overly analytical mode, typically looking to give feedback.
  7. Users are overly forgiving, looking to please!

Does this mean that running usability tests in non-production environments is a waste of time? Not at all! This is absolutely necessary to identify bugs, check for general usability, and solicit feedback quickly.

However, it should just be one of the steps in a comprehensive usability testing process, one that involves internal, staging, and production usability testing.

Benefits of usability testing in production:

The primary purpose of usability testing in production is to gather real-world user behavior while minimizing bias and performance risk. Some more benefits include:

  1. Genuine user feedback in a real-world environment
    • Quantitative insight into your feature’s performance. How well is it scaling? Are users using it? How is it impacting your system? How are your levels of engagement?
    • Contextual insight into your feature’s efficacy. Do users see the new feature? How is it meshing within the context of your existing feature set? Are people using it as intended? Are people using it once and then not using it again?
    • Qualitative feedback. Are users complaining? Are they happy? Are they neutral?
  2. No opt-in bias – users test the feature without knowing they are part of the test. You can assess how well they use it by using a product like FullStory to record the session or by tracking metrics. You therefore get a more representative sample testing the feature, rather than just early adopters.
  3. Measuring actual system performance – there is nothing quite like your production environment, where you can have a complex array of nodes, clusters, CDNs, etc., allowing your app to scale. As people start to use the new feature, you can see how it is impacting your actual system (load times, discoverability, caching issues).

Managing a usability test in production:

Of course, testing anything in your production environment is inherently risky and has real-world consequences. If you launch something to everyone just to get feedback and they hate it, then you risk permanently losing those users. Equally bad, you can cripple your entire application with unforeseen scaling and performance issues.

To mitigate this, companies like Facebook, Amazon, and Google collect production feedback by releasing features behind feature flags. While we won’t go into the specific anatomy of a feature flag, we can go through the methodology behind the release.

LaunchDarkly - Usability Testing Using Feature Flags and Toggles - Betas and Feedback

If a feature is wrapped in a feature flag, then it gives you control over who sees the feature and when. This means that you can perform targeted and controlled releases using a percentage rollout, whereby you can incrementally increase a feature’s visibility to 1%, 5%, and to 100% of your users.

Hence, you can collect production level feedback because you control the level of risk. If a new feature is performing well, then you can keep increasing the percentage rollout. If it is tanking or hurting performance, you can reduce the rollout or kill it completely.

Therefore, feature flags (aka feature toggles) give you full control over the risk of your production releases. You can gather real-world user feedback by separating your feature rollout from your code deployment.
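A minimal sketch of how a percentage rollout could work, assuming each user has a stable key. The function names, the flag key, and the hashing scheme are illustrative assumptions, not LaunchDarkly's actual implementation. Hashing the flag key together with the user key gives every user a fixed bucket, so raising the percentage only adds users and never reshuffles the ones who already see the feature.

```python
import hashlib


def bucket(flag_key: str, user_key: str) -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha1(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    # Take the first 8 hex digits (32 bits) and scale them into [0, 100).
    return int(digest[:8], 16) / 0x100000000 * 100


def is_enabled(flag_key: str, user_key: str, rollout_percent: float) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_key) < rollout_percent


if __name__ == "__main__":
    # Hypothetical users and flag key, used only to show the counts growing.
    users = [f"user-{i}" for i in range(1000)]
    for percent in (1, 5, 100):
        seen = sum(is_enabled("new-feature", u, percent) for u in users)
        print(f"{percent}% rollout: {seen} of {len(users)} users see the feature")
```

Setting the percentage to 0 acts as the kill switch described above: no bucket is smaller than 0, so nobody receives the feature.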

22 Dec 2016

Beta Testing using Feature Flags

LaunchDarkly Feature Toggle Beta

It’s best practice for products to have some sort of beta – a way to collect customer feedback and test performance before releasing to everyone. In an era of continuous delivery, we are delivering new features and experiences more frequently and with less time to gather thorough customer and performance feedback. With this increased cadence, product teams are having to make betas shorter, forego them altogether, or slow down their release cadence to gather adequate customer feedback.

Challenges of traditional betas:

  • Coordinating Opt-Ins: It sometimes takes weeks or months to gather customer opt-ins to test new betas. You also have to organize the distribution of beta keys (e.g., for early access to games) and reminder emails.
  • Organizing Focus Groups: Getting feedback from focus groups is often time consuming and expensive, creating a long feedback loop that lengthens the release process.
  • Opt-Out: If customers opt-in to a beta and don’t like the experience, then they will want a simple mechanism to switch back to the production version.
  • Granular Betas: It is very difficult to do targeted betas based on user attributes or to perform incremental percentage rollouts of new beta features.

Feature toggles

To overcome these challenges, smart product teams are beginning to run betas with feature flags/toggles. These are mechanisms for granularly controlling software releases, allowing you to control the timing and visibility of a beta release.

Currently, many betas are tied to code releases and are managed by a config file or database.  This approach requires engineering time or custom mechanisms to opt-in users.

With feature toggles, you can empower product, marketing, sales, and even customers themselves to opt in to a new beta experience.

Feature Toggle Beta Test LaunchDarkly

In this simple example, you can use a toggle to control the visibility of a new beta feature. Ideally, this toggle would be part of a user interface that could be controlled by a non-technical team member. The code, itself, could be deployed off and then turned on via the toggle.

Beta Test Percentage Rollout with Feature Toggle LaunchDarkly

Moreover, you can also use the toggle to control the percentage of users who get the beta experience. For instance, you could release the new beta experience and have it rolled out to 0% of users. You could gradually increase the rollout percentage from 1% to 5% to 20% and more, collecting customer and performance feedback along the way.

Surfacing this beta control functionality in a user interface is critical for giving non-technical team members access to release controls.

Regional betas

For a recent example of a targeted rollout, we can look at how Pokémon GO released their product country by country: first to the United States and then abroad.

This is a great use case for feature toggles because you can create targeting rules to determine which users receive the feature first. For example, I could create a toggle that is governed by the rule: “If users live in San Francisco, then serve the new Nearby Pokémon feature”. This allows you to maintain different regional feature sets without having to deploy different versions of the application. It also allowed Pokémon GO to refine their algorithms and assess customer feedback before rolling out the new feature to a wider audience.
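As a rough sketch of the targeting rule quoted above, assuming each user is a dictionary with a "city" attribute: the Rule class, the attribute name, and the function name are hypothetical and only illustrate the idea of serving a feature by region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Serve the feature when a user attribute has one of the allowed values."""
    attribute: str
    allowed_values: frozenset

    def matches(self, user: dict) -> bool:
        # A missing attribute yields None, which is never an allowed value.
        return user.get(self.attribute) in self.allowed_values


# "If users live in San Francisco, then serve the new Nearby Pokémon feature"
nearby_rule = Rule(attribute="city", allowed_values=frozenset({"San Francisco"}))


def show_nearby_pokemon(user: dict) -> bool:
    return nearby_rule.matches(user)


if __name__ == "__main__":
    print(show_nearby_pokemon({"name": "Ash", "city": "San Francisco"}))
    print(show_nearby_pokemon({"name": "Misty", "city": "Sydney"}))
```

Widening the beta to another region then means adding a value to the rule rather than deploying a different version of the application.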

Benefits of beta testing with feature toggles

  • Empowered non-technical users: Allow the sales, marketing, product, design, and business teams to turn features on for specific users, collect feedback, and control the business logic. This also substantially cuts down on engineering time.
  • Production feedback for your beta tests: Test features in production with limited user segments to collect customer and performance feedback.
  • Incremental percentage rollouts: Gradually roll out features to incrementally test performance and mitigate risk. If the feature is bad, toggle it off.
  • Real-time opt-in / opt-out: Allow users to opt in and out of beta tests in real time, controlled via the feature toggle. Skylight provides a nice article on this.

Getting started with toggles

Conceptually, a feature toggle is relatively simple. You create a conditional in your code that controls the visibility of a code snippet. There are many open source libraries that will allow you to get started.  However, these libraries become cumbersome when you try to feature toggle at scale or restrict access to particular toggles. Depending on your needs, you could consider a feature toggle management platform to provide a system for access control and mitigating technical debt.
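To make the conditional concrete, here is a minimal sketch assuming the toggle state lives in an in-memory dictionary. The flag key and function names are hypothetical; a real setup would read the state from a config file, a database, or a management platform.

```python
# Hypothetical in-memory toggle store.
FLAGS = {"new-dashboard": False}


def is_on(flag_key: str, default: bool = False) -> bool:
    """Look up a toggle, falling back to a safe default for unknown keys."""
    return FLAGS.get(flag_key, default)


def render_dashboard() -> str:
    # The conditional controls which code path is visible to the user.
    if is_on("new-dashboard"):
        return "new dashboard"
    return "old dashboard"


if __name__ == "__main__":
    print(render_dashboard())      # toggle is off
    FLAGS["new-dashboard"] = True  # flip the toggle without changing the code
    print(render_dashboard())
```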

06 Dec 2016

Feature Flag-Driven Products

LaunchDarkly Consistent Mobile and Web Experiences using Feature Flags / Toggles

Using feature flags for plan management, personalization, and cross-platform consistency

When feature flags/toggles were first introduced, their primary purpose was to mitigate risk and manage software releases.  We would use these toggles to turn features “on” or “off”, and gradually roll out new features to customers.  If something went wrong or if customers didn’t like the new feature, we could kill it without redeploying.

What developers started realizing is that we could use feature flags to manage dynamic content and have long-term control over every application feature.  This means that toggles would not just be “on” or “off”, “true” or “false.”  Rather, they could serve strings, numbers, JSON objects, and JSON arrays — allowing us to serve dynamic content and use feature flags to perform more interesting functions.

Plan Management

Almost every product has a series of plans made of bundled features.  Plans typically have two parameters: a feature and a value.  A feature can be something like a VIP Area, enhanced support, or additional customizations.  Values are attributes of features, and can be something like “20” members, “5” teams, and “enhanced” speed.

Using feature flags, we can easily determine which features belong to which plans, and edit bundles as we add new features or change their values.

Here, we have two plans: Standard and Gold.  The Standard plan allows for 10 members, while the Gold plan allows for 50 team members and a VIP area.

LaunchDarkly Plan Management with Feature Flags and Toggles

We constructed two flags: team_count and vip_area.   For team_count, we created a multivariate feature flag that returns the numbers 50, 10, and 1.  We then have a rule that checks if the “plan” attribute matches “standard” or “gold” to determine which value to serve.  The “plan” attribute would belong to a user object, which could look something like this:

Because Ernestina’s plan attribute has a value of “gold”, she would receive a 50 team member limit and have access to the VIP area.
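The plan rules above can be sketched as follows. This is an illustration under assumptions, not the flag configuration itself: only the "plan" attribute, the flag names team_count and vip_area, and the values 50, 10, and 1 come from the text, while the other user fields and the fallback behavior for unknown plans are hypothetical.

```python
# Values served by the multivariate team_count flag, keyed by plan.
TEAM_COUNT_BY_PLAN = {"gold": 50, "standard": 10}
DEFAULT_TEAM_COUNT = 1  # assumed fallback for users on neither plan


def team_count(user: dict) -> int:
    """Multivariate flag: serve a member limit based on the user's plan."""
    return TEAM_COUNT_BY_PLAN.get(user.get("plan"), DEFAULT_TEAM_COUNT)


def vip_area(user: dict) -> bool:
    """Boolean flag: only Gold plan users get the VIP area."""
    return user.get("plan") == "gold"


# Hypothetical user object; the "key" and "email" fields are assumptions.
ernestina = {
    "key": "user-123",
    "name": "Ernestina",
    "email": "ernestina@example.com",
    "plan": "gold",
}

if __name__ == "__main__":
    print(team_count(ernestina), vip_area(ernestina))
```

Changing what a plan includes then means editing the flag's values rather than the application logic.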

Moreover, we could use multivariate flags to dynamically modify pricing.  For example, we could toggle prices on-load based on whether the user is a “VIP” or “NEW”.

LaunchDarkly Pricing Management with Feature Flags / Toggles

In this example, all “new” users receive a discounted price of $50 and “VIP” users see a discounted price of $20, while all other users receive the normal $75 price.
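A small sketch of the pricing rule described above, assuming the user object carries a "status" attribute set to "NEW" or "VIP". The attribute name and the function name are assumptions; the prices are the ones from the example.

```python
NORMAL_PRICE = 75
PRICE_BY_STATUS = {"NEW": 50, "VIP": 20}


def price_for(user: dict) -> int:
    """Multivariate flag: serve a price based on the user's status."""
    status = str(user.get("status", "")).upper()
    return PRICE_BY_STATUS.get(status, NORMAL_PRICE)


if __name__ == "__main__":
    for user in ({"status": "new"}, {"status": "VIP"}, {"name": "Anyone else"}):
        print(user, "->", price_for(user))
```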

Other pricing use cases would be localization, seasonality, and discount codes.  We could manage these separate from our application logic, giving us more flexibility without having to redeploy.

Personalization & Styling

We can also use feature flags to manage styling and personalized themes, essentially using these toggles as a quasi-content management system.

Here, we use a multivariate flag to serve color hex values, turning a feature orange, blue, or green:

LaunchDarkly Multivariate Feature Flags / Feature Toggles for dynamic application management

We could also serve pre-defined CSS class names to add or remove properties from elements.  This allows us to separate some of this logic from our primary codebase.  While this shouldn’t be used for primary layout and styling, it could be used to serve particular styles or to customize themes based on user attributes.

For example, if I had a user with a ‘theme’ attribute that has a value of ‘blue’, then I could serve the blue CSS class, which my application logic would append to the relevant elements.

Consistent Cross-Platform Experiences

Cross-platform feature flagging is an easy way to deliver personalized and consistent user experiences across different platforms.  If our product has features that are shared across web and mobile platforms, we can use one flag to control the visibility of both those features.

Here, we have a user who is receiving the TRUE variation for the VIP feature flag.

LaunchDarkly Consistent Mobile and Web Experiences using Feature Flags / Toggles

Even though the web and mobile platforms have different permutations of the feature, the same flag could synchronize the VIP experience between both platforms.

Some more benefits of cross-platform feature flagging include:

  • The ability to decide whether to release a cross-platform feature simultaneously or separately, with full control over who gets to see that feature. For example, web users might get access to a new search bar before mobile users do.
  • Real time personalization that allows users to opt-in to new features on one platform (like mobile) and have that personalization sync with another platform (like web)
  • Percentage rollouts that allow us to gradually release a feature to targeted users on different platforms, allowing us to assess user and performance feedback for each platform
  • A kill switch that lets us turn off poorly performing features for web and mobile, without having to redeploy

Flag Management

The process of feature flagging is fairly straightforward: we wrap our features in conditionals that determine who can see the features and when. At an enterprise scale, organizations must confront the complexities of mitigating technical debt, managing developer workflows, compliance, and controlling the lifecycle of feature flags.

One of the biggest pain points around traditional feature toggles is that marketing or business teams would need to create engineering tickets to get their requests updated.  These toggles would be typically controlled within the codebase itself, like a config file or database.

By adding a user interface for flag management, we can empower non-technical teammates with the ability to target users, perform rollouts, or run beta tests.  This even makes it easy for engineers to coordinate flag creation, mitigate technical debt, and manage the lifecycle of their flags.

LaunchDarkly Feature Flag / Toggle Management Dashboard

Some other benefits of UI flag management:

  • Flag Implementation & Consistency – Feature flags need to be carefully crafted and consistently implemented within our application to prevent performance degradation and ensure that they function correctly. Oftentimes, organizations will use conflicting methods like managing multiple config files or using in-line toggling.
  • Flag Scalability – Adding more feature flags makes testing and management exponentially more difficult over time. It becomes very hard to tell which flags are necessary or obsolete.
  • Flag Management – Maintaining feature flags across multiple development environments (local, QA, staging, production) becomes arduous and time-consuming. It becomes increasingly difficult to track who created the flag, the flag’s intended use, and changes made to the flag’s rollout.

29 Nov 2016

Toggle Talk with Damian Brady

I sat down with Damian Brady, Solution Architect at Octopus Deploy for a conversation about his experience with feature toggles.  He shared with me his tips for best practices, philosophies on when to flag and what he thinks the future of feature flagging will look like. 

  • How long have you been feature flagging?

I had to think about this one a bit – about 8 years ago but I probably didn’t know what it was called at the time.

“It’s definitely the case that people are doing this without knowing the name “feature flag” or even giving it a name. They’re just saying it’s a configuration switch or a toggle, but not giving it a more proper name; they’re not identifying it as a first-class citizen, really.”

  • What do you prefer to call it and why?

Now I call it feature flagging or occasionally feature toggles. I think toggles makes a bit more sense as analogy for non-technical people.  

  • When do you think feature flagging is most useful?

There’s a couple – but the one I think it’s most useful for is to use a feature flag when you have a feature that is nearly complete or complete from your point of view. Either way, you are ready to get verification from someone with real data.

“You can test as much as you want with your pretend fake data, or even a dump from production which is being obfuscated, but until it gets used in the wild you’re never really sure that the feature is doing exactly what it needs to do.”  

So hiding that behind a feature flag, and then clicking it on for somebody who is using the product for real in any way gives you that last little test that is ultimately the most useful.  At that point you still have the opportunity to back out. If something was corrupt or your expectations were wrong, it’s really useful for that last-minute check.  

At Octopus, we’ve started using feature flags for big features that a lot of people don’t want to see. So a while ago we introduced the idea of a multi-tenant deployment. And probably most of our users don’t need that feature because it adds a lot of complexity to the UI.  We have a configuration section where you can toggle an “on” and “off” switch, so if you don’t need that feature you can just leave it off.     

  • Are there any cases where feature flagging is not a good idea?

I think there are two extremes where feature flagging is not a good idea. On one hand, flagging really small changes can be more trouble than it’s worth. It’s introducing an extra level of complexity that maybe for a small change is not critical.  

On the other side, using feature toggles around the architectural changes in the core of your application – that’s kind of hard to test. Do you have a feature flag that when you turn it “on” it completely redirects the way the entire application will run? In that case you bite the bullet and decide that this is a big change and you’re just going to have to test it very thoroughly and not give yourself a way out.

That being said, there are some cases where you still need to give yourself a way out by using a flag. For example, you might deploy some new feature thinking it’s correct, but subsequently learn from a customer or user that it doesn’t really meet their needs. Rather than the user living with a bad feature, you might want to turn the flag off and go back to the drawing board.

If it’s an architectural change, you may only find out that there’s a bug when you use it in production. Test data may not surface the issue properly.

Ultimately, doing core architecture changes in a way where you can back out later can be an extra huge amount of work. It’s probably at that point you know you aren’t going to do it (revert back) anyway.   

  • Best use of a feature flag – a personal story?

When I first started using feature flags, around the 8 years ago timeframe, I was working on a web application that was internal and a big line of business.  And we had just added a new third party provider for providing SMS.  And with this new provider, it meant we had to write a lot of new code.  It was internet banking software so it was a one-time password we were sending out – and it was really really important that it work.

We tested everything rigorously but wanted more insurance.  So we put the new service behind a feature flag. We had a bunch of agents that ran this type of SMS. We enabled a flag for one of the agents and monitored it to make sure it was actually doing the right thing and not failing. And then we started trying other ones. It failed a couple of times because of differences in the sandbox environment between the third-party provider and the real one.

“We thought everything was okay, but when we put it live we turned it on slowly, and it didn’t do what we expected.”

So when that happened we turned it back off again…and went back to the drawing board.

So without the feature flag, we would have dropped every person using the service at that critical point. That client would have not been able to receive SMS’s until we were able to rollback.

  • What do you think is the number one mistake that’s made around feature flagging?

There is one that I keep seeing – when you wrap a new feature you believe to be finished in a flag, the biggest mistake with this is not testing that change with the flag “on” and then “off”. For instance, when you turn it “on” it snaps into new database tables or starts changing the way data is saved. But when you turn it “off” again, you’ve lost that data or data is corrupt. For this you need to test it “on” and “off”.  

“If you have more than one feature flag running at the same time, test the combinations of them being both ‘on’ and ‘off’.”

If they’re likely to interact with each other you need to test “one on, one off,” “both on,” “both off” and all possible combinations like this.  
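A minimal sketch of the combination testing described above, assuming flags are passed to the code under test as a dictionary of booleans. The flag names and the checkout_total function are hypothetical stand-ins for real application code.

```python
from itertools import product


def flag_combinations(flag_keys):
    """Yield every on/off combination of the given flags as a dict."""
    for states in product([False, True], repeat=len(flag_keys)):
        yield dict(zip(flag_keys, states))


def checkout_total(subtotal: float, flags: dict) -> float:
    """Hypothetical code under test that reads two interacting flags."""
    shipping = 0.0 if flags["free-shipping"] else 5.0
    discount = 0.1 * subtotal if flags["holiday-discount"] else 0.0
    return subtotal + shipping - discount


if __name__ == "__main__":
    # "One on, one off", "both on", "both off": four combinations in total.
    for flags in flag_combinations(["free-shipping", "holiday-discount"]):
        total = checkout_total(100.0, flags)
        assert 0 < total <= 105.0, flags
        print(flags, "->", total)
```

The number of combinations doubles with every flag added, so in practice this is most useful for the small set of flags that are likely to interact.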

  • How do you think feature flags play into the DevOps movement? What about Continuous Delivery?

I think feature flags play in both continuous delivery and continuous deployment. I think they’re most useful to continuous deployment. You have all of your features pushed out to production as soon as they compile essentially – but they are behind a feature flag so you don’t break anything. That’s the way Facebook does it. They know that any new code they write might end up in production so it’s going to be safe behind a feature flag.  

“The design of the DevOps movement, the aim of it really, is to get real features and real value in the hands of users as quickly as possible.”

So if you have to wait until this “half done piece of work” is actually safe to deploy then that slows you down. So having it behind a feature flag so that it doesn’t get touched until you are ready to test it can be really powerful for increasing velocity and getting things out to production much faster.     

So even for marketing teams, it means they don’t have to tell the developers “hey we worked out the result of this a/b test and we want option b.” If the marketing department can just flip that switch and say “no, option b is working better so just leave it there” without a new deployment or contacting the developers to remove the old stuff and redirect to the new stuff, that increases that team effort of getting value to customers which is the whole purpose of DevOps.

  • Can you share any tips for better flagging?

If you’re feature flagging a big change, pair the feature flag with a branch by abstraction pattern. See the clip from my talk from NDC Sydney for more details.

There’s also the concept of transitional deployments – again refer to my video clip here for more. It’s useful for things like database schema changes where you have a midpoint for both the new and old applications that will work with the schema that’s currently there. So you can turn that feature off if you need to.

  • Are you seeing feature flagging evolving? If so how?  And how do you expect it to change in the future?

It’s been around for a long time…but I think it’s becoming much more visible – and partly it’s LaunchDarkly helping with that. I think more people will start using feature flags in their continuous delivery pipeline. And the more continuous delivery becomes mainstream, the more mainstream developers will need feature flags.  

“I think feature flagging is starting to be something that you have to add to your deployment cycle because you know it needs to be fast and you know your feature needs to get to production as quickly as possible – and feature flags are the way to do that.”

So as it becomes more mainstream I think there will be more tools, more frameworks, more awareness of it (feature flags) as it hits more and more companies. I think there will be things coming out like feature flag-aware testing tools – so testing tools that know that they need to test with this flag on and off.  

The summary – more tools around best practices around this thing which is becoming more mainstream.  With DevOps becoming more popular, more people are thinking “yes we need to get to production quicker, we need that cycle time to reduce” so it’s a natural extension I think to start solving some of those problems with feature flags.  

“I think it’s just starting to become more mainstream frankly because it’s a solution to a problem that is starting to become more mainstream.”   

22 Nov 2016

Soft Launches Using Feature Flags

Feature Flag / Toggle Soft Launch

Getting granular, real-time control over your feature releases

Let’s imagine your team wants to launch a new one-click checkout feature.  You’ve been working on it for months: designing, iterating, and developing.  Now, you’re ready to release it into the wild.  

In the old days, we would just release the new feature via a code deployment.  It would be live for everyone. If something crashed, we would need to do a code deployment to roll back the changes.  Since everything would be live all at once, every customer would feel the pain.  Even if the feature didn’t break, it would take days or weeks to truly measure customer satisfaction.  Was the new feature increasing sales?  Are people enjoying it? Is it hurting engagement?

A feature flag

A feature flagged soft launch changes this. It allows you to mitigate the risk of feature releases and incrementally roll out a feature to your users.

Some benefits include:

  • Releasing a feature ‘off’ in production and then slowly rolling it out
  • Allowing only particular users to see the feature
  • Performing randomized percentage rollouts that target small segments of users
  • Killing a feature instantly without redeploying
  • Adjusting the rollout percentage to test infrastructure performance and scalability
  • Syncing PR and advertising with feature releases

How it works

By definition, a soft launch is a way to release a new product or service to a segment of your audience in advance of a full launch.  Traditionally, we would think of these as alpha or beta opt-in releases.  These would be managed at the database or configuration level, where you would specify users who would receive the beta feature in a very manual way.

Feature Flag / Toggle Soft Launch

With feature flags, you can practice intelligent soft launches.  A feature flag is a way to control the progression of a feature throughout its lifecycle, from design and development to release and rollback.  It also allows you to granularly target segments of your users for beta tests and incremental percentage rollouts.

Soft launching with a feature flag

Let’s use our one-click checkout feature from above as an example.  The goal of this new feature is to make it easier for our customers to purchase an item by removing the friction of multiple checkout steps.  However, we cannot truly know how our customers will enjoy this feature until they have tried it.

Here are some of our release uncertainties that cannot just be tested in a staging environment:

  • Will it handle a production-level load?
  • How will it impact support costs?
  • Does it work well on all versions of all browsers?
  • What edge cases will create stability, security, and performance issues?
  • Will customers actually like it?

Of course, we will have tested this feature on our development environments and performed a closed beta.  But, these modes of testing cannot simulate our fully-scaled production environment that encompasses a whole range of users, from our early adopters to our laggards.

So, what we’ll do is wrap our one-click checkout in a flag:

Anatomy of a feature flag for a soft launch

Basically, we can pass a series of user attributes to a flag, like “name, email, age, group, beta”, and then the flag’s targeting rules determine whether the feature is on or off for that user.
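A rough sketch of this evaluation, assuming the user is a dictionary holding the attributes named above. The Flag class, its fields, and the flag key are hypothetical and only illustrate how targeting rules can turn a set of user attributes into an on/off decision.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    key: str
    on: bool = True  # master switch: turning it off hides the feature for everyone
    target_emails: set = field(default_factory=set)  # individually targeted users
    rules: list = field(default_factory=list)        # (attribute, expected value) pairs

    def evaluate(self, user: dict) -> bool:
        """Return True if the feature should be shown to this user."""
        if not self.on:
            return False
        if user.get("email") in self.target_emails:
            return True
        # The feature is served if any rule matches one of the user's attributes.
        return any(user.get(attr) == expected for attr, expected in self.rules)


one_click_checkout = Flag(
    key="one-click-checkout",
    target_emails={"qa@example.com"},
    rules=[("beta", True), ("group", "internal")],
)

if __name__ == "__main__":
    user = {"name": "Sam", "email": "sam@example.com", "age": 34,
            "group": "customers", "beta": True}
    if one_click_checkout.evaluate(user):
        print("serve one-click checkout")
    else:
        print("serve the old checkout")
```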

Here, we have rolled out our one-click checkout feature to 20% of users, while 80% still receive the old checkout:

Feature Toggle Roll Out for Soft Launch

Using a slider, we can easily control a feature’s release in real-time without having to redeploy.  The percentage can be increased or decreased depending on the feature’s performance.

Soft launching best practices

For an introduction to the benefits of soft launches, Mobilize provides a nice overview using the example of a mobile game launch. You should also reference this guide to feature flagging best practices to learn how to incorporate feature flagging into your soft launch.