08 Aug 2017

Flexible Infrastructure with Continuous Integration and Feature Flagging


I’m incredibly excited to be LaunchDarkly’s first solutions engineer. During my first week I got to learn about some of the clever ways we do feature management. Not only do we use feature flags to control the release of fixes and new features, but we also use them to manage the health of our infrastructure in production. I’ve been a part of a number of teams, and I’ve never seen a more advanced development pipeline.

Normally, dealing with issues in production can be a frightening and time-consuming experience, but adopting a mature continuous delivery pipeline can allow you to react faster and be proactive. Continuously integrating and deploying makes getting fixes into your production code a trivial task, but using feature flagging takes it to the next level and lets you put fixes in place for potential future pain points that you can easily enable without having to do another deploy.

One common problem is handling extreme server load, which LaunchDarkly makes easier to manage. Imagine you have a server that is pulling time-sensitive jobs from a queue, but the queue fills faster than the server can drain it, and every job ends up failing. In situations like these it would be better to get at least some of the jobs done instead of none of them. This concept is known as “bend-don’t-break”.

I built a proof of concept using Python and RabbitMQ which demonstrates how you could use LaunchDarkly’s dashboard to control what percentage of jobs get done while the rest are thrown away. If the worker takes too long to get to a job, the job fails. As you see the queue grow, you can manage it easily with feature flags.

It consists of two scripts, taskQueuer and taskWorker. The taskQueuer adds imaginary time-sensitive jobs to the queue; the rate is configurable using feature flags.

The taskWorker removes one job from the queue and processes it. One job takes one second. If tasks are queued faster than the worker can process them, the queue fills up and the worker will begin failing. To protect against this, you can use the “skip rate” feature flag to allow the worker to drop a certain percentage of jobs on the floor.
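To make the skip logic concrete, here is a minimal sketch of how a worker might shed load. It is not the code from the proof of concept: `get_skip_rate` is a hypothetical stand-in for reading the “skip rate” flag, and an in-memory deque stands in for the RabbitMQ queue.

```python
import random
from collections import deque


def get_skip_rate():
    """Hypothetical stand-in for reading the "skip rate" flag (0.0 to 1.0)."""
    return 0.5


def should_skip(skip_rate, rng=random):
    """Return True when this job should be dropped to shed load."""
    return rng.random() < skip_rate


def drain(queue, skip_rate_fn=get_skip_rate, rng=random):
    """Process every queued job, dropping roughly skip_rate of them.

    Returns (processed, skipped) lists so the effect is easy to inspect.
    """
    processed, skipped = [], []
    while queue:
        job = queue.popleft()
        # Re-read the flag for every job so a change made in the dashboard
        # takes effect without restarting or redeploying the worker.
        if should_skip(skip_rate_fn(), rng):
            skipped.append(job)
        else:
            processed.append(job)  # the real work would happen here
    return processed, skipped
```

Reading the rate inside the loop, rather than once at startup, is what lets an operator turn the dial while the worker is running.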

The concept of using LaunchDarkly as a control panel to manage the operation of your app is really cool and opens up a world of possibilities beyond simply percentage rollouts and canary releases. If you have an interesting implementation, get in touch with us at hello@launchdarkly.com and maybe we’ll feature it!

More about tough devops: https://insights.sei.cmu.edu/devops/2015/04/build-devops-tough.html
More about using Python with RabbitMQ: https://www.rabbitmq.com/tutorials/tutorial-one-python.html
More about LaunchDarkly’s stack: https://stackshare.io/launchdarkly/how-launchdarkly-serves-over-4-billion-feature-flags-daily

03 May 2017

Integrating Feature Flags in Angular v4

A little while ago, we blogged about eliminating risk in your AngularJS applications by leveraging feature flags. Like all good web frameworks, Angular continues to release new versions, providing opportunities to tweak and update your code. The benefits of Angular over its predecessor include a built-in compiler, type enforcement, and a complete rewrite in TypeScript: all valuable updates for reducing agony within the software development lifecycle.

If you’re thinking of making the switch to Angular, or are already using it, LaunchDarkly is here to help you eliminate the risk all the way from your initial migration to future successful launches. In this article, we’ll discuss how to eliminate risk and deliver value in your Angular project.

We’ll build on Tour of Heroes (which we’ll refer to as TOH from here on out), a demonstrative Angular app which showcases the framework’s basic concepts. Essentially, TOH is a live roster of superheroes, including search functionality and the ability to modify hero details. To learn more about TOH, and to get familiar with Angular, check out the official tutorial.

Creating our Feature Flags
Suppose we want to limit the usage of our search and modify features to a certain subset of our users. To achieve this, we’ll create two feature flags, toh-search and toh-modify. In our case, we’ll allow logged-in users access to search, and only the administrator will be able to modify heroes.

An implementation of toh-search in the LaunchDarkly console

Integrating

Now, we’ll create a service which handles everything LaunchDarkly’s JavaScript SDK will throw at us. Note: for simplicity, we use a dummy user-switching feature (located in the user component of the project folder).

LaunchDarklyService’s constructor starts by initializing the SDK, and follows up by calling the built-in on method, which will update the feature flag values within our app whenever the user is changed or the feature flag configurations are modified. This is handled by a Subject-typed variable, flagChange, which will later be subscribed to by the app’s components.
With our service functional, we’re now able to inject it as a provider into TOH’s “search” and “hero” components, granting them full access to our feature flags!

In the hero-search component, we subscribe to the aforementioned flagChange, which will let Angular know that the search component should be toggled whenever the respective feature flag configuration is changed. The hero component is modified in a similar fashion to introduce the toh-modify flag.

See it in action!

Search:

Modify:

Be sure to check out the complete project on GitHub. We’d love to see what other features you can build into Tour of Heroes!

12 Apr 2017

How Spinnaker and Feature Flags Together Power DevOps

It’s very common for customers to be excited about both Spinnaker (a continuous delivery platform) and feature flags. But wait: aren’t they both continuous delivery tools? Yes, they share the same goal: getting code quickly, and in a repeatable, non-breaking fashion, from the hands of the developers into the arms of hopefully excited end users, with a minimal amount of pain and heartache for everyone along the toolchain. But they solve different parts of that problem:

  • Spinnaker helps you deploy functionality to clusters of machines.
  • Feature Flags help you connect that functionality to clusters of USERS.

Spinnaker helps with “cluster management and deployment management”. With Spinnaker, it is possible to push out code changes rapidly, sometimes hundreds (if not thousands) of times a day. As Keanu Reeves would say, “Whoa.” That’s great! All code is live in production! Spinnaker even has handy tools to run red/black deployments where traffic can be shunted from cluster to cluster based on benchmarks. Dude! For those who remember the “Release to Manufacturing” days where binaries had to be put on an FTP server (in the hope that someone would download and install them in the next quarter or so), code being live within a few minutes of being written is amazing. For those who remember “master disks” and packaged software, this is even more amazing.

Nevertheless, with dazzling speed comes another set of problems. All code can be pushed anytime. However, many times you do not want everyone to have access to the code – you want to run a canary release on actual users, not just machines. You might want QA to try your code in production, instead of on a test server with partial data. If you’re a SaaS product, you might want your best customers to get access first to get their feedback. For call center software, you want to have an opportunity to test in a few call centers. You might want to have a marketing push in a certain country days (or weeks or months) after another country. You might want to fine-tune the feature with some power users, or see how new users react to a complicated use case. None of these scenarios can be handled at the server level. This is where feature flags come in. By feature flagging, you can gate off a code path, deploy using Spinnaker, and then use a feature flag to control actual access.

Together, Spinnaker and feature flagging make an amazing combination. You can quickly get code to “production”, and from there decide who gets it, when.

14 Mar 2017

My Agile Launch

Starting at a company that helps software teams release faster with less risk has reminded me of my first foray into agile development.

One of my earliest professional experiences was as an intern at HBO, where I reported directly to the VP of Emerging Technologies.  Over the course of the summer, it became clear that there were more promising new technologies to explore than there were engineers.  The undaunted college student that I was, I convinced my boss that I should own one of these projects.  Every day I would demo my work on a screen in his office, and he would provide feedback.  A few weeks later the VP presented to senior management and the company officially green-lit the project.  Though we didn’t refer to it as such, that cycle was my introduction to Agile.  Yet more importantly, it was a daily routine where a non-technical business stakeholder provided direct feedback to a technical resource; an arrangement that I’ve come to realize is quite uncommon.

Working at large and small companies in a variety of engineering roles, there are two overall trends that I’ve identified:

  1. Companies often separate engineering from the business side.
  2. Most development-related failures (e.g. missed deadline, bad or buggy feature, release causing a security vulnerability, etc.) are a result of miscommunication.

Neither of these statements is profound in isolation, but together they genuinely pique my curiosity.

Over time the broader engineering community has developed numerous tools and processes to mitigate risk.  Missing deadlines?  Use story points to measure team output (velocity) over time.  Releasing buggy features?  Try test-driven development.  Want to avoid downtime during a release?  Set up application performance monitoring in your staging environment.  While these “quality assurance” measures are not guarantees of perfection, they make development more predictable.

Problems arise when we cut corners in response to misaligned expectations.  Let’s say there’s a feature request for a relatively straightforward user enhancement.  The development team has done everything right to this point (features properly designed and scoped), but towards the end of the sprint the team finds a scalability issue with no clear path forward.  Engineering management notifies the business side of this issue and insists that there will be a resolution within five business days.  Five days pass and the developers have made no progress.  Engineering rushes to fix the issue through a refactor but skips unit testing to release earlier.  Two new bugs slip into production.  Instead of working on the next sprint, the development team now works to get a hotfix release out.  One unexpected event can change everything.

The separation of the business side from its engineering counterpart sets the stage for frustration and missed opportunities.  Any tool that can bridge the gap between these two groups offers immense value to an organization or a working relationship.  Cucumber, a Behavior-Driven Development test tool, empowers non-technical stakeholders to define requirements, in plain English, that double as executable test guidelines for engineers.  By regularly reviewing Cucumber test results, a business stakeholder could easily assess the current status of a project.  Nevertheless, Cucumber facilitates one-way communication and offers no clear guidance on iteration or state.

The daily iterations I had at HBO were extremely effective in moving the project forward quickly. In a perfect world we could repeat this process as often as possible with senior management to win their approval as early as possible.  What if we schedule a daily meeting with the entire senior management team and show up with a buggy build?  It would quickly become apparent that our good intentions are far less valuable than executive time.  Instead, what if we gave each one of the senior managers a version of the technology that we could push updates to over time?  What if I could focus on building features and my boss could choose and deploy the ones that he thought were ready?  What if after realizing that there was a problem with a deployed feature he could hide that feature without involving me?  LaunchDarkly does all of this at the enterprise level.

To me LaunchDarkly is about much more than feature flagging or even quality assurance; the platform empowers companies to reconnect the technical and non-technical departments in order to shorten feedback cycles with customers and make better decisions.

This mission is a game changer.

That its founders and team are all extraordinary yet humble makes the opportunity twice as appealing.

22 Dec 2016

A New Way To Beta Test


It’s best practice for products to have some sort of beta – a way to collect customer feedback and test performance before releasing to everyone. In an era of continuous delivery, we are delivering new features and experiences more frequently and with less time to gather thorough customer and performance feedback. With this increased cadence, product teams are having to make betas shorter, forgo them altogether, or slow down their release cadence to gather adequate customer feedback.

Challenges of traditional betas:

  • Coordinating Opt-Ins: It sometimes takes weeks or months to gather customer opt-ins to test new betas. You also have to organize the distribution of beta keys (e.g., for early access to games) and reminder emails.
  • Organizing Focus Groups: Getting feedback from focus groups is often time consuming and expensive, creating a long feedback loop that lengthens the release process.
  • Opt-Out: If customers opt in to a beta and don’t like the experience, then they will want a simple mechanism to switch back to the production version.
  • Granular Betas: It is very difficult to do targeted betas based on user attributes or to perform incremental percentage rollouts of new beta features.

Feature toggles

To overcome these challenges, smart product teams are beginning to run betas with feature flags/toggles. These are mechanisms for granularly controlling software releases, allowing you to control the timing and visibility of a beta release.

Currently, many betas are tied to code releases and are managed by a config file or database.  This approach requires engineering time or custom mechanisms to opt users in.

With feature toggles, you can empower product, marketing, sales, and even customers themselves to opt in to a new beta experience.

Feature Toggle Beta Test LaunchDarkly

In this simple example, you can use a toggle to control the visibility of a new beta feature. Ideally, this toggle would be part of a user interface that could be controlled by a non-technical team member. The code, itself, could be deployed off and then turned on via the toggle.

Beta Test Percentage Rollout with Feature Toggle LaunchDarkly

Moreover, you can also use the toggle to control the percentage of users who get the beta experience. For instance, you could release the new beta experience and have it rolled out to 0% of users. You could gradually increase the rollout percentage from 1% to 5% to 20% and more, collecting customer and performance feedback along the way.
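One common way to implement a percentage rollout is to hash each user into a stable bucket and compare that bucket with the current percentage. The sketch below illustrates that general technique; it is an assumption for illustration, not a description of how LaunchDarkly computes rollouts, and the names `bucket` and `in_rollout` are made up for this example.

```python
import hashlib


def bucket(flag_key, user_key):
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    # Use the first 8 hex digits (32 bits) and scale them into a percentage.
    return int(digest[:8], 16) / 0x100000000 * 100


def in_rollout(flag_key, user_key, percentage):
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_key) < percentage
```

Because a user’s bucket never changes, raising the percentage from 5 to 20 only adds users: everyone who already had the beta keeps it, which avoids people flickering in and out of the experience as the rollout grows.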

Surfacing this beta control functionality in a user interface is critical for giving non-technical team members access to release controls.

Regional betas

For a recent example of a targeted rollout, we can look at how Pokémon GO released their product country by country: first to the United States and then abroad.

This is a great use case for feature toggles because you can create targeting rules to determine which users receive the feature first. For example, I could create a toggle that is governed by the rule: “If users live in San Francisco, then serve the new Nearby Pokémon feature”. This allows you to maintain different regional feature sets without having to deploy different versions of the application. It also allowed Pokémon GO to refine their algorithms and assess customer feedback before rolling out the new feature to a wider audience.
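A targeting rule like the San Francisco one can be sketched as data plus a small evaluator. This is a simplified illustration under assumed names (`evaluate`, `NEARBY_POKEMON_RULES`, and the `city` attribute are all hypothetical), not the rule format of any particular product.

```python
def evaluate(rules, user, default=False):
    """Return the value served by the first matching rule, else the default."""
    for rule in rules:
        if user.get(rule["attribute"]) in rule["values"]:
            return rule["serve"]
    return default


# "If users live in San Francisco, then serve the new Nearby Pokémon feature."
NEARBY_POKEMON_RULES = [
    {"attribute": "city", "values": {"San Francisco"}, "serve": True},
]
```

For example, `evaluate(NEARBY_POKEMON_RULES, {"key": "ash", "city": "San Francisco"})` serves the feature, while a user in any other city, or with no city recorded, falls through to the default. Expanding the beta to another region means adding a value to the rule rather than shipping a new build.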

Benefits of beta testing with feature toggles

  • Empowered non-technical users: Allow the sales, marketing, product, design, and business teams to turn features on for specific users, collect feedback, and control the business logic. This also substantially cuts down on engineering time.
  • Production feedback for your beta tests: Test features in production with limited user segments to collect customer and performance feedback.
  • Incremental percentage rollouts: Gradually roll out features to incrementally test performance and mitigate risk. If the feature is bad, toggle it off.
  • Real-time opt-in / opt-out: Allow users to opt in and out of beta tests in real time, controlled via the feature toggle. Skylight provides a nice article on this.

Getting started with toggles

Conceptually, a feature toggle is relatively simple. You create a conditional in your code that controls the visibility of a code snippet. There are many open source libraries that will allow you to get started.  However, these libraries become cumbersome when you try to feature toggle at scale or restrict access to particular toggles. Depending on your needs, you could consider a feature toggle management platform to provide a system for access control and mitigating technical debt.
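In its simplest form, that conditional might look like the sketch below. The `FLAGS` dictionary is a hypothetical stand-in for wherever the toggle state lives (a config file, a database, or a management platform), and the flag name is invented for the example.

```python
# Hypothetical in-memory flag store; the new code ships "off".
FLAGS = {"new-dashboard": False}


def is_enabled(flag_key, default=False):
    """Look up a toggle, falling back to a safe default if it is unknown."""
    return FLAGS.get(flag_key, default)


def render_home():
    """Choose a code path based on the toggle."""
    if is_enabled("new-dashboard"):
        return "new dashboard"
    return "classic dashboard"
```

The fallback default matters: if the flag is missing or the store is unreachable, users should land on the known-good path.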

29 Nov 2016

Toggle Talk with Damian Brady

I sat down with Damian Brady, Solution Architect at Octopus Deploy, for a conversation about his experience with feature toggles.  He shared with me his tips for best practices, philosophies on when to flag and what he thinks the future of feature flagging will look like.

  • How long have you been feature flagging?

I had to think about this one a bit – about 8 years ago but I probably didn’t know what it was called at the time.

“It’s definitely the case that people are doing this without knowing the name “feature flag” or even giving it a name. They’re just saying it’s a configuration switch or a toggle, but not giving it a more proper name; they’re not identifying it as a first-class citizen really.”

  • What do you prefer to call it and why?

Now I call it feature flagging or occasionally feature toggles. I think toggles makes a bit more sense as an analogy for non-technical people.

  • When do you think feature flagging is most useful?

There’s a couple – but the one I think it’s most useful for is to use a feature flag when you have a feature that is nearly complete or complete from your point of view. Either way, you are ready to get verification from someone with real data.

“You can test as much as you want with your pretend fake data, or even a dump from production which is being obfuscated, but until it gets used in the wild you’re never really sure that the feature is doing exactly what it needs to do.”  

So hiding that behind a feature flag, and then clicking it on for somebody who is using the product for real in any way gives you that last little test that is ultimately the most useful.  At that point you still have the opportunity to back out. If something was corrupt or your expectations were wrong, it’s really useful for that last-minute check.  

At Octopus, we’ve started using feature flags for big features that a lot of people don’t want to see. So a while ago we introduced the idea of multi-tenant deployments. And probably most of our users don’t need that feature because it adds a lot of complexity to the UI.  We have a configuration section where you can toggle an “on” and “off” switch, so if you don’t need that feature you can just leave it off.

  • Are there any cases where feature flagging is not a good idea?

I think there are two extremes where feature flagging is not a good idea. On one hand, flagging really small changes can be more trouble than it’s worth. It’s introducing an extra level of complexity that maybe for a small change is not critical.  

On the other side, using feature toggles around the architectural changes in the core of your application – that’s kind of hard to test. Do you have a feature flag that when you turn it “on” it completely redirects the way the entire application will run? In that case you bite the bullet and decide that this is a big change and you’re just going to have to test it very thoroughly and not give yourself a way out.

That being said, there are some cases where you still need to give yourself a way out by using a flag. For example, you might deploy some new feature thinking it’s correct, but subsequently learn from a customer or user that it doesn’t really meet their needs. Rather than the user living with a bad feature, you might want to turn the flag off and go back to the drawing board.

If it’s an architectural change, you may only find out that there’s a bug when you use it in production. Test data may not surface the issue properly.

Ultimately, doing core architecture changes in a way where you can back out later can be an extra huge amount of work. It’s probably at that point you know you aren’t going to do it (revert back) anyway.   

  • Best use of a feature flag – a personal story?

When I first started using feature flags, about 8 years ago, I was working on a big internal line-of-business web application.  We had just added a new third-party provider for sending SMS, and with this new provider we had to write a lot of new code.  It was internet banking software, so it was a one-time password we were sending out – and it was really, really important that it work.

We tested everything rigorously but wanted more insurance.  So we put the new service behind a feature flag. We had a bunch of agents that ran this type of SMS. We enabled a flag for one of the agents and monitored it to make sure it was actually doing the right thing and not failing. And then we started trying other ones. It failed a couple of times because of differences in the sandbox environment between the third-party provider and the real one.

“We thought everything was okay, but when we put it live we turned it on slowly, and it didn’t do what we expected.”

So when that happened we turned it back off again…and went back to the drawing board.

So without the feature flag, we would have dropped every person using the service at that critical point. That client would not have been able to receive SMS messages until we were able to roll back.

  • What do you think is the number one mistake that’s made around feature flagging?

There is one that I keep seeing – when you wrap a new feature you believe to be finished in a flag, the biggest mistake with this is not testing that change with the flag “on” and then “off”. For instance, when you turn it “on” it snaps into new database tables or starts changing the way data is saved. But when you turn it “off” again, you’ve lost that data or data is corrupt. For this you need to test it “on” and “off”.  

“If you have more than one feature flag running at the same time, test the combinations of them being both ‘on’ and ‘off’.”

If they’re likely to interact with each other you need to test “one on, one off,” “both on,” “both off” and all possible combinations like this.  
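As a rough illustration of testing every combination, a test suite could enumerate the on/off assignments and run once per assignment. The helper name `flag_combinations` and the flag names in the usage note are hypothetical.

```python
from itertools import product


def flag_combinations(flag_keys):
    """Yield every on/off assignment for the given flags as a dict."""
    for values in product([True, False], repeat=len(flag_keys)):
        yield dict(zip(flag_keys, values))
```

Calling `list(flag_combinations(["new-sms-provider", "new-schema"]))` produces four dictionaries covering “both on,” “one on, one off” (twice), and “both off.” The count doubles with every added flag, so this is practical only for the handful of flags that are likely to interact.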

  • How do you think feature flags play into the DevOps movement? What about Continuous Delivery?

I think feature flags play in both continuous delivery and continuous deployment. I think they’re most useful to continuous deployment. You have all of your features pushed out to production as soon as they compile essentially – but they are behind a feature flag so you don’t break anything. That’s the way Facebook does it. They know that any new code they write might end up in production so it’s going to be safe behind a feature flag.  

“The design of the DevOps movement, the aim of it really, is to get real features and real value in the hands of users as quickly as possible.”

So if you have to wait until this “half done piece of work” is actually safe to deploy then that slows you down. So having it behind a feature flag so that it doesn’t get touched until you are ready to test it can be really powerful for increasing velocity and getting things out to production much faster.     

So even for marketing teams, it means they don’t have to tell the developers “hey we worked out the result of this a/b test and we want option b.” If the marketing department can just flip that switch and say “no, option b is working better so just leave it there” without a new deployment or contacting the developers to remove the old stuff and redirect to the new stuff, that increases that team effort of getting value to customers, which is the whole purpose of DevOps.

  • Can you share any tips for better flagging?

If you’re feature flagging a big change, pair the feature flag with a branch by abstraction pattern. See the clip from my talk from NDC Sydney for more details.

There’s also the concept of transitional deployments – again refer to my video clip here for more. It’s useful for things like database schema changes where you have a midpoint for both the new and old applications that will work with the schema that’s currently there. So you can turn that feature off if you need to.

  • Are you seeing feature flagging evolving? If so how?  And how do you expect it to change in the future?

It’s been around for a long time…but I think it’s becoming much more visible – and partly it’s LaunchDarkly helping with that. I think more people will start using feature flags in their continuous delivery pipeline. And the more continuous delivery becomes mainstream, the more mainstream developers will need feature flags.  

“I think feature flagging is starting to be something that you have to add to your deployment cycle because you know it needs to be fast and you know your feature needs to get to production as quickly as possible – and feature flags are the way to do that.”

So as it becomes more mainstream I think there will be more tools, more frameworks, more awareness of it (feature flags) as it hits more and more companies. I think there will be things coming out like feature flag-aware testing tools – so testing tools that know that they need to test with this flag on and off.  

The summary – more tools around best practices around this thing which is becoming more mainstream.  With DevOps becoming more popular, more people are thinking “yes we need to get to production quicker, we need that cycle time to reduce” so it’s a natural extension I think to start solving some of those problems with feature flags.  

“I think it’s just starting to become more mainstream frankly because it’s a solution to a problem that is starting to become more mainstream.”