20 Oct 2017

Keeping Client-Side Feature Flags Secure

LaunchDarkly’s JavaScript SDK allows you to access feature flags in your client-side JavaScript. If you’re familiar with our server-side SDKs, setup for our JS SDK looks superficially similar—you can install the SDK via npm, yarn, or a CDN and be up and running with a few lines of code. However, if you dig a bit deeper into the source code and do an apples-to-apples comparison with our Node SDK, you’ll see that our client-side SDK works completely differently under the hood.

There’s one simple reason for this difference—security. Because browsers are an untrusted environment, our client-side SDK uses a completely different approach that allows us to serve feature flags to your users securely. While we’ve made our approach as secure as possible out of the box, there are three best practices that we recommend all companies follow to maximize security when using client-side feature flags.

1. Choose Appropriate Use Cases

First, in a browser environment your entire JavaScript source code is fully exposed to the public. It may be minified and obfuscated, but obfuscation is not security. This means that if you’re using client-side feature flags, you must be aware that anyone can modify your source code directly and bypass your feature flags. For example, if you have a code snippet like this:
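
(Illustrative sketch: this assumes the JS SDK client has already been initialized as ldclient, and showFeature and showOldUI are hypothetical render functions.)

    // Check a flag on the front-end and branch the UI accordingly.
    // Anyone can open their dev tools and flip this condition by hand.
    var showBrandNewFeature = ldclient.variation('brand.new.feature', false);
    if (showBrandNewFeature) {
      showFeature();   // render the brand new feature
    } else {
      showOldUI();     // fall back to the existing experience
    }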

You can expect that any of your end users could inline edit the above code snippet in their browser window to show the brand new feature, bypassing the feature flag and its rollout rules entirely. This is a fact about how client-side feature flags work—not a limitation in the LaunchDarkly platform.

This doesn’t mean that client-side feature flags are always a security vulnerability—it just means that there are some use cases where a client-side feature flag alone isn’t appropriate. For example:

  • Launching a feature that is press-, publicity-, or competitor-sensitive. In this situation, having a user enable the feature early could derail a launch, or tip your company’s hand. Even the feature flag key itself (e.g. brand.new.feature) could be sensitive. In this scenario, you can’t afford to push the new front-end code to production prior to launch, even if it’s kept dark.
  • Entitlements. A client-side feature flag alone is not a sufficient control for locking users out of functionality they shouldn’t be able to access. It works fine, though, when it’s coupled with a back-end check: use the same feature flag (with the same rollout rules) on both the front-end (to disable a UI element) and the back-end (to control access to the REST endpoint that UI element triggers), as sketched below.
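
A rough sketch of the back-end half of that pairing, assuming the Node server SDK and a hypothetical Express route (the route path and the shape of req.user are illustrative):

    // Server-side guard using the same flag key and rollout rules as the UI.
    // Even if a user re-enables the hidden button in dev tools, the API still says no.
    app.post('/api/brand-new-feature', function (req, res) {
      var user = { key: req.user.id };
      client.variation('brand.new.feature', user, false, function (err, showFeature) {
        if (!showFeature) {
          return res.status(403).send('Feature not available');
        }
        // ...actual feature logic goes here...
        res.send('ok');
      });
    });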

Using client-side feature flags securely starts with choosing appropriate use cases. Ultimately, the choice depends on whether you can expose your new feature’s code to users who may not have the feature enabled.

2. Enable Secure Mode to Protect Customer Data

There’s a corollary to the observation that all JS code is fully exposed to the public—there is no way to securely pass authentication credentials to a 3rd party API from the front-end alone. Because anyone can view source, any API key or token that you pass to a 3rd party API via client-side JS is immediately exposed to users.

One solution to this problem is to sign requests—use a shared secret on the server side to sign the data being passed to the 3rd party API, ensuring that the data hasn’t been tampered with and the request hasn’t been forged. Tools like Intercom take this approach. In Intercom’s case, request signing guarantees that a user can’t fetch chat conversations for a different user.

LaunchDarkly follows this approach as well. We call this Secure Mode, and when enabled, it guarantees that a user can’t impersonate another user and fetch their feature flag settings. Our server-side SDKs work in tandem with our client-side JavaScript SDK to make Secure Mode easy to implement. We recommend that all customers enable Secure Mode in their production environments.
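
As a minimal sketch, the server computes a hash for the current user and the browser passes that hash when it initializes; the client-side ID and user key below are placeholders:

    // Server side (Node SDK): sign the current user with your SDK key.
    var hash = ldClient.secureModeHash({ key: 'user-123' });

    // Client side (JS SDK): hand that hash to the client at initialization.
    // A request for some other user's flags won't carry a valid hash, so it's rejected.
    var client = LDClient.initialize('YOUR_CLIENT_SIDE_ID', { key: 'user-123' }, { hash: hash });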

3. Use a Package Manager to Bundle the SDK

There’s another aspect of security that comes with using any 3rd party service in your JavaScript—protecting yourself in case that service is attacked. Our recommended best practice is to bundle our SDK into your code with a package manager, rather than injecting the SDK via a script tag. We offer easy installation with popular package managers like npm, yarn, and bower. This does put the onus on you to upgrade when we release new versions of our SDK, but it can also reduce the latency of fetching the SDK, and it gives you control over when you accept updates.
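
For example, with npm the setup looks something like the sketch below (ldclient-js is the npm package name as of this writing; the flag key and user are placeholders):

    // After running `npm install ldclient-js --save`, the SDK is bundled with
    // your own code instead of being injected at runtime from a third-party host.
    var LDClient = require('ldclient-js');

    var client = LDClient.initialize('YOUR_CLIENT_SIDE_ID', { key: 'example-user' });
    client.on('ready', function () {
      var showFeature = client.variation('brand.new.feature', false);
    });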

Conclusion

We go to great lengths to ensure the security of our customers and their end users. We follow industry best practices in the operation of our service (we recently completed our SOC2 certification) in addition to providing advanced security controls in our product such as multi-factor authentication, scoped API access tokens, and role-based access controls. This approach to security extends to our client-side JavaScript SDK. By following the practices outlined here, you can ensure that your use of client-side feature flags keeps your data and your customers’ data safe.

18 Oct 2017

How We Beta Test at LaunchDarkly


We recently looked at how some well-known companies beta test. Specifically, we looked at groups that test in production and do it well. Testing in production is one of the best ways to find bugs and get solid feedback from your users. While some may shy away from it because of the risks involved, there are ways to mitigate that risk and do it right. So this time we want to share how we beta test at LaunchDarkly.

It’s no surprise that we dogfood at LaunchDarkly. Using feature flags within our development cycle is a straightforward process. We often push features directly into our production environment and safely test them before allowing user access. When it’s time to beta test with users, we update the targeting on the appropriate flag and get user feedback quickly. And of course, if we ever need to, we can turn features off instantly.

Deciding Which Things to Build

When we’re thinking about new features to implement, we have our own ideas about which direction our product should go, but we also consider inbound requests. These can come from support tickets, questions from potential customers, or conversations with existing customers. The bottom line is that we want to build a product that serves our customers, so we do our best to listen to what they want.

Once we identify a feature we’d like to build—whether it was our own idea or a customer request—we’ll share it out to see if other customers are also interested. This is an important part of our beta testing process, because once the feature is in production and we’re ready to test it, these are the people we want to circle back with for beta testing.

Testing in Production

When it’s time to test, we test with actual end users in production. Our feature management platform allows us to turn features on for specific users. We can specify individual users, or we can expose users by attribute, like region (everyone in Denver), and we can turn a feature off again at any time.
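
Under the hood, that targeting keys off the user object we hand to the SDK. Something like this, where the flag key and the region custom attribute are illustrative:

    // The attributes on this user are what the targeting rules match on,
    // e.g. "turn the flag on for everyone whose region is Denver".
    var user = {
      key: 'user-123',
      email: 'beta-tester@example.com',
      custom: { region: 'Denver' }
    };
    var client = LDClient.initialize('YOUR_CLIENT_SIDE_ID', user);
    client.on('ready', function () {
      var inBeta = client.variation('new-beta-feature', false);
    });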

Because we’re testing in production, we don’t need an isolated environment or a separate account. For those customers who showed interest and agreed to participate in beta testing, we turn the features on in their production accounts.

Typically we run a beta for two weeks, sometimes as long as a month. As mentioned before, since we know which customers are interested in the feature, we can go back to them and have them test it. These are the users who already know they want this functionality, so we want to be sure it meets (or exceeds!) their expectations. And of course we want to make the most of this time, so it’s important that we actually get feedback. We find that those who have asked for the feature are eager to let us know how things are working. We also make a point of following up with those who don’t proactively offer feedback—we want to hear from everyone!

While we’re testing and getting feedback, we’re taking all this information in and improving the feature before rolling it out to everyone else. When we feel confident we have something that’s ready to be shared, we’ll begin a percentage rollout to the rest of our users.

Embracing Failure

Using feature flags within our development cycle allows us to mitigate risk by pushing out small, incremental changes. As you can see, this also enables us to beta test quickly and safely. If there are major bugs, we’re more likely to identify them early on, before they affect all of our customers.

“Embrace failure. Chaos and failure are your friends. The issue is not if you will fail, it is when you will fail, and whether you will notice.” -Charity Majors

Right now we’re in beta for scoped access tokens and a new, faster .NET SDK. Let us know if you’d like to take an early look at either one; we’d love to hear what you think.

02 Oct 2017

Removing Risk from Product Launches: A webinar with LaunchDarkly, CircleCI, and GoPro

We recently sat down with one of GoPro’s Senior Engineering Managers, Andrew Maxwell, and the CTO of CircleCI to discuss reducing risk in product launches. Andrew talked about how his team delivered their code two weeks early, tested in production, and had an overall successful product launch. He went into detail, sharing how his team uses continuous integration and feature flags to make product launches like that possible.

The Big Launch

Andrew’s team is responsible for web applications. Last September his team was focused on a big initiative around a product launch called GoPro Plus, which allows users to access and share content wherever they are. This launch included both mobile and desktop apps, and promoted two new cameras in the GoPro line.  

Two Weeks

The team delivered their code and pushed it into production two weeks early:

“…we used LaunchDarkly to push our code to production, turn the apps off—off by default—and then make sure that we had everything pushed out, deployed, and the infrastructure running live without customers actually seeing it.”

Two weeks gave them time to do a full integration test of their features. They tested both in-house and in production—slowly opening up who had access—so they could get valuable feedback, find bugs, and make continuous improvements in the weeks leading up to the big launch. On launch day, they were confident in their work and simply turned on 12 feature flags.

Check out the full webinar below to learn more about how the team used CircleCI and LaunchDarkly, their planning strategies, and best practices for their continuous integration pipeline.

26 Sep 2017

Do Away with Duct Tape: Infrastructure Rollouts a Safer Way

So I was told I needed to write an introductory blog post for my first week at LaunchDarkly. Two months later, here I am writing my intro post. Seems like I got a little carried away writing code before getting to this! My name is Zuhaib and I will be working as a software developer on anything back-end related. Previously I worked for a small startup called Atlassian on a chat product called HipChat.

Feature flags are not something new to me. I’ve seen a few homemade solutions in the past that always seemed to leave me wanting more. Most systems would let you turn features on or off but had a very limited ability to target users or control the percentage rolled out. It was at Atlassian that I got my first exposure to LaunchDarkly. We switched from our in-house solution to LaunchDarkly and it was great. It helped us get new features out to our customers faster and, more importantly, gave us control over features and backend services so we could disable them if they started to act up.

As you’d expect, we at LaunchDarkly do much the same, using advanced feature flags to make sure LaunchDarkly users have a good experience. One way we do that is by controlling the rollout of infrastructure changes like database migrations.

Recently we needed to test an upgrade of our ElasticSearch cluster without impacting users. So we used a percentage rollout in LaunchDarkly and slowly routed a subset of our users to the new cluster while we watched performance and stability.

Our flag allows us to control which cluster gets writes, which cluster gets reads, or which gets both. If we find a problem, we disable the flag and users go back to the other cluster. If it’s performing well, we increase the percentage of users on the new database. Today we have rolled out the new cluster to 75% of users and are working on getting it up to 100%. The code for this is as simple as adding a new statement that writes to either or both clusters:
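
(A simplified sketch with the Node SDK; the flag key, its 'old'/'new'/'both' values, and the write helpers are illustrative.)

    // A multivariate flag tells us where this user's writes should go.
    ldClient.variation('elasticsearch-write-target', user, 'old', function (err, target) {
      if (target === 'old' || target === 'both') {
        writeToOldCluster(doc);   // hypothetical helper for the existing cluster
      }
      if (target === 'new' || target === 'both') {
        writeToNewCluster(doc);   // hypothetical helper for the upgraded cluster
      }
    });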

While we may only be evaluating this second cluster for a short period of time, we’re actually leaving the flag in place, as it gives us a clean control mechanism for recovering from backups, or performing major version upgrades to ElasticSearch without customer impact.

This is just one of the many things you can do with LaunchDarkly flags for infrastructure changes.

19 Sep 2017

OpenAPI and Transparent Process

At LaunchDarkly, we’ve put a bunch of time into making our console fast and usable, and we’re pretty proud of it.

However, we’re aware there are lots of reasons people would want to use an API to create and manage feature flags. Since we use our own API to drive the dashboard, it’s easy for us to keep everything in sync when we make changes to the API. And we wanted to make it easy for our customers to do the same.

We started out with ReadMe, which is an excellent industry standard. That let us publish our documentation and keep it dynamic. For further refinements, we went to Swagger/OpenAPI. We liked it because:

  • It’s a well-known and widely-used format
  • It allows us to generate usable code snippets and examples in many languages automatically
  • It’s easy to add context and documentation as we go
  • We can host it on readme.io or other places, depending on our traffic needs

We created our REST specification using OpenAPI, and you can find our documents here: LaunchDarkly OpenAPI.

We’re still working on adding examples, descriptions, and context, but we think that the documentation is stronger and more usable already.

As always, if you have any comments, you can contact us here, or make comments directly in the repository.

28 Aug 2017

All the Pretty Ponies

August is always full of security awareness in the wake of DEF CON, Black Hat USA, and their associated satellite conferences. Las Vegas fills with people excited about digital and physical lockpicking, and breakout talks feature nightmare-inducing security vulnerabilities and trivially simple voting machine hacking. The ultimate in backhanded awards are given out to companies and organizations that made the world less secure, usually because of an overlooked flaw.

The Pwnie Awards (pronounced “pony”) are the security industry’s way of recognizing and mocking organizations that have failed to protect their data and their users. There are categories for:

  • Server-side Bug
  • Client-side Bug
  • Privilege Escalation Bug
  • Cryptographic Attack
  • Best Backdoor
  • Best Branding
  • Most Epic Achievement
  • Most Innovative Research
  • Lamest Vendor Response
  • Most Over-hyped Bug
  • Most Epic Fail
  • Most Epic 0wnage
  • Lifetime Achievement Award

And last but not least:

  • Best Song

This year’s best song is a parody of Adele’s “Hello”. “Hello From The Other Side” is complete with a demonstration of the exploit and a lyrical summation of what they’re doing.

Across the industry, security vulnerabilities are given a tracking number (a Common Vulnerabilities and Exposures entry, or CVE) and described in a semi-standard system. This helps everyone understand which category a vulnerability falls into, so they can read and understand vulnerabilities that may be outside their area of expertise. It also helps us talk about vulnerabilities without resorting to sensational names like “Heartbleed”.

Since it is August and the 2017 awards have been announced, I got to wondering whether any of the Pwnie-winning vulnerabilities could have been prevented by feature flags. There are several that could possibly have been mitigated by the ability to turn a feature off, but I chose this one:

Pwnie for Epic 0wnage
0wnage, measured in owws, can be delivered in mass quantities to a single organization or distributed across the wider Internet population. The Epic 0wnage award goes to the hackers responsible for delivering the most damaging, widely publicized, or hilarious 0wnage. This award can also be awarded to the researcher responsible for disclosing the vulnerability or exploit that resulted in delivering the most owws across the Internet.

WannaCry
Credit: North Korea(?)

Shutting down German train systems and infrastructure was Child’s play for WannaCry. Take a legacy bug that has patches available, a leaked (“NSA”) 0day that exploits said bug, and let it loose by a country whose offensive cyber units are tasked with bringing in their own revenue to support themselves and yes, we all do wanna cry.

An Internet worm that makes the worms of the late 1990s and early 2000s blush has it all: ransomware, nation state actors, un-patched MS Windows systems, and copy-cat follow-on worms. Are you not entertained?!?!?

WannaCry was especially interesting because it got “sinkholed” by a security researcher called MalwareTech. (I know, he’s under indictment. Security researchers are interesting people.) He noticed that the ransomware was calling out to a domain that wasn’t actually registered, so he registered the domain himself. That gave lots of people time to patch their systems instead of getting infected.

As an industry, we tend to think of server uptime as a good thing. But is it? Not for any single server, because if it’s been up for a thousand days, it also hasn’t been rebooted for patches in almost three years. You may remember when Amazon S3 went down in early 2017? That was also due to a restart of a system that no one had rebooted in ages.

So how can feature flags help keep production servers current?

  • It’s safer to make a change in production if you know you can revert it instantly using a feature flag kill switch.
  • Small, iterative development means that you can ship changes more quickly with less risk—the more often you reboot a server, the less important that particular server’s uptime is.
  • Many infrastructures today are built with the concept of ‘infrastructure as code’ through the use of automation tooling. These automation configurations can also benefit from feature flags, using them to roll out system or component versions alongside your application updates (a small sketch follows this list).
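
Here is a small sketch of that last idea, where a flag tells the automation which component version to roll out; the flag key, the deploy-host user, and the deployComponent helper are all hypothetical:

    // The automation asks a flag which version of a component to roll out, so a bad
    // version can be pulled back by flipping the flag rather than re-running a pipeline.
    ldClient.variation('edge-proxy-version', { key: 'deploy-host-01' }, '1.4.2', function (err, version) {
      deployComponent('edge-proxy', version);   // hypothetical deploy helper
    });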

Second, we have:

Pwnie for Best Cryptographic Attack (new for 2016!)
Awarded to the researchers who discovered the most impactful cryptographic attack against real-world systems, protocols, or algorithms. This isn’t some academic conference where we care about theoretical minutiae in obscure algorithms, this category requires actual pwnage.

The first collision for full SHA-1
Credit: Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov

The SHAttered attack team generated the first known collision for full SHA-1. The team produced two different PDF documents that produced the same SHA-1 hash. The techniques used to do this led to a roughly 100,000x speedup over the brute-force attack that relies on the birthday paradox, making this attack practical for a reasonably (Valasek-rich?) well-funded adversary. A practical collision like this moves folks still relying on a deprecated protocol to action.

SHA-1 was a cryptographic standard for years, and you can still select it as the hash algorithm in a lot of software. This exploit makes it clear that we need to stop letting anyone use it, for their own good.

How a feature flag might have made this better:

  • Turn off the ability to select SHA-1 as a hash option (sketched below)
  • Force current SHA-1 users to choose a new algorithm
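
A sketch of the first idea, with a hypothetical allow-sha1 flag pruning the choices a settings screen offers:

    // With the flag off, SHA-1 simply stops being offered; existing SHA-1 users can be
    // prompted to pick a replacement the next time they visit this screen.
    var algorithms = ['SHA-256', 'SHA-384', 'SHA-512'];
    if (client.variation('allow-sha1', false)) {
      algorithms.push('SHA-1');
    }
    renderAlgorithmPicker(algorithms);   // hypothetical UI helper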

Some feature flags are intended to be short-term and will be removed from the code once the feature is fully incorporated. Others exist longer-term, as a way to segment off possibly dangerous sections of code that may need to be turned off in a hurry. Feature flags could be combined with detection, like I suggested for the SHA-1 problem, and used to drive updates and vulnerability analysis.

It’s easy for us to think of deployment as something a long way from security exploits that involve physical access to computers and networks, but security exploits are “moving left” at the same time as the rest of our technology. Fewer of us manage physical servers, and fewer security vulnerabilities relate to inserting unknown USB sticks into our laptops. Instead, we’re moving to virtual machines and containers, and so are our vulnerabilities. Constructing code, cookbooks, and scripts that allow us to change the path of execution after deployment gives us more options for staying far, far away from the bright light of the Pwnie Awards.