14 Sep 2017

Beta Testing with Feature Toggles: Testing in Production Like a Pro

We all know beta testing is important—not just for understanding your customers’ needs, but also for stability and security. Every time you do a launch you are essentially asking: “Are there bugs? Is there feedback?” Both with the goal of making your product better.

Testing in production will give you the most information about the success of your new functionality. And because feature flags help separate deployment from release, they make such testing safe and easy. When it comes to beta testing, a lot of the top companies tend to adhere to a similar paradigm—test early, test often, and do it in your production environment.

So how do companies have smooth and simple transitions from alpha to beta testing, and then to full rollout? Read on to learn how top companies are approaching their beta testing using deployment tools with feature flags providing links out to more in-depth descriptions.

But before we get started, here’s a quick terminology review. Pete Hodgson refers to this use of feature flags for betas as “permissioning toggles.” Also known as a “canary launch,” this is often random like a percentage rollout. A set group, or “champagne brunch,” releases to internal users or another section or group.


6 Approaches to Product Launching

#1 Facebook is the prime example of dark launching. Their release management has to be impeccable to operate at such massive scale. Their betas are often up to  million users or more.

“Although we push to production only once a week, it’s still important to test the code early in real-world settings so that engineers can get quick feedback. We make mobile release candidates available every day for canary users, including 1 million or so Android beta testers.”

Read their article on Rapid Release at Massive Scale to learn more about how they do continuous delivery at scale.

#2 Hootsuite gives a typical rollout pattern for its features—starting internally and then slowly exposing to a larger audience.

Typical
Push new code then:
– Dark launch to yourself or your team to test
– Launch to the whole Hootsuite organization
– 10% of all users
– Watch graphs
– 50%
– 100%
– Simple means of rollback if necessary

Check out Bill Monkman’s full deck on dark launching here.

#3 Etsy calls feature flags “config flags,” and gives a lot of credit for their process to Flickr.

“Key system-level and business level metrics (like checkout/listing/registration/sign-in rates) are projected on screens in the office and we have a number of internal dashboards that the team uses (we mainly use Ganglia and Graphite). We also have lots of switches and knobs to help us roll features out to percentages of users and ramp them up slowly, or quickly. Features are used and tested by us here at Etsy for some period of time before they are rolled out publicly.”

They have custom built a feature flagging API, “Feature API” to enable this. Some of the bucketing they use include: admin, internal, users, groups.Read more about Etsy’s deployment practices and check out their Feature API on GitHub.

#4 Beta can also apply to back-end rollouts. Instagram does canary deployments to a subset of servers, using feature flags as a continuous delivery tool. It’s important for continuous delivery to perform these tests, which are key in helping them avoid failed deployments.

But Instagram hasn’t always had this system. Read here to learn how they evolved from a “mish-mash of manual steps and scripts” to a system they could depend on. And check this out if you want more recipes for database migration with feature flags.

#5 Niantic’s Pokemon Go betas are well known and rabidly tracked by its fans. They famously roll out by region—a field test in Japan here, a limited beta in Australia, and then something in New Zealand. Sometimes these betas for features are invite-only. Here’s a write up of how they approached the rollout of the game Ingress.

#6 GoPro released their GoPro Plus product early using feature flags. By breaking the larger release into smaller features with their own testing timelines, they were able to iterate and improve continuously. The video below walks through the technology they used and the timeline from dogfood to a “big bang” marketing announcement.

“At GoPro you can kind of tell we don’t things lightly. We want to do big announcements and we want to come out with great products…we actually had smaller features that would go out, and then go for alpha testing and beta testing along the way. Shortly after March, we actually had most of the applications done from a core feature standpoint, but we kept iterating and improving those core features that we knew we were going to launch with.”

 

Controlling Your Rollout Like a Boss

Did you notice some trends there? These larger companies are using beta testing to do one of the following:

  • Testing in production with feature flags
  • Ability to release early and test small functionalities before a broader release
  • Internal tests that easily become external canaries
  • Regional rollouts

As more companies start to use feature management, these incremental rollouts are not the headaches they once were. Companies can be safer and smarter with how and when they expose features to their end users.

If you want to get started with feature flagging, check out featureflags.io a resource we made for the community to learn best practices.  

08 Sep 2017

Saving Private Instances

You are a new Director of Engineering at an enterprise company. The company has just moved your product to the cloud and is hosting in Microsoft Azure. You come from a tech startup where you ran all of your software in the cloud and focused on building product. So naturally you have just implemented instrumented monitoring with HoneyComb, Kubernetes to manage your containers, and you’re thinking about leveraging serverless architecture.

You moved to this established enterprise company because they are providing technology to the healthcare industry and you are passionate about this space. They want you because you are good.

Now you face a challenge—how can I bring the good things from my past role and pair them with the security standards and compliance in my new role? Well, let’s first think about what the good things from your past role entails. Your previous team:

  • Deployed 5 times a day
  • Used feature branching
  • Feature flag first development
  • Implemented continuous integration

You realize you used 3rd party software for most of these good things, and had homegrown for others. You decide your first project is to research what software the new team is using, has built, and what software you will need to build and buy.

Security and the cloud.

Now, before we dive in, let’s consider the security standards and compliance you need to think about in your new role:

 

  • Where is my data stored?
  • What data is being sent over to the third party—is it PII?
  • Where connections are made to third parties?
  • How can I set different access control?
  • How it will work when connection fails-redundancy?
  • Is it HIPAA compliant?
  • How is all this audited?

 

Like your new company, many companies are just starting to move key operations to the cloud. This is SCARY. Moving into Azure will allow your company to free up floor space, add more server redundancy, easily scale up and down, only pay for what you use, and focus resources on revenue generating activities— security is still your responsibility.  And though these are all great things, you are not a security company.

Now you are tasked with bringing in these good processes, and they include 3rd party software providers. Products in this space are inherent in your development lifecycle and very close to your core application, but you need to ensure they are secure.

As we all know, most software providers in this space are cool tech companies in The Valley, far and isolated from the realities of your secure enterprise world. You look at on-premise options. And you remember that you’re driving change in the Healthcare space, not looking to manage infrastructure. The cloud options require a significant audit that is time consuming. A SaaS provider carries an ongoing risk that requires you store some data on 3rd party servers, which is a non-starter in Healthcare.

Is there a middle ground?

There is one more option many software providers offer: Private SaaS, also known as a managed instance, Dedicated VPC, or private instance. You surmise if they do not have on-prem, private SaaS, or a really really good security team, then it’s not likely their team is accustomed to working on enterprise challenges.

What exactly is a private SaaS offering? This refers to a dedicated single tenant, cloud software where the vendor manages the infrastructure.

Why choose a private instance?

  • Single Tenant—You will have a dedicated set of infrastructure contained within a VPC. This eliminates the risk of noisy neighbors.
  • Data Storage—Data can be stored in your AWS account or in an isolated section of the vendor’s AWS account. This allows flexibility, if you are in the EU, vendor could spin up the instance in AWS in the EU.
  • Residency—If you have a preference on where the instance is located or need to ensure proximity.
  • Compliance—If you have additional security or compliance regulations, a private instance can be customized to fit your needs.
  • Change Cadence—If you need to know when the software will be updated, you will have better insight and greater flexibility with a private instance.
  • Integrations—If you have custom tools, integrations can be built.

What’s next?

You have narrowed down a few software providers for each job you are trying to solve (continuous integration, branching, feature flagging). You understand the value, the costs, and the security implications. You have chosen to stay in the cloud, which makes your infrastructure team happy. You have chosen to use private instances and SaaS where it makes sense, which makes your security team happy. And you have the tools you need to bring the good things into your new role, so you are happy.

Now you can focus on helping your company deliver product faster, eliminate risk in your release process, speed up your product feedback loop, and do it all securely.

01 Sep 2017

More Experiments. More Data. Better Products.

Hi!
My name is Melissa, and I recently joined LaunchDarkly. I’d like to introduce myself and share why I now wear a LaunchDarkly tee.

I’m a designer. UX, UI, branding, marketing, strategy—you name it. My previous role was design manager at a large online retailer, which you know but I won’t name. This online retailer asked me to join a small team to build a sister site with the goal of testing new shopping experiences that were too risky for the main site. There were one product manager, two designers, and three developers.

The challenge.
We designed and built a live MVP of this new e-commerce platform in 3.5 months. But then we faced a new challenge: how do you direct customers to an entirely new and much more progressive experience without scaring away loyal users or disturbing sales?

The solution.
We decided to invite a small percentage of a particular customer segment to view the website, a process called Canary Launch. For us, this looked like a few ads populating the main site, but only visible to 1% of our chosen customer segment. We were able to monitor the impact on sales on the main site and felt confident to increase visibility to the ads. This process allowed the business stakeholders to be at ease and gave us the data we needed.

A new vision.
This release process was eye-opening for myself and the entire team. We had just come from the legacy site where the release cycles are long, and the risk of conflicting code is high. Even though we were excited about the strategy, it did take a lot of development time and wouldn’t give the rest of the team access to the backend. We would ask the developers to fluctuate the ad visibility which would take them away from their primary focus.

This brings me back to why I now wear a LaunchDarkly tee. When I heard of their SaaS product, my last 24 months flashed before my eyes and the understanding of its value made me fall in love. Then I met the team. I was hooked.

I’m now excited to help share the message and am committed to helping companies of all sizes understand how LaunchDarkly can be a facilitator when it comes to faster and less risky product releases. After using the product for a few months, you will look back and wonder how you ever built without it.

I see a bright future for the “build, test, learn” model. The world is innovating at increasing speeds these days. You have to move just as fast to be competitive and deliver products and experiences that connect with your users.

28 Aug 2017

All the Pretty Ponies

August is always full of security awareness in the wake of DefCon, BlackHat USA and their associated security conference satellites. Las Vegas fills with people excited about digital and physical lockpicking, breakout talks feature nightmare-inducing security vulnerabilities, and trivially simple vote machine hacking. The ultimate in backhanded awards are given out to companies and organizations who made the world less secure, usually because of an overlooked flaw.

The Pwnie Awards (pronounced pony) are the security industry recognizing and mocking organizations who have failed to protect their data and their users. There are categories for:

  • Server-side Bug
  • Client-side Bug
  • Privilege Escalation Bug
  • Cryptographic Attack
  • Best Backdoor
  • Best Branding
  • Most Epic Achievement
  • Most Innovative Research
  • Lamest Vendor Response
  • Most Over-hyped Bug
  • Most Epic Fail
  • Most Epic 0wnage
  • Lifetime Achievement Award

And last but not least:

  • Best Song

This year’s best song is a parody of Adele’s HelloHello From The Other Side is complete with a demonstration of the exploit and a lyrical summation of what they’re doing.

Across the industry, security vulnerabilities are given a tracking number (Common Vulnerabilities and Exposures or CVE) and described in a semi-standard system. This helps everyone understand which category they fall into  so they can read and understand vulnerabilities that may be outside their area of expertise. This also helps us talk about vulnerabilities without resorting to sensational names like “Heartbleed”.

Since it is August and the 2017 awards have been announced, I got to wondering whether any of the Pwnie-winning vulnerabilities could have been prevented by feature flags. There are several that could possibly have been mitigated by the ability to turn a feature off, but I chose this one:

Pwnie for Epic 0wnage
0wnage, measured in owws, can be delivered in mass quantities to a single organization or distributed across the wider Internet population. The Epic 0wnage award goes to the hackers responsible for delivering the most damaging, widely publicized, or hilarious 0wnage. This award can also be awarded to the researcher responsible for disclosing the vulnerability or exploit that resulted in delivering the most owws across the Internet.

WannaCry
Credit: North Korea(?)

Shutting down German train systems and infrastructure was Child’s play for WannaCry. Take a legacy bug that has patches available, a leaked (“NSA”) 0day that exploits said bug, and let it loose by a country whose offensive cyber units are tasked with bringing in their own revenue to support themselves and yes, we all do wanna cry.

An Internet work that makes the worms of the late 1990s and early 2000s blush has it all: ransomware, nation state actors, un-patched MS Windows systems, and copy-cat follow on worms Are you not entertained?!?!?

WannaCry was especially interesting because it got “sinkholed” by a security researcher called MalwareTech. (I know, he’s under indictment. Security researchers are interesting people.) He noticed that the ransomware was calling out to a domain that wasn’t actually registered, so he registered the domain himself. That gave lots of people time to patch their systems instead of getting infected.

As an industry, we tend to think of server uptime as a good thing. But is it? Not for any single server, because if it’s been up for a thousand days, it also hasn’t been rebooted for patches in over three years. You may remember early 2017 when Amazon S3 went down? That was also due to a restart on a system that no one had rebooted in ages.

So how can feature flags help keep production servers current?

  • It’s safer to make a change in production if you know you can revert it instantly using a feature flag kill switch.
  • Small, iterative development means that you can ship changes more quickly with less risk—the more often you reboot a server, the less important that particular server’s uptime is.
  • Many infrastructures today are built with the concept of ‘infrastructure as code’ through the use of automation tooling. These automation configurations can also benefit from the use of feature flags to roll-out system or component versions alongside your application updates.

Second we have:

Pwnie for Best Cryptographic Attack (new for 2016!)
Awarded to the researchers who discovered the most impactful cryptographic attack against real-world systems, protocols, or algorithms. This isn’t some academic conference where we care about theoretical minutiae in obscure algorithms, this category requires actual pwnage.

The first collision for full SHA-1
Credit: Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, Yarik Markov

The SHAttered attack team generated the first known collision for full SHA-1. The team produced two PDF documents that were different that produced the same SHA-1 hash. The techniques used to do this led to an a 100k speed increase over the brute force attack that relies on the birthday paradox, making this attack practical by a reasonably (Valasek-rich?) well funded adversary. A practical collision like this, moves folks still relying on a deprecated protocol to action.

SHA-1 was a cryptographic standard for years, and you can still select it as the encryption for a lot of software. This exploit makes it clear that we need to stop letting anyone use it—and for their own good.

How a feature flag might have made this better:

  • Turn off the ability to select SHA-1 as an encryption option
  • Force current SHA-1 users to choose a new encryption option

Some feature flags are intended to be short-term and will be removed from the code once the feature is fully incorporated. Others exist longer-term, as a way to segment off possible dangerous sections of code that may need to be removed in a hurry. Feature flags could be combined with detection, like I suggested for the SHA-1 problem, and used to drive updates and vulnerability analysis.

It’s easy for us to think of deployment as something a long way from security exploits that involve physical access to computers and networks, but security exploits are “moving left” at the same time as all the rest of our technology. Fewer of us manage physical servers, and fewer security vulnerabilities relate to inserting unknown USB sticks into our laptops. Instead, we’re moving to virtual machines and containers, and so are our vulnerabilities. Constructing code, cookbooks, and scripts that allow us to change the path of execution after deployment gives us more options for stay far, far away from the bright light of the Pwnie Awards.

18 Aug 2017

Risk Reduction and Harm Mitigation

Risk Reduction is trying to make sure bad things happen as rarely as possible. It is anti-lock brakes, vaccinations, clothing irons that turn off by themselves, and all sorts of things that we think of as safety modifications in our life. We are trying to build lives where bad things happen less often.

Harm Mitigation is what we do to make sure that when bad things do happen, they are less catastrophic. Fire sprinklers in buildings, seat belts, and needle exchanges are all about making the consequences of something bad less terrible.

What does that mean for a DevOps world where our risks and harms are very different? Charity Majors says we shouldn’t split the world into developers and operations, but into product and infrastructure—and I think that’s a useful way to think about risk too.

Product risk is problems that users experience. We can usually predict and mitigate the danger by testing and being aware of common failings. For example, we can expect and plan for users that may be on a flaky connection, or that they may try to exit a page without saving information. We can work around these problems because we know they’re out there. But our deployments are not something they should see until we’re ready for prime time. For this kind of risk, we use feature flags to control what content is delivered.

Infrastructure risk is more about the inherent fragility of delivering software. CDNs, fiber, servers, switches, towers…the whole system of getting data to people has failure zones. When we are trying to reduce infrastructure risk, we assume that latency is ever-present, networks are intermittent, and we won’t always be able to count on everything working right the first time. We try to build in robust and flexible ways than can route around failures. This is the place we might use feature flags to control failovers or to create circuit breakers to prevent flooding a fragile sector.

Product harm reduction is about making sure that users can have a positive experience, even if something happens that makes it less than ideal. We want to preserve their blog drafts, keep them from committing errors, only show them things they are allowed to change, and above all, avoid giving them a blank page. Something has gone wrong in the trip to the user, but they shouldn’t have to suffer for it.

Infrastructure harm reduction is the ability to pull back breaking fixes, shunt users away from vulnerabilities, and respond near-immediately to things that have gone very wrong. Harm reduction at this level is the kind of action a pager-responder can take to get things back on track before doing more intensive repairs and investigations in the morning.

In my first week, I spent a lot of time thinking about how to summarize our product for a variety of audiences. “Feature flags as a service” is short and pithy, but only works if you can bring your own definition of “feature flag” and a business understanding of why you would use them. What about “Feature flags allow you to make changes in near real-time instead of waiting for a deployment, and LaunchDarkly helps you manage and track flags across an organization”?

Well, that works, but it still doesn’t get to the core of why an organization would want to use feature flags. What I’ve come up with so far is this:

Feature flags segment the risk of creating a product into manageable parts.

Creating and deploying software is risky. We can accidentally build in errors, we can deliver it badly, or to the wrong people, and it can interact in unfortunate ways with existing software or hardware. As organizations, we want to do our best to do no harm and provide benefit. Using feature flags lets us wrap our features in decision points that we can then use to make life easier for our users.

Here are some types of risks that are reduced by using feature flags:

  • Server falls over from too much traffic
  • Canary launch is not well-tracked, problems are missed
  • Old features and workarounds are invisible and get left in place
  • Feature with vulnerable content is deployed
  • API endpoints are exposed to unauthorized users

Managing your feature flags is a post for another day—today I ask that you take a few minutes and think about how you can reduce risk and minimize harm in your organization, your project, or your code. How can you make things robust enough to resist failure, instrumented sufficiently to identify a failure spot, and flexible enough to reduce harmful consequences on the fly?

16 Aug 2017

Week 1: How to Put Your SOC On

Enter Your Password

What does a new engineer do during their first week at a SOC 2 Compliant startup? Write code? Maybe. Deploy code? Hopefully. Create accounts? Certainly.  Generate passwords? Ad nauseam.

After creating my task tracking and document sharing accounts, half the items I saw on my TODO lists were about creating accounts on more services. Also on my calendar was to attend training for one of LaunchDarkly’s newest initiatives: SOC 2 Compliance.

At LaunchDarkly, we maintain mission critical services for our customers (feature flags!). And for those who opt for premium services, we also store sensitive data about their clients as part of our analytics features. It is essential to our business that we protect not only access to control over customer application behavior, but to all client data we store on behalf of our customers.

After our security training, each member of my incoming class made a commitment to:

  • Create a unique password for every service. Use a password generator and a password manager!
  • Enable 2-factor authentication for every service that offers it.
  • Avoid sharing passwords and accounts with team members to keep a precise audit trail.
  • Restrict browser plugins to the minimum necessary to do your job. Those plugins can read your data.
  • Secure your laptop with FileVault and lock screens.
  • Limit connected applications with access to Gmail, GitHub and other accounts.
  • Secure customer data. (Obfuscated links don’t cut it!)

These are all great practices even if your business doesn’t need SOC 2 certification. Now to deploy some code (if I can just remember where I’ve written down my SSH key…).