In this episode, Edith and Paul are joined by Scott Raney, Partner at Redpoint Ventures. Edith and Paul hear from Scott why continuous delivery matters in modern software development. This is episode #21 in the To Be Continuous podcast series all about continuous delivery and software development.
In this episode, Edith and Paul discuss the spectrum of Continuous Delivery, and where Continuous Delivery will and won’t work in software development. This is podcast #6 in the To Be Continuous podcast series all about continuous delivery and software development.
This episode of To Be Continuous is brought to you by Heavybit and hosted by Paul Biggar of CircleCI and Edith Harbaugh of LaunchDarkly. To learn more about Heavybit, visit heavybit.com. While you’re there, check out their library, home to great educational talks from other developer company founders and industry leaders.
Paul: Hi, I’m Paul Biggar, founder of CircleCI.
Edith: I’m Edith Harbaugh, CEO and co-founder at LaunchDarkly.
Paul: You are listening to To Be Continuous, a podcast about continuous delivery and software development.
Edith: You can get in touch with us any time at our Twitter handle, @continuouscast.
Paul: The show is brought to you by Heavybit. To learn more visit heavybit.com. While you are there check out their library, home to great educational talks from other developer company founders and industry leaders. In this episode, we discuss the spectrum of continuous delivery and where continuous delivery will and won’t work.
Edith: Paul, does one have to be all in to do continuous delivery?
Paul: I think the obvious answer is no. In particular, you know that the answer must be no because nobody starts at all in or at least, if you are starting a new project you can be all in and many projects are not all in. Someone starts somewhere and they say, “Now we are going to do continuous delivery in some way.” They have to go from zero to full continuous delivery along some way.
Edith: I think people have this conception that continuous delivery just means pushing any time you want. A lot of very old school development shops did that. They would just hot patch in production constantly.
Paul: Are we calling that continuous delivery? I feel that the idea of SSHing into the server is not continuous delivery somehow.
Edith: People have this misconception because they are like, “Oh, I did continuous delivery before this was a thing, I just patched in production.”
Paul: I am going to put a lot of money on the line here and say that if you define continuous delivery as I SSH into the production server, then you are not doing continuous delivery right. In fact you are doing it considerably wrong.
Edith: What exactly are you doing wrong, then? What steps are you skipping?
Paul: The whole point is that you have this automated process for releasing software, and SSHing into the production server is the very opposite of that. That’s not to say you shouldn’t do it in the occasional emergency. Occasionally, if we have some sort of catastrophe, we will SSH into the production server, replace some code, or execute some code on the server to keep things as they are and prevent customers from noticing or things from going down. But that isn’t the same as a continuous delivery pipeline or an automated release workflow, which is essentially what continuous delivery is.
Edith: Continuous delivery isn’t just shipping stuff out to production. It’s more having all the processes behind it.
Paul: It’s very much the process. When you ask whether you can be a little bit continuous delivery, I think what you are really asking is: is there a lightweight version where we get some of the benefits of continuous delivery somewhere between all and nothing?
Edith: Do you have to be all in or can you be a little bit continuous?
Paul: I think the very obvious instance of a little bit continuous is where you deliver the code: where the automated release process builds some sort of artifact and then someone manually deploys that with a single click, or where the build happens and the tests happen and the remaining automated release process is triggered by a human saying, “Now we are going to release.” Then everything else is scripted, so they just have one input, which is “now.”
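The "one input, which is now" workflow Paul describes can be sketched roughly as follows. This is a minimal illustration, not any vendor's pipeline: the `Pipeline` class and its `build`, `test`, and `deploy` callables are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical pipeline: build and test run automatically on every
    push; the scripted deploy waits for a human to say "now"."""
    build: Callable[[], str]         # returns an artifact identifier
    test: Callable[[str], bool]      # True if the artifact passes
    deploy: Callable[[str], None]    # fully scripted, no manual steps inside
    ready: List[str] = field(default_factory=list)

    def on_push(self) -> None:
        """Automated part: build, test, and stage the artifact."""
        artifact = self.build()
        if self.test(artifact):
            self.ready.append(artifact)

    def release_now(self) -> str:
        """The single human input: deploy the newest tested artifact."""
        if not self.ready:
            raise RuntimeError("no tested artifact to release")
        artifact = self.ready[-1]
        self.deploy(artifact)
        return artifact
```

Nothing reaches production until `release_now()` is called, but everything before and after that call is automated, which is what makes this "a little bit continuous" rather than a manual release.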
Edith: That’s really interesting. Do you think to truly be continuous it has to all be completely automated with everything always pushed?
Paul: No, no I don’t think so. I think it largely depends on the size of the team and that sort of thing but I have seen a couple of examples where our customers were asking for particular things and indicated that they had sort of semi-continuous workflows. One example of that is a team that only wants to deliver or only wants to do continuous delivery during work hours.
There was another team which only wanted to do continuous delivery at midnight because that was the only time that they had few enough customers on their site to be able to clear the caches. If they wanted to go to complete continuous delivery they would have to say, “We are going to clear a subset of the caches,” or have some other caching policy that allows them to deploy at any time. But where they are is basically semi-continuous.
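A semi-continuous policy like "only during work hours" or "only around midnight" amounts to a time-window check in front of the deploy step. The sketch below assumes hypothetical window boundaries (`WORK_HOURS`, `MIDNIGHT`); the teams' actual rules are not given in the conversation.

```python
from datetime import datetime, time


def in_window(now: datetime, start: time, end: time) -> bool:
    """True if now's time of day falls in [start, end).

    Windows where start > end are treated as wrapping past midnight.
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


# Hypothetical policies for the two teams described above.
WORK_HOURS = (time(9, 0), time(17, 0))
MIDNIGHT = (time(0, 0), time(1, 0))
```

A pipeline would call `in_window(datetime.now(), *WORK_HOURS)` before deploying and queue the release otherwise.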
Edith: I had a really interesting chat with Douglas Squirrel who actually listened to our podcast, and he does a lot of consulting for companies in England. He said he goes to a lot of companies that have all the tooling to be continuous but they don’t actually use it.
Paul: What does that mean exactly?
Edith: They have set up continuous integration. They could in theory deploy whenever they wanted but they still, for whatever reason, still cling to their old schedules.
Paul: Yes, that sounds right.
Edith: Jocelyn Goldfein who was at Facebook, right?
Paul: Yes, yes. That’s right. What she talked about was this idea that delivery schedules are useful in certain contexts. In some cases velocity matters, and that’s the Facebook case, and in some cases predictability matters a lot more. If you are delivering enterprise software for example, an enterprise is not going to allow you to continuously deliver your product into their firewall or into their enterprise. What they are going to want is a new thing every quarter.
Multiple times a year preferably, but each time you release to them creates a new workflow. They want to be able to audit it, see a change log and validate things, and they might have a change process internally. It’s not necessarily going to be a good idea to continuously deliver there, but a team that is building for that can do a continuous delivery process internally, where they produce all of the artifacts and they do a complete release on every single push even if that is not released.
Edith: There is a lot of reasons why an enterprise doesn’t want to take new features constantly and a lot of it comes down to training. If there are big new features that need to be rolled out to people, then they have to take on the cost of training. It was actually interesting, I chatted with a guy from Square and he talked about how they really try to limit the amount of pushes they put out to the actual cashiers. Because of the training cost, a cashier at a coffee shop is doing a gazillion other things and they can’t just suddenly pop up to new functionality and expect them to grok it or even take a tour.
Paul: Got you. Right, right, right. That’s an interesting idea.
Edith: Although they have all the processes in place to do a lot of changes and to do continuous delivery, they actually really limit what they push out to the true end users.
Paul: It’s funny that the people who have far, far more users, someone like Facebook, can just do those changes because they don’t necessarily value an individual user on a … Not saying that they don’t value their users but …
Edith: You’re about to … You’re walking down a path.
Paul: It is safe to piss off an individual user. Less so for someone like Square and as you go up the value chain, the more they are paying you or the more involvement they have to have in the product decisions, the less able you are to piss them off.
Edith: Facebook is actually a fascinating example because at any time they are going to be pissing off somebody. It’s kind of like Apple, you know they are always pissing off somebody it’s just who.
Paul: Did you hear about the day at Facebook where all the feature flags went on?
Edith: It was LinkedIn.
Paul: Was it LinkedIn? Okay.
Edith: Yeah. It’s probably happened to everyone that’s done feature flags, there is the day that that happens and it’s ugly. Facebook is fascinating. Actually, I wrote a blog post about Facebook and how they use feature flags. They have a product called Gatekeeper that controls everything, everything. After they had a huge PR disaster, I think it was in 2011 with their news feed, they actually do really care if they piss off a lot of people and that’s why they implemented Gatekeeper where they do a lot of 1% roll outs.
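A 1% rollout of the kind Edith mentions is commonly done by hashing each user into a stable bucket and enabling the flag for buckets below a threshold. The sketch below shows that general technique only; it is not a description of how Gatekeeper works internally, and the function names are made up for the example.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 10_000


def is_enabled(flag: str, user_id: str, percent: float) -> bool:
    """Enable the flag for roughly `percent` percent of users.

    The same user always gets the same answer for the same flag, and
    raising `percent` only ever adds users, it never removes them.
    """
    return bucket(flag, user_id) < percent * 100
```

Including the flag name in the hash means that the 1% who see one feature are not automatically the same 1% who see every other feature.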
Paul: I remember when I was setting up Circle I sat down and talked to my friend who worked at Facebook and he talked about their workflow and that sort of thing, and their release process was that code got released on Tuesdays. Everyone who was releasing code or who had code in that bit that was going out had to be available during the release process. The release process didn’t turn on any features because that was all done by Gatekeeper, but the code itself had to go out and they had to validate that everything stayed the same and everyone had to be there. They had a fairly continuous delivery process, but there was still a manual release going out.
Edith: You want to have some sort of enforcement that people look at stuff and care.
Paul: Care in what sense?
Edith: The example I heard about also at Facebook is it’s so large that there is just no way that all the groups can coordinate with each other. Literally what they do is, when they roll stuff out with Gatekeeper, they roll it out internally to Facebook only employees. That’s how they find out if stuff breaks. It’s not like at a smaller company like Circle or …
Paul: There is a difference between the code being rolled out and the feature being rolled out, and they don’t do … I don’t know what the story is nowadays, but at the time they certainly didn’t do continuous delivery on code, or continuous releases or whatever you might call it.
Edith: I don’t actually work at Facebook but what I did hear was that they just literally can’t … When you are a small company you could still coordinate in a HipChat or a Slack room like, “Hey, we’re pushing this now, start looking for changes,” but with Facebook the only way they know if something is broken is if somebody complains.
Paul: There is an interesting discussion about microservices at AWS. I guess they didn’t call them microservices at the time but being fully service oriented architectured. One of the things they talked about is how when you roll, or when you continuously deploy a service, you might break other services. The thing that actually breaks may not be in any way related to the people whose alarms start going off and in fact they may be multiple steps away from whose alarms are going off.
Edith: Google has a similar system and I heard that they monitor 200-plus different metrics. One of the ones they monitor is just, do people say Gmail sucks more? Like on Twitter.
Paul: Interesting. The thing that I am getting at in terms of how little or much can you continuously deliver is if you are a Google or you’re Facebook and you have the monolithic code repos, the thing that you are releasing is, it’s not the same release process as someone like Amazon who has what I would describe as pure continuous delivery. Every service does its own continuous delivery and your dependencies may be changing in any time and you don’t actually know what is going on.
Versus Facebook, where it’s: we are taking this multi-gigabyte binary and we are uploading it to a set of servers. Sure, we are slowly releasing and maybe we’re doing that multiple times a week or multiple times a day, but it’s not quite the same as Amazon releasing every 11 seconds. I think we can see even between these multiple large companies who have incredibly advanced release processes that they are clearly at different points along the continuous delivery spectrum.
Edith: What do you think is suitable for a medium company with 10 developers?
Paul: There’s a lot of different things going in … Who is your customer base? What is the level of safety that you need? Let’s suppose that it’s a consumer web app. If it’s a consumer web app, I would expect that that company should be continuously delivering all the time. Especially if it’s slightly okay to break things or at least you break things, you can roll them back or whatever within a couple of minutes. A team of that size might not have the resources to make their continuous delivery process perfect, but it seems likely that they would have more success shipping with continuous delivery than shipping without it so lower risks, higher velocity, that sort of thing.
Edith: It’s funny how continuous delivery has permeated even the VC world. Jason Lemkin, you know the guy behind SaaStr, he tweeted yesterday that one of his due diligence questions was how often …
Paul: I saw this.
Edith: How often can you roll back, comma, and does it work? It’s funny that I think of that as a very technical thing, can you roll back, but the VC sees it as a sign of technical sophistication.
Paul: Can you roll back?
Edith: All the time. Every day.
Paul: Really? You break things so often that you must roll back.
Edith: Paul, no. I mean, we feature flag everything.
Paul: A feature flag is not a rollback.
Edith: I think of it as a rollback. If you think of a rollback …
Paul: How often do you roll back the version of deployed code?
Edith: Virtually never because we feature flag everything, so if something …
Paul: Could you roll it back? Does it work?
Edith: That’s funny. I think we’re hedging now on what rollback means because I take it as can you take … What LaunchDarkly really does is it separates deployment from visibility.
Paul: There is a code deployment and there is a feature deploy which is done at a separate time.
Edith: When I talk about rollback in my mind I’m just talking about, I’m rolling back the feature from any users or any systems. Because if we have a feature flag on something and we turn it off, it is in effect rolled back.
Paul: I understand what you are saying. What I’m asking is, do your code rollbacks work?
Edith: We never have to do it because we always just turn off a feature flag. Stuff breaks and when it does we will just flip the feature flag off.
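The distinction the two are circling can be made concrete with a small sketch. Both code paths are deployed; the flag only controls which one is visible. The `FlagStore` class and the `new-checkout` flag name are hypothetical, and a real flag store would be a remote service, not an in-memory dictionary.

```python
class FlagStore:
    """Hypothetical in-memory flag store, for illustration only."""

    def __init__(self) -> None:
        self._flags = {}

    def set(self, name: str, on: bool) -> None:
        self._flags[name] = on

    def is_on(self, name: str, default: bool = False) -> bool:
        return self._flags.get(name, default)


def checkout(flags: FlagStore, cart_total: float) -> str:
    # Both branches ship in the same deploy; the flag decides which one
    # users see. Turning the flag off hides the feature without a redeploy.
    if flags.is_on("new-checkout"):
        return f"new checkout: {cart_total:.2f}"
    return f"old checkout: {cart_total:.2f}"
```

Flipping `new-checkout` off is the "rollback" Edith means. It never exercises the code rollback path, which is why Paul keeps asking whether that path still works.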
Paul: It’s funny we’ve ended up with a ton of different types of feature flags. One of them that’s very interesting is we started with this idea of sudo hacks.
Edith: Isn’t a hack already a hack?
Paul: Yes. We launched this within the first couple of months of Circle. Customers didn’t have sudo at the start on Circle. They weren’t able to install things, they weren’t able to apt-get install or do anything as root on a Circle box, and we didn’t want to give them access to root either. What we wanted to do, whenever a customer asked for a thing, was add a single line of bash that was executed when their container started up and would get their container into whatever state it needed to be.
It would, for example, install the new version of MySQL or something like that, and that in effect became a feature flag. A load of customers would start to ask for a particular thing, we would add the line of bash to someone’s container and then another customer would ask for it and then we’d have 10 or 20 customers who were using this in production. You can’t see it on the podcast but I am doing air quotes when I say uses this in production, and that was kind of our feature flag.
Then that evolved into global hacks. We have a single container image that almost all of our customers use. When we launch a new version of that container, sometimes something will break. We do lots of testing in advance and we have an automated process that ensures that those things work, but occasionally something will go wrong. Then we’d add a global hack that runs on all containers before the customer’s code is run to get that back working.
We’ve evolved that. We have container feature flags, we have user feature flags, we have machine feature flags and the net effect is that we also almost never roll back. On certain things, on containers it’s very easy to roll back. On our VM images and on our front end code base it’s very, very easy to roll back. The back end code base is not as easy but it’s getting much, much easier. There is a defined process there but we almost never have to do it because there is nearly always a way to hack it and to keep rolling forward.
Edith: I did remember, so we can roll back. We try to protect everything with feature flags just because it is so much less stressful. Any time you have to do a rollback and redeploy code, it’s very stressful because there is a lot of things that could go wrong and it’s just stressful. We did have to do it the other day because we changed something around, an API schema. We hadn’t feature flagged it because we thought it was so minor and this is always where you get tripped up, the thing you thought was so minor that you didn’t feature flag that broke something very important.
Paul: Stripe has a very interesting, I don’t think you’d quite call it feature flags but it’s their API versioning, so they never change their API. They just launch a new version of the API. New customers get it and old customers can opt into it. They end up in a situation where they never have to roll back their API, they define a transition from their old API to their new API, and there are definitely code changes that sit behind each API but there isn’t a change to how a customer perceives it. Any change that the customer perceives is an actual error or is an actual bug that they need to fix.
Edith: That’s smart. That’s what we do also. APIs are so painful because no matter how much you try to warn people there is always somebody out there who is using the version …
Paul: Who is using the thing that you haven’t told them about but they inspected the data directly and thought that this was always going to be this way or something.
Edith: Oh man. I used to work at an enterprise software company, and we completely deprecated and removed an API. Somebody found out when they upgraded and could no longer use it. They wrote in like, “Where is my metatransactions?” We’re like, “Removed.” We got this really anguished … You could literally hear this guy sobbing. He’s like, “I’ve written over 3000 modules using that metatransaction API. Now what?”
Paul: What did you do?
Edith: There was nothing to be done, the API was gone.
Paul: You didn’t bring it back for him? You didn’t ship him the code for the API and, run this yourself?
Edith: Mistakes were made and he was at a very large Telecom customer that was paying us a lot of money and he was extremely unhappy with us. He was extremely unhappy for good reason because he had spent years of his time.
Paul: One of the things that we’re doing at the moment is … It’s very important to be able to sunset features and to be able to know when exactly things are working or to be able to be in control of your own code base. One of the areas that is hardest is in API because people expect that the UI is going to change all the time, that the function of everything is going to change, that they may have to change their code base sometimes to fit in with that but APIs are a particularly hard thing to change.
Especially since you don’t have any metrics on how people use them. I think that there needs to be a lot of effort if you want to have continuous delivery and be able to change your API. We’re doing a couple of things to change the API. Right now everything is in our V1 API namespace.
Edith: Wow. Still?
Paul: Still. We just add to it, we almost never change it. We did one change, maybe two changes, customers didn’t notice so we were fortunate there. Ours is a product where the API is not heavily used.
Edith: Oh, really? Our product is all API.
Edith: Is it public?
Paul: De facto.
Edith: That seems …
Paul: We didn’t publish it, but you can see it in Chrome in the web inspector.
Edith: Not to pick on you, though you picked on me earlier, that seems a little non-standard.
Paul: I agree that it is non-standard but it …
Edith: To put that in the politest possible way, why?
Paul: Because at the time we just took all of our stuff and put it in the v1 namespace. There was only one namespace so now …
Edith: It was like Highlander.
Paul: Now we are looking at having multiple namespaces. The two that we are focusing on are the private, which is don’t touch this and if you want it ask us, and the experimental which is we don’t actually know how this is going to be used in public and we don’t want to think about this too much. We just want to get this out the door.
Let you guys use it and give us enough feedback for us to put it in v1 at some point, or v2, whatever version it is at that point. I think that namespacing is a very useful idea for APIs, and I think that the more advanced version of that is Stripe’s thing, where you make a large number of APIs, you version them, and you let people exist on all of them via a canonicalization process.
Edith: I agree. We take APIs very seriously because at its heart, what we are is an API. We are an API with some SDKs on top of it and then a nice dashboard. Our API is actually the most tested thing because our customers depend on that to run their business.
Paul: That’s all your business is, is a little API and a little dashboard.
Edith: Then a lot of technology to make that all work.
Paul: I keep going on about Stripe’s API versioning.
Edith: Is it because they’re Irish?
Paul: It’s not and the people who gave a talk about this are not Irish. One of the Heavybit talks was by Amber at Stripe about their API versioning. The reason that I like this so much is, have you heard the word canonicalization? This is one of my favorite software engineering techniques which doesn’t really get people talking about it, and the reason that I know about it is because it comes up a lot in compilers. The idea is that you don’t have multiple ways of representing things.
You take the multiple ways of representing things and you modify them into the canonical way of representing things. That’s how Stripe implements its API versions. Every version of the API has a canonicalization step which transforms it to the version that’s expected by the next step, and at the core of it, they have a single canonical version of the API that every API version talks to at the end of the day.
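The canonicalization idea Paul describes can be sketched as a chain of per-version upgrade steps in front of a single handler. This illustrates the technique only and is not Stripe's implementation; the version labels, field names, and upgrade functions are all invented for the example.

```python
from typing import Callable, Dict, List

Request = Dict[str, object]

# Hypothetical version history, oldest first. The last entry is canonical.
VERSIONS: List[str] = ["2015-01-01", "2015-06-01", "2015-10-01"]


def _v1_to_v2(req: Request) -> Request:
    """Hypothetical change: "amount_cents" was renamed to "amount"."""
    out = dict(req)
    out["amount"] = out.pop("amount_cents")
    return out


def _v2_to_v3(req: Request) -> Request:
    """Hypothetical change: "currency" became explicit, defaulting to usd."""
    out = dict(req)
    out.setdefault("currency", "usd")
    return out


# Each version knows only how to become the next version.
UPGRADES: Dict[str, Callable[[Request], Request]] = {
    "2015-01-01": _v1_to_v2,
    "2015-06-01": _v2_to_v3,
}


def canonicalize(req: Request, version: str) -> Request:
    """Walk the request forward, one step at a time, to the newest shape."""
    start = VERSIONS.index(version)  # raises ValueError for unknown versions
    for v in VERSIONS[start:-1]:
        req = UPGRADES[v](req)
    return req


def handle(req: Request, version: str) -> str:
    """The core handler only ever sees the canonical representation."""
    canonical = canonicalize(req, version)
    return f"charge {canonical['amount']} {canonical['currency']}"
```

Adding a new version means writing one more upgrade step; the core handler and all older steps stay untouched. A full system would also need the reverse transforms for responses, which this sketch leaves out.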
Edith: To bring this back to continuous delivery, do you think then that they are practicing continuous delivery on their API or not?
Paul: Oh yeah, yeah. No, I think that is exactly what they are doing because they are versioning it. They are allowed to create a new version of the API. They can make a backward incompatible change at any time and it will only affect new customers going forward.
Edith: That’s a good point.
Paul: Then the only people who suffer are people like Baremetrics where they need to talk to tons and tons of different customers, and actually that’s not true because you can also specify the API version that you want in the API header so you’re not actually stuck with the version that is specified in the dashboard.
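The per-request override Paul mentions boils down to "use the version in the request header if there is one, otherwise the account's pinned version." A minimal sketch, assuming a hypothetical header name `Api-Version` chosen for the example:

```python
from typing import Mapping

VERSION_HEADER = "api-version"  # hypothetical header name, compared case-insensitively


def resolve_version(headers: Mapping[str, str], account_default: str) -> str:
    """Prefer the version named in the request header, if any;
    otherwise fall back to the version pinned on the account."""
    for name, value in headers.items():
        if name.lower() == VERSION_HEADER and value.strip():
            return value.strip()
    return account_default
```

An integrator talking to many accounts can then send one fixed version on every request and ignore whatever each account has pinned.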
Edith: Cool. We talked about APIs, we talked about consumer companies. Do you think there is any company where it’s not a good fit for continuous delivery?
Paul: SpaceX. If you are making rockets, missiles …
Edith: Cars. Cars.
Paul: Pacemakers, cars. Cars is an interesting one because you have something like the self-driving cars where they are almost assuredly doing continuous delivery. Where the ability to iterate is a key function of what will get self-driving cars to market but I wouldn’t want Ford to be continuously delivering and doing an over the air update to something that is really important.
Edith: Volkswagen was just hugely in the news because they basically hacked their software to evade the EPA.
Paul: There’s a question here as well of whether you trust software more than you trust humans.
Edith: But humans write software.
Paul: Right, right but whether you trust the humans on the ground versus the humans at Google who write software. I think it’s fair to say that where you are writing low-level C or something like that, that affects the flight of the rocket, you want to be very, very careful about what you ship and how often you ship. If you are writing this high-level AI code that’s written by the world’s greatest engineers and replaces the world’s dumbest people at the wheel of multiple thousand-pound weapons that can hit other people who are in their own weapons, I think there is definitely a case to be made where self-driving car code is not held to the same standards. Maybe not held to the same standards but is naturally of such a standard that you could expect that you would continuously deliver it.
Edith: It’s a sad story. I run a lot of Ultra marathons and a runner did a 100-mile race in Utah and then he was driving back by himself on the 80 and he just basically, to the best we could tell is he just fell asleep at the wheel and just literally on cruise control ran into the tractor trailer in front of him.
Edith: Maybe if we had smarter cars, he could have taken a nap instead of trying to drive back after being awake all night.
Paul: Wow, that’s a tragedy.
Edith: Then you hear cases like Volkswagen who had one of the biggest software cheats in the world.
Paul: Fortunately they are going to be fined 18 billion for it and are unlikely to do it again.
Paul: Unlikely. Yeah.
Edith: You just have to think about the logic that led them to believe that they could get away with this. What exactly they did was, in California when your emissions are measured, they turn on some special software that decreases the pollutants enough that they could pass. Then in normal driving conditions, they turn it off so they could get higher mileage and go faster.
Paul: I see. That’s lovely.
Edith: Somebody somewhere had to think that they could get away with it. It wasn’t a bug.
Paul: Continuous delivery clearly working there.
Edith: What? How?
Paul: It worked as intended. They weren’t intended to get caught but clearly that’s not the software’s fault.
Edith: Don’t blame the software, blame the human that wrote it.
Paul: Right, right, right.
Edith: Or that person’s manager, or their manager’s manager.
Paul: Or the Board. On the question of whether different kinds of software lead to different kinds of delivery processes: obviously. I often make comparisons about what level of testing you want. The level of testing that you want for consumer software, or if you are an early stage start-up that has no customers, is going to be different than for most start-ups, and different again than for most enterprise companies. The same goes for the tech stack: if you are going to be one of these web start-ups, you can use Ruby on Rails. If you are writing pacemakers, you want to be using software that has some kind of formal verification in it.
Edith: Then you get into the whole area of government or financial which has its own set of rules.
Paul: All their rules are around process which indicates that there is something to this process thing.
Edith: I talked to a prospect the other day who made kids’ apps and I had to turn them away, because there are so many rules around what data you can collect from children. That they can’t use a key, they can’t ask them for anything, and I said we don’t want to touch this.
Paul: That’s interesting. I will have to not ask any of my customers whether they’re building software for kids. Though we don’t keep any customer data so we would probably be fine, but best not to know I think.
Edith: It’s better not to know.
Paul: Yeah. Thanks for listening to this episode of To Be Continuous, brought to you by Heavybit and hosted by me, Paul Biggar of CircleCI, and Edith Harbaugh of LaunchDarkly.
Edith: To learn more about Heavybit, visit heavybit.com. While you are there, check out their library, home to great educational talks from other developer company founders and industry leaders.
What’s so great about continuous delivery? Find out in Episode #2 of “To Be Continuous,” a podcast brought to you by Heavybit about continuous delivery and software development. “To Be Continuous” is hosted by Edith Harbaugh, CEO and co-founder of LaunchDarkly, and Paul Biggar, founder of CircleCI.
Paul: Before continuous delivery, there’s often this idea that when you ship the code, you ship the product. There are rules about how one has to dogfood one’s own software.
Edith: The dirty secret of product management is there is no such thing as a professional product manager.
Paul: Mm-hmm. Right. Hi! I’m Paul Biggar, founder of CircleCI.
Edith: I’m Edith Harbaugh, CEO and co-founder at LaunchDarkly.
Paul: And you’re listening to To Be Continuous, a podcast about continuous delivery and software development.
Edith: You can get in touch with us at any time at our Twitter handle @continuouscast.
Paul: The show is brought to you by Heavybit. To learn more, visit heavybit.com, and while you’re there, check out their library, home to great educational talks from other developer company founders and industry leaders. In this week’s episode of To Be Continuous, Edith asks me, what are my favorite things about continuous delivery?
Edith: So Paul, what’s your favorite thing about continuous delivery?
Paul: So my role is partially product manager and partially engineer. And so, I have two favorite things about continuous delivery. One is the product manager stuff, which is that you’re able to ship a basically broken feature to one customer, two customers or whatever, and do your product validation and get validation early. Rather than after six weeks of building the perfect feature, you get the validation after three errors and you’ve got two half-written ones that will work for one customer.
The other side of it, as an engineer, is being able to roll out features without having to calculate every single edge case and every single customer possibility, and roll that out to a very small number of people. I guess it’s kind of validation as well, but it’s more technical validation rather than product validation. You’re able to find out, operationally, does this piece of code work, rather than having to do a rollback or be worried about, have I written enough tests? Have I got enough coverage? Feature flags and slow rollouts and that are kind of my personal favorite part of continuous delivery.
Edith: Yeah, it’s super frustrating. I was talking to a friend who spent literally $3,000 in server costs testing out whether this feature would scale, and then they shipped it to their customers. They found out that it didn’t matter because nobody wanted to use it. So they spent literally $3,000 on Amazon Web Services that was just thrown away.
Paul: And the largest cost that I think exists and people ignore this cost a little bit is the cost of engineers that are involved in shipping these things. So if you build a six-week feature that took one engineer six weeks, two engineers three weeks or however it splits out and you’re paying an engineer, let’s say in San Francisco, then you just spent $90,000 on that feature.
Edith: Oh, my gosh. And people don’t think of it that way.
Paul: Right. It’s the same as meetings. No one thinks that this meeting is costing us $15,000 an hour or $1,500 or whatever it is. People think like, “Oh, we need to get this done. We need to build this feature. It needs to be the right color blue. It needs to be right. It needs to be perfect,” rather than, “The cost of this is astronomical. How can we make this cheaper? And how can we get more bang for our buck?”
Edith: Yeah. I mean, to go back to Lean, it’s: how can we reduce waste? How can we cut all the fat out of this being built and make it more efficient? But a lot of the pushback I get around continuous delivery and Lean is like, “Why can’t you just build it right the first time?”
Paul: I think there’s a bunch of reasons you can’t build it right the first time, and the first of them is that the real world is complicated and the real world has error conditions that you don’t consider. I guess I’m talking from a technical perspective here, but basically, if you try to push any feature, you will immediately find that it doesn’t work for a bunch of customers for technical reasons. There’s exceptions that you weren’t expecting; clearly that’s why they’re called exceptions. There’s a bunch of bugs that are going out. There’s edge cases that you didn’t consider, and then from the product side of things, there’s exactly the same things. There’s edge cases you didn’t consider. There’s customers who think about the product in a different way than you think about the product. So “Can I build it right the first time,” I think, is sort of a naive position. It isn’t possible to build it right the first time no matter how prescient you are, no matter how good of a product visionary you are.
Edith: And what I found myself is that sometimes I would overbuild a feature and miss the mark. I thought people cared a lot more about something that, it turned out, they cared very little about. And back to what you said, that’s extremely expensive, not just in the raw cost of the engineer but in the opportunity cost of what you could have been building instead.
Paul: Right. Before continuous delivery, there was often this idea that when you ship the code, you ship the product. And so, people would have a marketing launch, or, well, a launch, but a launch is a marketing launch, and they would aim to have it coincide with getting the code into the code base, possibly for the first time, just before that launch. This is always the most disastrous thing I’ve ever seen.
Edith: I have my own stories but let me hear yours.
Paul: I don’t particularly have any great stories, just that I’ve seen it. I’ve seen people’s launches that are like, “Oh, this doesn’t work,” or people talking about their death marches and like, “We’re launching this in the morning. I need to get it finished tonight.” It should have been finished last week. Or it should have been finished last month, and in like five customers’ hands for the previous month. You should have validated. Why are you trying to launch a thing that isn’t validated?
Edith: Yeah. There’s this whole school of software of the death march, where the product managers, the marketers, and the engineers kind of go into this death spiral of artificial deadlines and must-ships.
Paul: Right. The two work — I was going to say well — the two work horribly together like the artificial deadlines, in particular, cause — is it artificial deadlines? It’s more, I think, the–
Edith: I’ll give you an example. I was working at an enterprise software company and we had this absolute must-ship of October 31st. All summer, it seemed a long way away until suddenly, it didn’t. It’s August, September, and we’re like, “Damn. It’s on top of us,” and really, nobody was actually waiting for October 31st. It was just that marketing didn’t want it to slip into November and then December and then have everybody be away. And we just shipped this completely half-baked thing which people could not migrate up to, because migration was literally the thing we threw away. It was like, “Okay. We don’t have time to build everything.” So we launched this thing that none of our base could use. It wasn’t like we had a half-baked half-migration. We literally said, “Fuck them. They can’t migrate.” I was in that meeting.
Paul: Right. This is awful.
Edith: And it nearly killed the company because we just burned our entire base.
Paul: So how often were your releases?
Edith: Oh, at that time, we were considered fast because it was once a year.
Paul: So clearly, if you were doing daily releases or weekly releases, there wouldn’t be any of these problems at all. You could have released the thing long before October 31st. It could have been behind a feature flag in June. There could have been ten customers in July. You could have had all ten customers saying, “Well, we’re not going to use this. There’s no migration path,” and then realized that the migration path was the most important thing.
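The timeline Paul sketches (code shipped dark in June, ten customers in July, launch later) could look something like the following. This is a minimal sketch, not any vendor's API; the `FeatureFlag` class, the flag name, and the customer IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FeatureFlag:
    """A flag that is off by default and on only for listed customers."""
    name: str
    allowed_customers: Set[str] = field(default_factory=set)
    on_for_everyone: bool = False

    def is_enabled(self, customer_id: str) -> bool:
        return self.on_for_everyone or customer_id in self.allowed_customers


# "June": the code is deployed but dark, so nobody sees it yet.
new_version = FeatureFlag("new-version")

# "July": switch it on for ten early customers (made-up IDs) and listen.
early_customers = [f"customer-{n}" for n in range(1, 11)]
new_version.allowed_customers.update(early_customers)

# Launch day is then a flag flip rather than a first deploy:
# new_version.on_for_everyone = True
```

With this shape, feedback such as "there's no migration path" arrives from ten customers in July instead of from the whole base in October.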
Edith: To what you said of the code as the product. We said it wasn’t. I think that’s why I’m such a big fan of continuous delivery because I think it forces you to be more realistic. There’s no mythical future out there. The future is now. The future is the people using your product.
Paul: I’ve seen this idea when I worked for a telecoms company that will remain nameless, that if you hit the deadline, then the fact that what you were shipping was a really bad idea wouldn’t be your responsibility.
Edith: Yeah. Garbage in, garbage out.
Paul: Something along those lines, yeah. So that marketing gives you an unreasonable deadline. Product management doesn’t push it up in front. The engineers are writing code according to someone else’s schedule. And if you write the code that goes there, then it’s not your fault that it’s going to be a disaster even though you see it be a disaster. And maybe, the people on the ground can see it being a disaster. The people at higher levels are sort of like, “I hope this isn’t a disaster and everyone thinks this is going to be awesome.”
Edith: The healthcare.gov story.
Paul: Exactly. Healthcare.gov is a prime example of this. And if there had just been a sort of a continuous delivery thing, let’s get this out in front of people, then what it builds is that the people on the ground end up, not necessarily on the same page, but with the same incentives as the marketers, the project managers.
Edith: Yeah. I visited a customer last week and they’d gotten LaunchDarkly, and it was kind of funny and a little sad when they said that what they were most excited about using feature flags for was just to show features to their executives.
Paul: Wow. So they would build things that the executives ask for only for the executives?
Edith: Well, no. Right now the executives have no early look at features. By the time they saw them, if they were bad, there was just all this–
Paul: That doesn’t sound so bad.
Edith: No. It’s exciting.
Paul: Yeah. That sounds good.
Edith: Yeah, because they wanted to get — because right now, the executives were at the very end of this long release process and they didn’t have any way to give them an interim peek.
Paul: There’s a big difficulty in terms of executives and management being involved in the product management process.
Edith: Yeah. I think it’s fascinating that you called yourself a product manager.
Paul: Why was that?
Edith: I mean, you’re also a founder and CEO…
Paul: Sure, sure. I’ve engineered things so that my role is far more product than, I don’t know, finance and management and that sort of thing. So the primary thing that I worry about now is, is the product good enough? How can we serve our customers’ use cases? And it ends up being a lot more like product manager than… Now, that is to say we have professional product managers who actually know what they’re doing, who help with this.
Edith: The dirty secret of product management is there is no such thing as a professional product manager.
Paul: Yeah, fair. But there’s definitely people who have done it before and know how to validate and prioritize and make sure that releases don’t slip and all that sort of thing. But the idea that an executive or that a PM would only see it towards the end is a really bad idea and the idea that they’re going to completely spec it out upfront is also a really bad idea. The secret of good product which most really good project managers know and it seems that very few executives know and also very few engineers know is that it’s successfully balancing the company goals with the actual dirty details of the implementation.
So there’re engineers who will tell you that they’re in the best position to build a product because they are actually in the code. They are in the weeds. They see how things can actually work. And there’re executives who will tell you that engineers have no idea how to build the product because they don’t understand the business case behind the product. And the truth is somewhere in the middle. The truth is that the business case is not everything, but it’s a lot. If you try to formulate the business case without understanding the real product metrics, how people are using the product, and how the code is architected and orchestrated, then you can’t possibly specify the product without that data.
Edith: Yeah. And I’ll also add that as an engineer, you could fall in love with your own product. We talked about this before that you always know your product the best and you’re always thinking of some way to make this little tiny bit better when it could be something that your actual users don’t care about.
Paul: Right. There’s a bunch of features in Circle that were basically implemented the wrong way and that was because — largely because I fell in love with a particular concept or an abstraction or way of doing things. It turned out to be just confusing to customers.
Edith: And I think the only fix for this is, as you said, to share or to talk to actual customers. I’ll give an example. I talked to a LaunchDarkly customer and I thought they were going to ask for a lot of stuff but what they actually really wanted was a Slack bot integration and it wouldn’t have occurred to us at all because we use HipChat but they wanted basically a chat bot so they could roll features out to different users right inside of Slack.
Paul: So, one, that’s an excellent idea. Our feature flags are all keyed from Slack. And I think that there’s a real danger to dogfooding. If you dogfood really, really well and you get a lot of insight from your product that way, then you don’t develop other techniques for getting feedback on your products and getting feedback from your customers.
Edith: You become convinced that you are the customer.
Paul: Right. And you’re never the customer.
Edith: You’re a customer.
Edith: So back to your product manager, you’re a customer with very specific needs.
Paul: And often, you are the most knowledgeable customer or, I mean, you are always the most knowledgeable. We have probably used features that none of our customers even know exist. There’s features that we built for other customers to use, but for ourselves first, behind a feature flag, that actually never got launched, and they’ve become part of our workflow. And customers are saying, “Oh, there’s some kind of thing. How do you solve this?” “How is that a problem? We don’t experience this problem. Oh, yeah, we never launched that feature.”
Edith: Well, Paul, I’m going to devil’s advocate you a little bit about some of the pushback I get at continuous delivery because when you’re talking about the customer right now, the classic one is Slack which came out of Stewart Butterfield. He used chat and wanted to build a better one.
Paul: I missed the — what was the question there?
Edith: So that’s held up as an excellent product and it was all dogfooding.
Paul: Yes. So I think these are different stages of the product. Here, I’ll give you an example. Slack has excellent onboarding. The Slack team onboarded once at most, and so, in order for that to have become the amazing product it is, someone, somewhere has a job of making sure that onboarding is awesome, and they are not dogfooding onboarding every day. The whole team is not dogfooding onboarding every day.
Edith: That would be kind of horrific if every day they have to come in and reinstall.
Paul: Exactly. And Slack has tons and tons of integrations and you add rooms and the experience to run all these things is really, really good but no doubt, Slack is only using one video integration. They’re only using — they’re not using Google Docs and Dropbox or — maybe that’s a bad example there. They’re not using the Microsoft tools and the Google tools or whatever. They’re not using multiple continuous integration services. And so, how do they know that the integrations with those services are amazing? They must have a process that lives outside of their own dogfooding.
Edith: Yeah. I totally agree. I think if you get too far from dogfooding, you end up with a product that nobody, not even the people who built it, loves, and that’s very dangerous. The joke is like, nobody uses the Microsoft Zune. I remember my friend who worked at Nokia, who carried around an iPhone because he didn’t want to use the Nokia phones.
Paul: One of the reasons that people start to use products... this happened a lot when I was at Mozilla. People wanted to use Chrome at Mozilla because, well, they felt that Chrome solved problems for them. In particular, memory usage was a problem, and for people who used a large number of tabs, Chrome was just a better solution at that time. But there was a loyalty aspect to it. You didn’t want to use Chrome because you wanted to be loyal to your company or to your team or to the mission that you believed in. And as a result, people don’t experience other products and they don’t see how much better the world could be and they just kind of get stuck into the, “We’re using this and we’re putting up with the shit in this because we have to.”
Edith: Stockholm Syndrome.
Paul: It’s a little bit Stockholm Syndrome, definitely. Every time you tell your customer a hack, that’s a small amount of Stockholm Syndrome. You’re saying there’s a way of doing this and the way of doing this is, well, it’s a little bit hacky.
Edith: Which is not always bad, because sometimes then a customer feels like they have this insider track. Customers like feeling like you’re giving them something special.
Paul: There are two sides to that. So at the start, yes, they love the specialness. I talked about this in the previous week, where you want your customer support team to be able to say, “Oh, thanks for the report. We just fixed that.” Or “Thanks for highlighting this use case. We just built it in the last ten minutes just for you.” And that’s great until you get about six months into using the feature and they’re still using the hack and they’re worried that it’s going to go away. And we had this with a customer recently. We built something for them, a special way to control notifications. And we didn’t want to build it into the UI because we’re going to change how the UI works, and so we put it somewhere that we didn’t document, and that was in an experimental section of their configuration. And then, they’re like, “Okay, this is in the experimental section. We’re worried that it’s going to go away. We’re a big customer. How do we make sure that this doesn’t go away on us?”
Edith: Yeah. That’s the way that a lot of people are using feature flags actually.
Paul: Go on.
Edith: So originally, we thought people would use feature flags just for rolling out a feature and, as you said in the beginning, for rolling it out to different users, verifying that things didn’t break, things didn’t implode, the dragons didn’t come down and set fire to the universe. What we found is that people were also using feature flags for really long-term controls, and it was precisely that use case. It’s like you have this customer who wants access to a feature and you want a way to know that these five or six customers have access. And that’s something that you want to know so that when you’re doing new updates, you don’t actually overwrite that, but it’s not something you want at code level. It’s something you want at a higher level.
Paul: One of the dirty secrets of enterprise software is that you have dirty hacks for customers.
Edith: Oh, my gosh. Everywhere. Everywhere.
Paul: So many people have told me about the “if” statements that they have: if customer equals Google or whoever, then we’re going to do it this way. And feature flags provide a nice little abstraction above that.
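As a rough illustration of that abstraction, here is the hard-coded customer check next to the same decision routed through a flag lookup. The flag name, the `FLAGS` table, and the customer IDs are hypothetical and not taken from any real product.

```python
from typing import Dict, Set

# Hypothetical flag table: flag name -> customers the special case applies to.
FLAGS: Dict[str, Set[str]] = {
    "legacy-invoice-format": {"google"},
}


def invoice_format_hardcoded(customer_id: str) -> str:
    """The 'dirty hack' version: the customer name is buried in the code."""
    if customer_id == "google":
        return "legacy"
    return "standard"


def flag_enabled(flag_name: str, customer_id: str) -> bool:
    """True if the named flag is switched on for this customer."""
    return customer_id in FLAGS.get(flag_name, set())


def invoice_format(customer_id: str) -> str:
    """Same decision, but the special case is named and lives in one place."""
    if flag_enabled("legacy-invoice-format", customer_id):
        return "legacy"
    return "standard"
```

Both functions behave the same today. The difference is that the second one gives the special case a name anyone can look up, and adding another customer is a data change rather than a code change.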
Edith: And the abstraction is, and this is what our customers are using it for, that then at least everybody else knows that there’s this thing in there.
Paul: So then, you also get metrics around that abstraction, so when you decide to sunset something, you should be sunsetting these features or finding better ways of doing it or something that actually solves their use case. You should be moving them away from that abstraction, and having a counter, a graph, a recency indicator of when these features are used is essential to actually getting those features sunsetted.
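A counter and a recency indicator of the kind Paul mentions could be sketched like this. The `TrackedFlag` class and the 90-day idle threshold are assumptions for illustration; a real system would likely persist these numbers and graph them.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


class TrackedFlag:
    """A per-customer flag that records how often and how recently it fired."""

    def __init__(self, name: str, customers: Iterable[str]) -> None:
        self.name = name
        self.customers = set(customers)
        self.hits = 0                               # the "counter"
        self.last_used: Optional[datetime] = None   # the "recency indicator"

    def is_enabled(self, customer_id: str,
                   now: Optional[datetime] = None) -> bool:
        enabled = customer_id in self.customers
        if enabled:
            self.hits += 1
            self.last_used = now or datetime.now()
        return enabled

    def is_stale(self, now: datetime,
                 max_idle: timedelta = timedelta(days=90)) -> bool:
        """A sunset candidate: never used, or unused for longer than max_idle."""
        return self.last_used is None or now - self.last_used > max_idle
```

A flag that reports stale is a candidate for removal; one that still fires daily means the customer needs a proper replacement before the hack goes away.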
Edith: Yeah, or even out in the light. What would happen in enterprise software is you’d have one config file somewhere that one person knew about, and that person would leave.
Paul: Oh, wow.
Edith: And then, all of a sudden, everything would break and nobody even knew why, or like stuff would be always implemented at the wrong layer no matter where it’s implemented, it’s always at the wrong layer. So I’m going to play devil’s advocate again. The pushback I get is we shouldn’t ship buggy software.
Paul: I would argue with that. All software is buggy. Every piece of software that you ship is buggy. You just don’t know what the bugs are yet.
Edith: So we put people on the moon. There were no bugs in that.
Paul: I mean, if you want to spend ten years, an entire decade, shipping your feature, you can ship it without bugs. If you’re in a thing that can tolerate... well, let me put it this way. There was the Challenger disaster. There are the multiple car recalls that are happening at the moment. So even people who have incredibly long cycles with strong validations and static analysis and all these things, even they can’t ship non-buggy software. Nobody can ship non-buggy software. The only reliable way of getting software that is in any way reliable is to ship an early version which is known buggy, and if you don’t ship an early version that’s known buggy, you’re going to ship a late version that’s unknown buggy. And when you ship that early version, you find out what actually happens in practice.
So I shipped a feature last week. It was a long one. It took a lot of building, and we shipped it behind a feature flag and all the tests passed and everything, and I just turned it on for one customer, which is us, on one branch, and there was an immediate obvious bug. This is great because I didn’t ship it to 10,000 customers to discover an immediate obvious bug. I shipped it to one customer and that customer is me and I can go fix the bug now before we ship it to any more customers.
Edith: Yeah. I mean, we did the same at LaunchDarkly. We were shipping something new and we do absolutely everything with feature flags because we have to. And we found this–
Paul: There are rules about how one has to dogfood one’s own software.
Edith: Well, there was a rule and then we got really busy and we got very sloppy and we stopped doing it. And then, we tried to do a big release and it wasn’t a disaster–
Paul: I find it hilarious when continuous delivery companies do big releases. I mean, everyone does it at some point but it’s still funny.
Edith: Well, for us, it wasn’t huge. I think it was like two weeks of accumulated stuff, which for us was quite big and then–
Paul: You don’t deliver every day?
Edith: It depends on what we’re working on. So we do mini pushes but most stuff is behind the feature flag.
Paul: Of course, yeah.
Edith: So the two-week release was kind of our come to Jesus moment of like, “Wow! That was a big pain.” Because we hadn’t feature flagged stuff adequately. We had to spend a lot more time in QA which was — it was stressful.
Paul: Any time that there was real stress in the product team, it was when someone had built a feature without feature flags and someone said, “I’m not sure about this. Can we maybe test it? Can we push it out to only select customers?” And these were things that were big product changes, so people had been working on them for weeks and were very happy with them and really wanted to get them out there. And so it led to a lot of frustration, led to just people being generally unhappy and harsh words being said and–
Paul: I mean, harsh but still professional. Harsh as in like, “I really want to ship this today.” “I really feel that you shouldn’t ship this today.” That level of harsh.
Edith: If you’ve ever been in enterprise software, there are many harsh words said and usually they’re along the lines of, “This customer promised us $750,000 this quarter. Where the fuck is my release?”
Paul: Right. So there weren’t quite those harsh words because there wasn’t any customer who promises $750,000. If there had been, then I’m sure, words could have been harsher.
Edith: I like what you say. When you descale the stakes, it gets a little bit more civilized.
Paul: Right. Developers get frustrated when they can’t ship their stuff. The one thing that causes people to be really unhappy is the idea that they’re blocked behind a thing, and they get frustrated.
Edith: Yeah. I met some guys from England. They’re from Geckoboard. They’re here for a Lean meetup. And they took it to extremes. They said they broke everything down into pieces of work a day long and no longer.
Paul: So they have to ship at the end of the day?
Edith: I don’t think it was at the end of the day, but it couldn’t be more than a day’s worth of work. And they said this was very good for morale because everybody always saw their stuff continually getting out.
Paul: And they ship behind the feature flags?
Edith: Yeah. They have built their own system.
Paul: I guess they must just have gotten good at building a day’s unit of work.
Edith: That’s what they said. They said that they were very much into kanban and they just tried really hard to scope it that way.
Paul: Got you. So at the start, it was difficult and they just have gotten into the groove and–
Edith: Well, for employee morale, for the reasons you just gave. It’s funny because they were so bought into it. If I told them about the old days where releases took years, I think their heads would have just popped off in a very British way.
Paul: I like this idea of one day. I was reading about — this was Spotify. Spotify? No, actually this wasn’t Spotify. Whoever it was, they did — every project was a two to ten person team and two to ten weeks. Those were the rules of how things got shipped.
Edith: Yeah, Yammer did the same in the beginning.
Paul: That was Yammer.
Edith: Oh, it was?
Edith: One of the project managers, Ron, he did a guest post on our blog about how methodical they are, just everything must fit in — must fit, must be data-driven, must be hypothesized.
Paul: Oh, the hypothesis. This is an interesting thing. So every project would have a hypothesis for what the data was going to show?
Edith: So you can read more on my blog. He says it better than me, but he talked about how they were trying to get people to upload more pictures to Yammer, and the hypothesis was that if there were more pictures on Yammer, there would be more engagement. So they did a lot of improvements on their photo uploader to make it a lot easier to upload photos, and they let the experiment run, and no more photos were uploaded. So they were going to not ship the new feature just because it didn’t improve days engagement, but they finally did ship it just because it got rid of some technical debt. But literally, if something doesn’t move their engagement, they don’t ship. No matter what the feature is, if it doesn’t improve engagement…
Paul: It’s interesting for consumer companies that have to focus on engagement like that, because I think that there’s a lot of B to B companies who would very sternly say engagement isn’t as important as customer experience, or something that’s qualitative rather than quantitative.
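The "hypothesis first, ship only if the metric moves" rule Edith describes can be sketched in a few lines. This is a deliberately simplified illustration and not Yammer's actual process: the function names, the example numbers, and the zero-lift threshold are assumptions, and a real experiment would also check statistical significance.

```python
def relative_lift(control: float, treatment: float) -> float:
    """Relative change of the treatment group's metric versus control."""
    return (treatment - control) / control


def should_ship(control: float, treatment: float,
                min_lift: float = 0.0) -> bool:
    """Ship only if the experiment moved the metric by more than min_lift."""
    return relative_lift(control, treatment) > min_lift


# Made-up numbers: average photos uploaded per user in each group.
print(should_ship(control=2.0, treatment=2.0))   # no change in the metric
print(should_ship(control=2.0, treatment=2.5))   # metric went up
```

Writing the rule down as a function makes the decision mechanical, which is the point of fixing the hypothesis before the data comes in. Exceptions, like shipping anyway to retire technical debt, then have to be made explicitly.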
Edith: Yeah, they’re very strict and they’re B to B, Yammer’s kind of… Yeah.
Paul: It’s definitely on the line. It’s mass B to B; it has more B to C characteristics, I think, than B to B.
Edith: Yeah, but it was interesting because they are so very strict about it, and they actually have the analytics group as an entirely separate department from product and engineering, to keep them honest.
Paul: Oh, interesting. It’s very easy to skip the rule when it’s, “We know this one is going to work. Let’s just ship this. We don’t need to test it. We don’t need to validate it.”
Edith: Or we shipped it then we tested it and it didn’t improve or maybe actually made stuff a little bit worse but we have this sunk cost of we already built it so it degraded engagement by 5% but what’s 5%?
Paul: Right. So I think those are definitely my favorite things about continuous delivery: the product validation and the technical validation.
Edith: Yeah, just doing it quicker.
Paul: Thanks for listening to this episode of To Be Continuous brought to you by Heavybit and hosted by me, Paul Biggar of CircleCI, and Edith Harbaugh of LaunchDarkly.
Edith: To learn more about Heavybit, visit heavybit.com. While you’re there, check out their library, home to great educational talks from other developer company founders and industry leaders.