03 Apr 2018

Changing Your Engineering Culture Through People, Process and Technology

Guest post by Isaac Mosquera, Armory CTO and Co-Founder.

My co-founder DROdio likes to say, “Business runs on relationships, and relationships run on feelings.” It’s easy to misjudge the unseen force of human emotions when changing the culture of your organization. Culture is the operating system of the company; it’s how people behave when leadership isn’t in the room.

The last time I was part of driving a cultural change at a large organization it took over three years to accomplish with many employees lost—both voluntary and involuntary. It’s hard work and not for the faint of heart. Most importantly, cultural change isn’t something that happens by introducing new technology or process; containers and microservices aren’t the solution to all your problems. In fact, introducing those concepts with a team that isn’t ready will just amplify your problems. 

Change always starts with people. When I was at that large organization, we used deployment velocity as our guiding metric. The chart below illustrates our journey. The story of how we got there was riddled with missteps and learning opportunities, but in the end we surpassed our highest expectations by following the People → Process → Technology framework.

People

Does your organization really want to change? It’s likely that most people in your organization don’t notice any major issues. If they did, they probably would’ve done something about it. 

In that big change experience, I observed engineering teams that had no automated testing and no code reviews, and that suffered from outages, infrequent deployments and, worst of all, an ops vs devs mentality. There was a complete absence of trust, masked by processes that would make the DMV look innovative. To deploy anything into production, it took 2-3 approvals from Directors and VPs who were distant from the code.

This culture resulted in missed deadlines; creating new services in AWS took months due to unnecessary approvals. We saw a revolving door of engineers, losing 1-2 engineers every month on a team of ~20. There was a lack of ownership: when services crashed, the ops team would blame engineers. And we experienced a “Not Invented Here” mentality that resulted in a myriad of homegrown tools with Titanic levels of reliability.

In my attempt to address these issues, I made the common mistake of trying to solve them before I got consensus. And I got what I deserved: I was attacked by upper management. From their perspective, all of this was a waste of their time since nothing was wrong. By extension, in asking to change any of these processes and tools, I was attacking their leadership skills.

Giving Your Team A Voice

The better approach was to fix what the team thought was broken. While upper management was completely ignorant to these issues, engineers in the trenches were experiencing the pain of missed deadlines, pressure from product and disappointed customers. For upper management, ignorance was bliss. 

So we asked the team what they thought. By surveying them, we were able to quantify the problems. It was no longer about me vs upper management. It was the team identifying their biggest problem and asking for change—the Ops vs Devs mentality. We were one team, and we should act like it. 

So what are some of the questions you should ask in your survey? Each organization’s surveys should be tailored to its needs. But we recommend open-ended questions, since multiple choice questions typically result in “leading the witness”. Some questions you might want to consider include:

  • What are the biggest bottlenecks taking up most of your time?
  • What is stopping you from collaborating with your teammates?
  • Are you clear on your weekly, monthly and yearly objectives?
  • If you were the VP of Engineering what would change?
  • What tools are you using to deploy?

After summarizing your survey feedback, you’ll have a rich set of data points that nobody can argue with because they would just be arguing with the team.

Picking Your One Metric

Now comes the hard part: selecting a single metric that represents what you’re trying to change. The actual metric doesn’t matter much; what matters more is that it’s a metric your team can rally around.

For us, the “Ops vs Devs” mentality was causing significant bottlenecks, so we chose number of deployments as our guiding metric, or as we called it, “deployment velocity”. When we started measuring this, it was five deployments per month. After a year of focusing on that single metric we increased it to an astounding 2,400 deployments per month.
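To make the metric concrete, here is a minimal sketch of how deployments per month could be counted from a deployment log. The function names and the sample dates are hypothetical and only for illustration; the two monthly figures passed to `growth_factor` are the ones quoted above, which work out to a 480-fold increase.

```python
from collections import Counter
from datetime import date


def deployments_per_month(deploy_dates):
    """Count deployments per (year, month) from an iterable of dates."""
    return Counter((d.year, d.month) for d in deploy_dates)


def growth_factor(before, after):
    """How many times larger the later monthly count is than the earlier one."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


# Hypothetical deployment log: one date per production deployment.
log = [date(2016, 1, 4), date(2016, 1, 19), date(2016, 2, 2)]
monthly = deployments_per_month(log)
print(monthly[(2016, 1)])       # deployments recorded in January 2016
print(growth_factor(5, 2400))   # the change described above, as a multiple
```

Any source of deployment timestamps would do here (CI job history, release tags, and so on); what matters is that the count is cheap to produce and visible to the whole team.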

Process

If culture is a set of values shared by your organization, then your processes are a manifestation of those values. Once we understood that the “Ops vs Devs” culture was the focus, we then wanted to understand why it was there in the first place.

As it turns out, years earlier there were even more frequent outages, so management decided to put up gates or a series of manual checklists before going to production. This resulted in developers having their access to production systems revoked because they were not trustworthy.

Trusting Your Developers

I don’t understand why engineering leaders insist on saying, “I trust my developers to make the right decisions,” while at the same time creating processes that prevent them from doing so and that make talented engineers unhappy. To that end, we began by reinstating production access for all developers. But this also came with a new responsibility: on-call rotation. No longer was the operations team responsible for developer mishaps. Everyone was held responsible for their own code. We immediately saw an uptick in both deployments and new services created after making that change in process.

Buy First, Then Build

We also made the decision to buy all the technology we could so development teams could focus on building differentiated software for the company. If anyone wanted to build in-house, the decision and process had to be dramatically cheaper than buying. This process change not only had an impact on our ability to add value to the company but it actually made developers much happier in their jobs.

A True DevOps Culture

What formed was a new team that was truly DevOps. The team was created by developers who were genuinely interested in building software to help operationalize our infrastructure and software delivery. New processes were created to get the DevOps team involved whenever necessary, but they alone were not responsible for the SLAs or uptime of developer applications. A partnership was born.

Technology

Too many engineers like to start a process like this by finding a set of new and interesting technologies, and then forcing them onto their organization. And why not? As engineers we love to tinker. In fact, I made this very mistake at the beginning. I started with technical solutions to problems only I could see, and that got me nowhere. But when I started with people and process first, the technology part became easier.  

Rewriting Our Entire Infrastructure

Though things were getting better, we weren’t out of the woods yet. We lost a core operations engineer who didn’t share the process to deploy the core website! We literally couldn’t deploy our own website. This was a catastrophic failure on the engineering team, but it was also a blessing in disguise. We were able to ask the business for 6 weeks to fix the infrastructure so we would never be in this position again. We stopped all new product development. It was a major hit to the organization, but we knew it had to get done. We ultimately moved everything to Kubernetes and had massive productivity gains and cost reduction.

Replace Homegrown with Off-the-Shelf

In this period of time we also moved everything we could into managed services from AWS or other vendors. The prediction was that the bill would be incrementally larger, but in the end we actually saved money on infrastructure. This move allowed the team to focus on building interesting and valuable software instead of supporting data infrastructure like Cassandra and Kafka.

Continuous Deployment and Feature Flagging

We also decided to rewrite our software delivery pipeline and depend heavily on Jenkins 2.0, mostly because a suitable solution like Spinnaker wasn’t available at the time. We still got to throw away much of our old homegrown tooling, since the original developer was no longer there to support it. And while this helped us gain velocity, we ultimately needed safer deployments: when our SLAs started slipping, we were exposing our customers to too much risk. To address that issue, we built our own homegrown feature flagging solution. This, too, was because off-the-shelf tools like LaunchDarkly didn’t exist at the time (recall our preferred approach to build vs buy). The combination of the tooling and process allowed us to reach a deployment velocity that surprised even ourselves.

Results

The chart below speaks for itself. Each new color is a new micro-service. You’ll notice we started with a few large monoliths and then quickly moved to more and more microservices because the friction to create them was so small. In that time, deployment velocity went way up! Most importantly the business benefited too. We built a whole new revenue stream due to our newfound velocity.

Approaching these changes from the people side first made the rest of this transition easier and more enjoyable: our team was motivated throughout the entire journey. In the end, the people, process and tools were all aligned to achieve a single goal: Deployment Velocity.

Isaac Mosquera is the CTO and Co-Founder of Armory.io. He likes building happy & productive teams.

16 Mar 2018

Hypothesis Driven Development for Software Engineers

Last week I attended the QCon London conference from Monday to Wednesday. It was a thoroughly interesting and informative three days. The sessions I heard ranged from microservice architectures to Chaos Engineering, and from how to retain the best people to new features of the Windows console. But there was one talk that really stood out to me—it took a hard look at whether we are Software Developers or Software Engineers.

“QCon London is a conference for senior software engineers and architects on the patterns, practices, and use cases leveraged by the world’s most innovative software shops.”

QCon London describes itself as a software conference, not necessarily a developer conference. It focuses more on the practices of creating software as opposed to showing off the latest and greatest frameworks and languages, and how to work with them. This came through in the talks I attended where most showed very little code and focused on how we work as teams, how we can interact with tools, and the idea that how we treat our code can have a huge impact on what we ultimately deliver.

With that in mind I went to a talk titled “Taking Back ‘Software Engineering’” by Dave Farley. I wanted to understand the differences he sees between being a software developer and an engineer, and learn how those differences can help create better code. During his 50-minute presentation, Farley outlined three main phases of production we have gone through. The first was craft, where one person builds one thing. The next was mass production, which produced a lot of things but wastefully resulted in stockpiles of products that weren’t necessarily used. The final type of production was lean mass production and Just In Time (JIT) production. This is the most common form of production today and is possible because of tried and tested methodologies ensuring high quality and efficient production lines. JIT production depends on a sound application of the Scientific Method to enable incremental and iterative improvements that result in a high-quality product at the end.

Without this JIT production approach and the Scientific Method, Farley pointed out that NASA would never have taken humans to the moon and back. It is only through robust and repeatable experiments that NASA could understand the physics and engineering required to build a rocket that could enter Earth’s orbit, then the moon’s, land humans on the moon, and then bring them back to Earth. It’s worth noting that NASA achieved this feat within 10 years of when President John F. Kennedy declared the US would do it—at which point NASA had not yet even launched an orbital rocket successfully.

Farley argued that engineering and the application of the Scientific Method have led to some of humanity’s greatest achievements, and yet when it comes to software there is a tendency to shy away from the title of “Engineering”. For many, the title brings with it ideas of strict regulations and standards that hinder or slow creativity and progress rather than enable them. To demonstrate his point Farley asked the audience, “How many of you studied Computer Science at university?” Most of the room raised their hands. He followed up with, “How many of you were taught the Scientific Method when studying Computer Science?” There were now very few hands up.

Without a scientific approach, software development is perhaps an industry that follows a craft-like approach to production, where because it works, it’s good enough. For example, if a software specification was given to several organisations to build, one could expect widely different levels of quality in the product. However, the same cannot be said of giving a specification for a building, car or rocket to be built. The results could appear different but would all be quality products based on rigorous tests and standards.

Farley went on to talk about how Test-Driven Development and Continuous Delivery are great at moving the software industry to be more scientific and rigorous in its testing standards. Though they are helping the industry to be better at engineering, there is perhaps another step needed—Hypothesis-Driven Development (HDD)—to truly move the industry to being one of engineers instead of developers.

Through HDD, theories would be created with expected outcomes before the software development aspect was even considered. This allows robust testing to be done further down the line: if the hypothesis stands up to the testing, it can be concluded that it appears to be correct. Further testing of the same hypothesis could be done too, allowing for repeatable tests that demonstrate the theory to be correct. The theories could be approached on a highly iterative basis, following an MVP-like approach; if at any point the theory no longer holds up, work on that feature could be stopped.

The theories wouldn’t need to come from developers and engineers themselves, although they could, but could come from other aspects of the business and stakeholders who request work to be done on the products being built. This would result in more accountability for what is being requested with a clear expectation around the success criteria.

Whilst I and my colleagues apply some of these aspects to the way we work, we don’t do everything and don’t approach working with software with such a scientific view. I can see clear benefits to using the Scientific Method when working with software. When I think about how we might better adopt this way of working I am drawn to LaunchDarkly.

We use LaunchDarkly at work for feature rollouts, changing user journeys and for A/B testing. The ease and speed of use make it a great tool for running experiments, both big and small. When I think about how we could be highly iterative with running experiments to test a hypothesis, LaunchDarkly would be an excellent way to control that test. A feature flag could be set up for a very small test with strict targeting rules, and if the results match or exceed the hypothesis then the targeting could be expanded. However, if the results are not matching what was expected, then the flag could be turned off. This approach allows for small changes to be made, with a minimal amount of time and effort being spent, and for useful results to be collected before any major investment is made in developing a feature.
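As a rough sketch of that loop, the snippet below models a hypothesis as an expected success rate attached to a flag, serves the feature to a small percentage of users, and then either expands or switches off the rollout depending on what was observed. This is not the LaunchDarkly SDK: `Experiment`, `in_rollout` and `next_rollout` are hypothetical names, the hash-based bucketing is just one common way to get a stable percentage split, and doubling the rollout on success is an arbitrary assumption.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Experiment:
    """A hypothesis plus the flag settings used to test it (all names hypothetical)."""
    flag_key: str
    expected_rate: float   # success rate the hypothesis predicts
    rollout_percent: int   # share of users who currently see the feature


def in_rollout(flag_key: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets; serve the first `percent`."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent


def next_rollout(exp: Experiment, observed_rate: float) -> int:
    """Expand the test if results match or exceed the hypothesis, else turn it off."""
    if observed_rate >= exp.expected_rate:
        return min(100, exp.rollout_percent * 2)  # assumed policy: double on success
    return 0


exp = Experiment(flag_key="new-checkout", expected_rate=0.10, rollout_percent=5)
print(next_rollout(exp, observed_rate=0.12))  # hypothesis held: widen the test
print(next_rollout(exp, observed_rate=0.04))  # hypothesis failed: flag off
```

Because the bucketing is deterministic, the same user keeps seeing the same variant across requests, which keeps the observed results comparable from one iteration to the next.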

I found Farley’s talk at QCon London to be an interesting and thought provoking look at how I could change the way I work with software. I’m now thinking about ways to be more scientific in how I approach my work, and I think LaunchDarkly will be a very useful tool when working with a Hypothesis Driven Development approach to software engineering.

04 Dec 2017

LaunchDarkly, #1 Feature Management Platform, Gets $21M in Series B Funding

As founder and CEO of the leading feature management platform, I’ve seen how our customers use LaunchDarkly to help them innovate more quickly, reduce risk, and break down the barriers between developer, product, marketing, and sales. Ultimately, LaunchDarkly helps our customers around the world help their own customers succeed. We take the feature flagging platform that the biggest tech giants (Facebook, Amazon, Netflix, Google) build in-house, and provide it as a service for everyone else. Thanks to a tremendous response—we serve over 10 Billion (with a B) features EVERY DAY— I’m pleased to announce that we’ve raised $21M in our Series B to accelerate our own growth in engineering, customer success, and category education.

Feature flagging/toggling is a deceptively simple idea: by separating code deployment from release with a “flag” or “toggle”, companies can control who gets what feature in their software. LaunchDarkly allows customers to manage their feature flags at scale, giving them the in-house platform that the big tech giants have. Companies start by using LaunchDarkly for an initial “dark launch” by selectively releasing a feature to a group of their customers. This is something tech giants like Facebook and Netflix do constantly, and it allows them to manage what features we see and use in their products with minimal risk to their business. Once they become comfortable with our platform and services, the product team is able to use LaunchDarkly feature flags for fast feedback, the marketing team can use LaunchDarkly for betas and launches, and sales can use us for contract management.

LaunchDarkly is a unified platform where developer, product, marketing, sales, and customer success teams can manage code in real time. Our three main types of customers are:

Disruptors
These startups want the same feature management superpowers Facebook and Netflix have. We’ve worked with startups as they’ve grown from four people to thriving successful businesses, like Troops.AI. They’ve used us for every feature, usually starting with risk mitigation, then moving into limited rollouts, and then allowing everyone in the business to control their own features. As one startup company’s CEO told me, “We originally started using you as an ‘oh shoot’ button, now we use you everywhere.” Another VP Product said, “Using LaunchDarkly for feature development is like night and day.”

Transitioners
These customers built their own feature management infrastructure and are tired of maintaining it. When companies like Lanetix and a leading ecommerce car buying portal made the switch to our platform, suddenly their developers could do all the things they wanted, like role based authorization and complex rule sets. And what’s more, the rest of the company can also use our tools. Now developers can focus on building and the entire company benefits from access to control and flexibility. When I was in Australia, the company told me “now when we build a feature, everyone asks ‘did you LD it?’”

Innovators
These modernizing companies know they need to move faster to innovate. They are at the forefront of their industry and know that constantly iterating will help them stay competitive. Last year, I gave a talk at NDC Sydney on how to use feature flags. An engineer from a huge IoT conglomerate immediately asked me for a demo and became a customer. This year, he gave a talk on how he’d moved from annual releases to weekly releases.

It’s extremely gratifying for our entire team at LaunchDarkly to see how much customers rely on us to run their own businesses. Customers small and large are looking to us not just as a developer tool, but as a platform that their entire company can use to deliver functionality to the right person at the right time.

While we were in Sydney, a customer’s CEO sent me a personal thank you note for a sales person visiting them and educating them on how best to use feature flags. If you’ve been in enterprise developer software, like I have, you know that usually the reaction to a salesperson visit is not kudos. However, our customers view us as a trusted advisor for expertise in feature management. I am so proud of our team and I hope our funding will help us continue to be the #1 trusted feature flag management platform, as well as invest in more education for our customers and broader market. Want to join our team? We’re hiring!

We found perfect partners in Scott Raney (Redpoint) and Jonathan Heiliger (Vertex). Scott has been a long time friend of LaunchDarkly, giving me advice and guesting on my podcast, “To Be Continuous”. Jonathan is a cloud infrastructure pioneer who is very familiar with the value LaunchDarkly provides from his time at Facebook. I’m looking forward to working closely with them both through the next chapter of LaunchDarkly.

So what’s next? LaunchDarkly has an incredibly broad base of cross-industry customers, from banking to insurance to shipping to ecommerce to hardware. The appeal of feature management is truly game-changing. Instead of code being a static object that’s changed only once a year or quarter, suddenly, code is a living, evolving power. Developers can build, marketing can launch, product can iterate, and sales can sell. Equipping businesses with the ability to move at the speed of every deploy allows an entire company to learn rapidly, deliver value to their customers faster, and produce more value. With this funding, we hope to support more customers and teach the world that there is a better way to build software—feature flagging.

*Header image credit: NASA astronaut Sunita Williams, Expedition 32 flight engineer.

12 Apr 2017

How Spinnaker and Feature Flags Together Power DevOps

It’s very common for customers to be excited about both Spinnaker (a continuous delivery platform) and feature flags. But wait, aren’t they both continuous delivery tools? Yes, they share the same goal: getting code quickly, in a repeatable, non-breaking fashion, from the hands of the developers into the arms of hopefully excited end users, with a minimal amount of pain and heartache for everyone along the toolchain. But they solve different parts of that problem:

  • Spinnaker helps you deploy functionality to clusters of machines.
  • Feature Flags help you connect that functionality to clusters of USERS.

Spinnaker helps with “cluster management and deployment management”. With Spinnaker, it is possible to push out code changes rapidly, sometimes hundreds (if not thousands) of times a day. As Keanu Reeves would say, “Whoa.” That’s great! All code is live in production! Spinnaker even has handy tools to run red/black deployments where traffic can be shunted from cluster to cluster based on benchmarks. Dude! For those who remember the “Release to Manufacturing” days where binaries had to be put on an FTP server (and hope that someone would download and install them in the next quarter or so), code being live within a few minutes of being written is amazing. For those who remember “master disks” and packaged software, this is even more amazing.

Nevertheless, with dazzling speed comes another set of problems. All code can be pushed anytime. However, many times you do not want everyone to have access to the code: you want to run a canary release on actual users, not just machines. You might want QA to try your code in production, instead of on a test server with partial data. If you’re a SaaS product, you might want your best customers to get access first to get their feedback. For call center software, you want to have an opportunity to test in a few call centers. You might want to have a marketing push in a certain country days (or weeks or months) after another country. You might want to fine-tune the feature with some power users, or see how new users react to a complicated use case. None of these scenarios can be handled at the server level. This is where feature flags come in. By feature flagging, you can gate off a code path, deploy using Spinnaker, and then use a feature flag to control actual access.
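A minimal sketch of what gating a code path can look like is shown below. It uses a hypothetical in-memory `Flag` class rather than any real SDK, and the targeting attributes (`id`, `country`) are assumptions chosen to mirror the QA and per-country scenarios above. Both code paths ship in the same deployment; the flag decides which one a given user reaches.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    """A hypothetical in-memory flag with simple targeting rules."""
    enabled: bool = False                               # master switch
    allowed_users: set = field(default_factory=set)     # e.g. QA or best customers
    allowed_countries: set = field(default_factory=set) # e.g. staged country launch

    def is_on_for(self, user: dict) -> bool:
        if not self.enabled:
            return False
        return (user.get("id") in self.allowed_users
                or user.get("country") in self.allowed_countries)


def render_dashboard(user: dict, flag: Flag) -> str:
    # Both code paths are deployed; the flag decides which one a user reaches.
    if flag.is_on_for(user):
        return "new dashboard"
    return "old dashboard"


flag = Flag(enabled=True, allowed_users={"qa-1"}, allowed_countries={"AU"})
print(render_dashboard({"id": "qa-1", "country": "US"}, flag))  # targeted QA user
print(render_dashboard({"id": "u-7", "country": "US"}, flag))   # everyone else
```

Widening the release then becomes a data change (adding a country or a user to the rules) rather than another deployment.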

Together, Spinnaker and feature flagging make an amazing combination. You can quickly get code to “production”, and from there decide who gets it, when.

21 Mar 2017

A new chapter

Having spent the past two plus years in the consumer space at Good Eggs, my first week at LaunchDarkly has been an absolute whirlwind experience. From getting acquainted with the development process to meeting our amazing team, the on-boarding process has been, albeit daunting, very warm and welcoming.

Getting up to speed

As part of getting up to speed in my first week, I’ve been diving into the stack. John has been a fantastic pair and resource, not only for the current lay of the land, but also for providing valuable historical context around the product’s evolution. I still have a lot to learn, but am feeling better equipped every day.

Back and the future

Before joining the team at LaunchDarkly, I was particularly impressed with the product and its evolution. Feature flagging, to me, permeates beyond a software development process to defining how organizations—really people within organizations—work together to deliver the best possible experience to the end user. After seeing how LaunchDarkly works from inside and with its customers, I’m happy to report that I was not only right, but there is so much more to it!

I’m looking forward to working on this empowering, impactful product with our incredible team and can’t wait to be a part of LaunchDarkly’s future.

14 Mar 2017

My Agile Launch

Starting at a company that helps software teams release faster with less risk has reminded me of my first foray into agile development.

One of my earliest professional experiences was as an intern at HBO, where I reported directly to the VP of Emerging Technologies.  Over the course of the summer, it became clear that there were more promising new technologies to explore than there were engineers.  The undaunted college student that I was, I convinced my boss that I should own one of these projects.  Every day I would demo my work on a screen in his office, and he would provide feedback.  A few weeks later the VP presented to senior management and the company officially green-lit the project.  Though we didn’t refer to it as such, that cycle was my introduction to Agile.  Yet more importantly, it was a daily routine where a non-technical business stakeholder provided direct feedback to a technical resource; an arrangement that I’ve come to realize is quite uncommon.

Working at large and small companies in a variety of engineering roles, there are two overall trends that I’ve identified:

  1. Companies often separate engineering from the business side.
  2. Most development-related failures (e.g. missed deadline, bad or buggy feature, release causing a security vulnerability, etc.) are a result of miscommunication.

Neither of these statements is profound in isolation, but when coupled they genuinely pique my curiosity.

Over time the broader engineering community has developed numerous tools and processes to mitigate risk.  Missing deadlines?  Use story points to measure team output (velocity) over time.  Releasing buggy features?  Try test-driven development.  Want to avoid downtime during a release?  Set up application performance monitoring in your staging environment.  While these “quality assurance” measures are not guarantees of perfection, they make development more predictable.

Problems arise when we cut corners in response to misaligned expectations.  Let’s say there’s a feature request for a relatively straightforward user enhancement.  The development team has done everything right to this point (features properly designed and scoped), but towards the end of the sprint the team finds a scalability issue with no clear path forward.  Engineering management notifies the business side of this issue and insists that there should be resolution within five business days.  Five days pass and the developers have made no progress.  Engineering rushes to fix the issue through a refactor but skips unit testing to release earlier.  Two new bugs slip into production.  Instead of working on the next sprint the development team now works to get a hotfix release out.  One unexpected event can change everything.

The separation of the business side from its engineering counterpart sets the stage for frustration and missed opportunities.  Any tool that can bridge the gap between these two groups offers immense value to an organization or a working relationship.  Cucumber, a Behavior-Driven Development test tool, empowers non-technical stakeholders to define requirements, in plain English, that double as executable test guidelines for engineers.  By regularly reviewing Cucumber test results, a business stakeholder could easily assess the current status of a project.  Nevertheless, Cucumber facilitates one-way communication and offers no clear guidance on iteration or state.

The daily iterations I had at HBO were extremely effective in moving the project forward quickly. In a perfect world we could repeat this process as often as possible with senior management to win their approval as early as possible.  What if we schedule a daily meeting with the entire senior management team and show up with a buggy build?  It would quickly become apparent that our good intentions are far less valuable than executive time.  Instead, what if we gave each one of the senior managers a version of the technology that we could push updates to over time?  What if I could focus on building features and my boss could choose and deploy the ones that he thought were ready?  What if after realizing that there was a problem with a deployed feature he could hide that feature without involving me?  LaunchDarkly does all of this at the enterprise level.

To me LaunchDarkly is about much more than feature flagging or even quality assurance; the platform empowers companies to reconnect the technical and non-technical departments in order to shorten feedback cycles with customers and make better decisions.

This mission is a game changer.

That its founders and team are all extraordinary yet humble makes the opportunity twice as appealing.