Shipping a minimum viable product is a great way for businesses to innovate quickly and get to market fast. Startups are focused on developing their products quickly and understandably don’t want to slow down to address technical debt. At the same time, building on poorly written code is like building on a Jenga® tower: the longer you leave it alone, the closer the whole thing comes to collapse.
Making the Case For Paying Back Technical Debt
Paying back technical debt can be a difficult task to get to when the business needs new features to feed the pipeline. But technical debt isn’t all or nothing. You can account for it right after shipping a minimum viable product, not just when the tower has become wobbly. I make sure that a percentage of all the work we do at Smartsheet is spent on modernizing and maintaining the platform. We have a product prioritization framework that lets us allocate developers, QA, and operations folks at a fixed rate based on the work that has to be done.
Business support for this type of allocation isn’t the norm. For many engineering teams, the biggest challenge with technical debt is helping the rest of the business understand the value of paying it down. As a longtime advocate for reducing technical debt, I have tried a number of different strategies to articulate the value of writing code to reduce technical debt and keep my teams happy. In general, I like to look at three dimensions to make my case: scale, developer productivity (happiness), and reliability.
Scale
When building a content delivery network (CDN), the availability of capacity to scale the system is critical. If you add a new feature and it slows each server by 1% in a fleet of 8,000 servers, that new feature costs you 80 servers’ worth of capacity, a huge cost for an Infrastructure as a Service (IaaS) business. If your product is successful, scale always hits hardest. However, if you’ve invested in telemetry, it may also be the easiest to justify. The ROI on building to scale can be very clear, but an investment in measurement is critical.
When I was on the engineering team at Microsoft building an internal CDN, we built a testing framework for measuring the capacity of a release with synthetic traffic. This let us know if a newly released feature caused a regression in capacity or scale. It also let us gauge the impact of code that hurt our scale, so we knew when we needed to stop shipping and address the debt. This was a case where the ROI on building to scale was very clear and supported by the business.
Developer Productivity (Happiness)
When a feature is first built, we have ideas about what that feature will do. About a month later, someone has a brilliant idea to extend the feature. These extensions happen over and over, and after a while the logic and structure of the code for that feature are a mess. Just getting started on altering or fixing the feature can take a developer several hours of reading through code to understand what they need to do.
When your devs are getting started on a new feature, have them take the time to comment wherever they are taking a shortcut or making a mess in the code and infrastructure, so that when your team has the capacity, they’ll know where to fix it. If you don’t, you’ll have to declare bankruptcy and refactor the whole mess. Also, keep in mind that there was a reason you built the feature as a minimum viable product when you started.
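One way to make those shortcut comments easy to find later is to agree on a marker and scan for it. The following is only a minimal sketch: the `DEBT:` marker, the `find_debt_markers` function, and the sample snippet are hypothetical conventions invented for illustration, not something Smartsheet is known to use.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical convention: a "#" or "//" comment containing "DEBT:"
# flags a known shortcut that should be revisited.
DEBT_MARKER = re.compile(r"(?:#|//)\s*DEBT:\s*(.+)")


def find_debt_markers(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, note) for every line carrying a DEBT: comment."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = DEBT_MARKER.search(line)
        if match:
            found.append((number, match.group(1).strip()))
    return found


# Illustrative source lines; they are only scanned as text, never executed.
sample = [
    "def send_notification(user):",
    "    # DEBT: retries are hard-coded; move to config before GA",
    "    return deliver(user, retries=3)",
]

for line_number, note in find_debt_markers(sample):
    print(f"line {line_number}: {note}")
```

Run over a codebase on a schedule, a scan like this gives the team a running list of known shortcuts to weigh against new feature work.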
It’s easy to try to take on everything you’ve ever seen in the business and solve it with feature extensions and enhancements. It can be helpful to go back to your original goals and find a way to balance new features and enhancements with technical debt repayment, so your devs don’t have to spend so much time getting up to speed on a project.
Reliability
Technical debt will create bugs when you add new features, causing issues when you don’t expect them. There are a number of ways teams try to get time to fix code in this case, but they all have a cost. “Bug days” and “bug bashes” are quite popular; however, sometimes a day or a week isn’t enough time. At some point, consider taking your whole team offline for a few weeks to a month and clearing as much of the backlog as you can. You can also dedicate a rotating team, on which every developer takes a turn, to continuously pay down your bug backlog. It is a clear cost that you can justify, and it will get you the headroom you need to scale.
Our “Repayment Plan”
Smartsheet’s been around since 2005, so we’re facing twelve years of legacy technical debt. That doesn’t stop us from taking on technical debt to quickly ship a minimum viable product. Recently, our team took on technical debt to ship our Notification Center early, so we had more time to test with users in our early adopter program. We followed up with about a month of work to clean up the code and make it sustainable. This is just one example of how we are constantly working to negotiate priorities and make sure that we can ship quickly and future-proof new experiences to make our platform great for our customers.
Source: Smartsheet Blog