Increasing product and engineering quality using Faire's weighted issue debt framework

This post was originally published on Medium by Jeff Hodnett.

Background

As organizations grow, so does the need for mechanisms that help ensure product quality and reduce technical debt. Rhythms that worked in a smaller team naturally break during a time of hypergrowth. As we scaled at Faire, prioritizing technical debt reduction became increasingly difficult, and we didn't have solid measurements for accumulated debt. One of our core values at Faire is to seek the truth, and it was one of the driving motivators behind creating our own scalable taxonomy and framework for quality, called Weighted Issue Debt.

Weighted Issue Debt: What Does This Solve?

Introducing Weighted Issue Debt improved four things:

  • Visibility: we now have a high-level metric to quickly understand overall product health, rather than asking how many P0s, P1s, P2s, and so on each team has and how many are past SLA.
  • Altitude: this metric can be applied at any altitude across the entire engineering organization and can act as a shared lens for effective conversations.
  • Accountability: building on the benefits of visibility and altitude, we can now tie business impact to our backlog. This provides a great way to hold ourselves accountable, for example via OKRs, and to understand holistically whether we are on track.
  • Staffing: it lets leaders evaluate the cross-functional staffing needed to hit goals, and it gives engineers the space and encouragement to resolve issues and improve quality.

Existing Quality Framework

Before the new framework, we used a traditional bug priority matrix with service-level agreements (SLAs), but we lacked a way to tie them together cohesively. This was clear organizationally: as you ladder upwards across product teams and business units, the view of overall quality becomes diluted. We also had limited granularity, with a single catchall quality category, the Jira Bug, which covered a broad range of issue types such as security issues, post-mortem corrective actions, and customer escalations.

Existing quality prioritization framework

Signs Of Quality Challenges

We started to see some early signs of quality challenges during significant engineering organizational growth:

  • There were system outages during key large-scale customer events, such as Faire Summer Market, that did not meet the high bar we set out to achieve.
  • Digging into defect data, we saw that around 50% of P2s missed SLA and 80% of the open defects had been open for 30+ days.
  • Open post-mortem corrective actions grew from 9 to 130 in one year, and there was a lack of urgency to address them.
  • Our backlog growth began to exceed our engineering headcount growth, and our outstanding issues per engineer increased from 4.5 to 6.

Updated Quality Framework

As we welcomed a new wave of company growth, it was clear we needed to build a new framework for measuring quality and driving organization-wide alignment. We collected feedback from the various stakeholders involved in quality initiatives across the organization and crafted a new framework.

We proposed the following changes:

  1. Remove P4 as a priority: The additional low-priority level was confusing and had essentially become a graveyard of tasks with no incentive to resolve them. It also confused stakeholders, who didn't realize that P4s had a very slow time to resolution because of their low priority. We decided to simplify the process by removing this priority level.
  2. Add SLAs to all issue priorities: To increase urgency across the entire backlog, we added SLAs to every issue priority. In the past, we did not have SLAs associated with P3s, P4s, or unprioritized issues, resulting in less urgency for lower priority issues. We already post issues that are past SLA to the owning team's Slack channel, so this change extends that visibility to all issue priorities.
  3. Establish a new Weighted Issue Debt scoring framework: This framework turns a set of issue categories into a single score. The categories included in this scoring are outlined in the next section, but it's essentially a bag of Jira categories and a calculation based on SLA. This scoring allowed us to ladder up values from the team level to a sum at the business unit level.
  4. Create a Weighted Issue Debt goal for all of Engineering: We added a high-level north star that lives in our Engineering-wide OKR framework. Each team in the business unit can allocate OKR goals accordingly that should ladder up organizationally.

Weighted Issue Debt Framework

We needed to assign a value to each issue based on its priority, with separate scores for issues within and outside of SLA. We wanted to incentivize addressing issues within SLA, as well as addressing higher-priority issues first, so we skewed the weighting accordingly.

Here's how we set the scoring breakdown:

Weighted Issue Debt Scoring Framework

To get the Weighted Issue Debt score, we count the issues at each priority, multiply each count by the weight for that priority and SLA status (within or outside SLA), and sum the results.

Weighted Issue Debt Formula
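
Written out, the calculation described above could look like the following. This is a reconstruction from the prose, not a copy of the original figure, and the symbols are assumptions introduced here: $n_p^{\text{in}}$ and $n_p^{\text{out}}$ are the counts of open issues at priority $p$ that are within and outside SLA, and $w_p^{\text{in}}$ and $w_p^{\text{out}}$ are the corresponding weights from the scoring table.

```latex
\[
\text{Weighted Issue Debt}
  = \sum_{p \in \text{priorities}}
    \left( n_p^{\text{in}} \, w_p^{\text{in}} + n_p^{\text{out}} \, w_p^{\text{out}} \right),
\qquad w_p^{\text{out}} > w_p^{\text{in}}
\]
```

The inequality reflects the stated intent that an issue outside SLA always scores higher than the same issue within SLA.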

If an issue is not resolved within SLA, the outside-SLA score is applied, which causes the Weighted Issue Debt score to go up significantly. P0 issues are outside of SLA the moment they are created, since severe issues must be resolved as quickly as possible. Mobile development is somewhat unique because we only release once a week, so for most mobile issues the SLA clock runs until the fix is merged, and the release to the Apple App Store and Google Play Store follows later.
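
As a rough illustration of these rules, here is a minimal Python sketch that classifies each issue as within or outside SLA and sums the weights. The SLA windows and weight values are placeholders invented for this example, not Faire's actual numbers, and the `Issue` class and function names are hypothetical. The mobile "SLA until merge" rule is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Placeholder SLA windows and weights, for illustration only.
# P0 has a zero-length window, so it is outside SLA as soon as it exists.
SLA_WINDOW = {
    "P0": timedelta(0),
    "P1": timedelta(days=2),
    "P2": timedelta(days=14),
    "P3": timedelta(days=30),
}
WEIGHT_WITHIN_SLA = {"P0": 0, "P1": 3, "P2": 2, "P3": 1}
WEIGHT_OUTSIDE_SLA = {"P0": 10, "P1": 6, "P2": 4, "P3": 2}


@dataclass
class Issue:
    priority: str      # "P0" .. "P3"
    created: datetime  # when the issue was opened


def is_outside_sla(issue: Issue, now: datetime) -> bool:
    """An open issue is outside SLA once its age reaches the SLA window."""
    return now - issue.created >= SLA_WINDOW[issue.priority]


def issue_score(issue: Issue, now: datetime) -> int:
    """Pick the within-SLA or outside-SLA weight for a single open issue."""
    weights = WEIGHT_OUTSIDE_SLA if is_outside_sla(issue, now) else WEIGHT_WITHIN_SLA
    return weights[issue.priority]


def weighted_issue_debt(issues: list[Issue], now: datetime) -> int:
    """Sum the per-issue scores for a list of open issues."""
    return sum(issue_score(issue, now) for issue in issues)


now = datetime(2022, 1, 31)
open_issues = [
    Issue("P0", now),                       # outside SLA immediately
    Issue("P2", now - timedelta(days=3)),   # still within SLA
    Issue("P2", now - timedelta(days=20)),  # past the 14-day placeholder window
]
print(weighted_issue_debt(open_issues, now))
```

Keeping the weights in plain lookup tables means the scoring policy can change without touching the calculation itself.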

Let's take an example of the Weighted Issue Debt score for a typical team:

An example breakdown of calculating a team's Weighted Issue Debt score

In this example, the team has a score of 8 within SLA and 22 outside of SLA, so it has a Weighted Issue Debt of 30. This team can now set a goal or create a dashboard to track progress on quality. The team can measure this on a rolling basis with daily or weekly measurements, and then aim to keep or adjust the goal for the quarter. Stepping up to the business unit level, the Weighted Issue Debt is the sum of the scores of all the teams inside it.
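
The roll-up from team to business unit can be sketched in a few lines of Python. The first team uses the 8 and 22 from the example above; the second team, the team names, and the `TeamScore` type are made up purely to show the summation.

```python
from dataclasses import dataclass


@dataclass
class TeamScore:
    name: str
    within_sla: int   # weighted score from issues still within SLA
    outside_sla: int  # weighted score from issues past SLA

    @property
    def weighted_issue_debt(self) -> int:
        # A team's debt is the sum of its within-SLA and outside-SLA scores.
        return self.within_sla + self.outside_sla


def business_unit_debt(teams: list[TeamScore]) -> int:
    """A business unit's debt is the sum of its teams' debts."""
    return sum(team.weighted_issue_debt for team in teams)


example_team = TeamScore("example-team", within_sla=8, outside_sla=22)
other_team = TeamScore("hypothetical-team", within_sla=5, outside_sla=7)

print(example_team.weighted_issue_debt)                # 30
print(business_unit_debt([example_team, other_team]))  # 42
```

Because the score is a plain sum, the same function works one level up: a list of business-unit totals can be summed to get an engineering-wide number.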

Quality Impact

We rolled out this new framework across the company in Q4 of 2021, starting with educational shareouts and dashboard tooling so that teams could easily track their progress. Additionally, we established a Weighted Issue Debt goal for 2022 to help with prioritizing quality efforts. As you can see from the graph below, we have reduced our Weighted Issue Debt by ~35% from the peak:

Weighted Issue Debt Dashboard for all teams in Engineering
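
For teams tracking the score on a rolling basis, a "reduction from peak" figure like the one quoted above can be derived from the series of measurements. The sketch below assumes a simple list of periodic scores; the numbers are invented to illustrate the arithmetic and are not Faire's dashboard data.

```python
def reduction_from_peak(scores: list[float]) -> float:
    """Fractional drop from the highest recorded score to the latest one."""
    if not scores:
        raise ValueError("need at least one measurement")
    peak = max(scores)
    if peak == 0:
        return 0.0  # no debt was ever recorded, so there is nothing to reduce
    return (peak - scores[-1]) / peak


# Invented weekly measurements: the score peaks at 100 and falls to 65.
weekly_scores = [80, 100, 90, 65]
print(f"{reduction_from_peak(weekly_scores):.0%}")  # 35%
```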

We can also see a dramatic increase in issues being resolved within SLA:

Issue SLA Compliance Dashboard for all teams in Engineering

Future State

We will keep iterating on the Weighted Issue Debt framework as we continue to enhance quality. One potential future application is to expand it by adding new categories of issues, such as product performance, and assigning weight multipliers to different categories.

I hope you found this post helpful and applicable to your own Engineering team's work on quality and the frameworks that support it.