Defect density is defined as the number of defects found in a specific time period divided by the overall size* of the component or project.
*Size is measured in either lines of code changed (LOCC) or function points (FP), depending on who you talk to. The example figures below work out if density is expressed as defects per 100 lines changed.
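To make the arithmetic concrete, here is a minimal sketch of the calculation. The function name `defect_density` and the `per=100` scaling are assumptions for illustration; the scaling is chosen only because it reproduces the example figures used in this post (10 defects over 1,000 lines changed gives 1).

```python
def defect_density(defects: int, size: int, per: int = 100) -> float:
    """Return defects per `per` units of size.

    `size` can be lines of code changed (LOCC) or function points (FP).
    The default of 100 is an assumption that matches the post's examples.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return defects * per / size


print(defect_density(10, 1_000))   # 1.0
print(defect_density(10, 10_000))  # 0.1
```

Many teams report defects per 1,000 lines instead; passing `per=1000` would give that figure.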
I have been on teams that use this calculation, but I often wonder just how valuable the metric really is. From what I can gather, these are its potential benefits:
- Allows comparison of software teams or components in order to determine quality
- Allows “high risk” teams/components to be identified (they have high defect density)
My Struggles
I am often wary any time the topic of metrics comes up. More often than not, I think that we have too many metrics and that we misread or misuse the ones we have. I am starting to feel this way about defect density, and below are some of my largest concerns.
- Concern #1: All defect density tells you is the number of defects, relative to size, during a given time-frame. It does not tell you when within that time-frame the defects were found. Let’s compare two projects.
Project A
Time-frame: 1 year
Lines of Code Changed: 1,000
Defects: 10
Defect Density: 1
Project B
Time-frame: 1 year
Lines of Code Changed: 1,000
Defects: 10
Defect Density: 1
On the surface, these two projects look identical. But let me add some additional information.
Project A
Defects Found in Months 1-3: 5
Defects Found in Months 4-6: 5
Defects Found in Months 7-9: 0
Defects Found in Months 10-12: 0
Project B
Defects Found in Months 1-3: 0
Defects Found in Months 4-6: 0
Defects Found in Months 7-9: 0
Defects Found in Months 10-12: 10
This vastly changes the story of these two projects. Obviously the example is a little contrived, but most people would agree that Project A is better off because it found its defects early and had none late in the project. Project B, on the other hand, found nothing early and found all of its defects near the end. I would argue that the important factor is not the number of defects found, but how soon they were found and resolved. Project A is most likely an iterative (agile) project that is incrementally creating and testing code, whereas Project B is most likely a traditional waterfall project with long development cycles followed by long testing cycles. If it were up to me, I would want defects found sooner rather than later.
Also see a previous post, “Simple Game to Demonstrate Agile Concepts: Test Small, Test Often (Jenga),” for the benefit of finding bugs early.
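One way to surface the timing that defect density hides is to look at the cumulative share of defects found by the end of each period. The sketch below uses the quarterly numbers from the two projects above; the function name `cumulative_share` is hypothetical, and this is just one possible view of the data, not a standard metric.

```python
from itertools import accumulate


def cumulative_share(defects_per_period):
    """Fraction of all defects found by the end of each period."""
    total = sum(defects_per_period)
    if total == 0:
        # No defects at all: report 0.0 for every period.
        return [0.0] * len(defects_per_period)
    return [found / total for found in accumulate(defects_per_period)]


# Defects found per quarter (months 1-3, 4-6, 7-9, 10-12).
project_a = [5, 5, 0, 0]
project_b = [0, 0, 0, 10]

print(cumulative_share(project_a))  # [0.5, 1.0, 1.0, 1.0]
print(cumulative_share(project_b))  # [0.0, 0.0, 0.0, 1.0]
```

Both projects have the same defect density, but Project A had found everything by mid-year while Project B had found nothing until the last quarter.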
- Concern #2: All defect density tells you is the number of defects during a given time-frame. It does not tell you the severity of those defects.
Building on the previous example:
Project A Defect Density: 1
Project B Defect Density: 1
Once again, they look the same. But here is some additional information:
Project A
# High Severity Defects: 0
# Medium Severity Defects: 2
# Low Severity Defects: 8
Project B
# High Severity Defects: 8
# Medium Severity Defects: 2
# Low Severity Defects: 0
Once again, this additional information tells a very different story. Even though the two projects had the same number of defects, the defects on Project A were of much lower severity than those on Project B.
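If severity matters, one option is to weight each defect by severity before dividing by size. The sketch below does that with the counts above and 1,000 lines changed for each project. The weights (10, 3, 1), the per-100-lines scaling, and the name `weighted_defect_density` are all assumptions made up for illustration, not an established standard.

```python
# Hypothetical weights; choose values that reflect your own cost of failure.
SEVERITY_WEIGHTS = {"high": 10, "medium": 3, "low": 1}


def weighted_defect_density(defects_by_severity, size, per=100,
                            weights=SEVERITY_WEIGHTS):
    """Severity-weighted defects per `per` units of size."""
    if size <= 0:
        raise ValueError("size must be positive")
    weighted = sum(weights[severity] * count
                   for severity, count in defects_by_severity.items())
    return weighted * per / size


project_a = {"high": 0, "medium": 2, "low": 8}
project_b = {"high": 8, "medium": 2, "low": 0}

print(weighted_defect_density(project_a, 1_000))  # 1.4
print(weighted_defect_density(project_b, 1_000))  # 8.6
```

The unweighted density is 1 for both projects, while the weighted figure separates them clearly. The result is only as meaningful as the weights chosen.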
- Concern #3: If the code that you are comparing spans languages, it can greatly skew the results.
For example, if one project is written in Java and another in C, the number of lines of code changed for the same functionality can be vastly different. The same functionality could be implemented in both languages with the same number of defects, yet the team whose implementation takes fewer lines (the C team in the example below) ends up with the worse defect density. The comparison gets even more difficult when teams use a variety of technologies.
Project A (Java)
Lines of Code Changed: 10,000
Defects: 10
Defect Density: 0.1
Project B (C)
Lines of Code Changed: 1,000
Defects: 10
Defect Density: 1
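The sketch below contrasts the two size measures mentioned at the top of this post. The lines-of-code figures come from the example above; the function point count of 50 for each project is a hypothetical number, chosen only to represent "the same functionality" in both languages.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    defects: int
    locc: int             # lines of code changed
    function_points: int  # hypothetical value, for illustration only

    def density_per_100_locc(self) -> float:
        return self.defects * 100 / self.locc

    def density_per_fp(self) -> float:
        return self.defects / self.function_points


java = Project("A (Java)", defects=10, locc=10_000, function_points=50)
c = Project("B (C)", defects=10, locc=1_000, function_points=50)

for project in (java, c):
    print(project.name,
          project.density_per_100_locc(),
          project.density_per_fp())
# A (Java) 0.1 0.2
# B (C) 1.0 0.2
```

Under these assumptions, the line-based density differs by a factor of ten while the function-point density is identical, which suggests the gap reflects the language rather than the quality of the work.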
Overall, I do not see a lot of value in using this metric on its own. It may be valuable alongside other metrics, or for tracking the same team or software over time, but anyone who plans to use it needs to proceed with caution and truly understand both what they are trying to measure and what they are actually measuring.
I would love to hear thoughts and comments from anyone who is using or has used this metric.