Software Metric Analysis of Open-Source Business Software

Abstract

Over the past decade, the use of open-source software has grown. Today, many companies, including Google, Microsoft, Meta, Red Hat, MongoDB, and Apache, are major contributors to open-source projects. As open-source software is increasingly used on its own or integrated into custom-developed software, the quality of these components grows in importance. This study examined a sample of open-source applications from GitHub. Static software analysis was conducted, and each application was classified by risk level. Of the analyzed applications, 90% were classified as low risk or moderate low risk, indicating a high level of quality for open-source applications.

Share and Cite:

Butler, C. (2023) Software Metric Analysis of Open-Source Business Software. Journal of Software Engineering and Applications, 16, 144-153. doi: 10.4236/jsea.2023.165008.

1. Introduction and Objective

If software is open-source, its source code is freely available to its users. Users can take this source code, modify it, and distribute their own versions of the program, and they can distribute as many copies of the original program as they want. Anyone can use the software for any purpose; there are no licensing fees or other restrictions. The average share of open-source software in the codebases of proprietary applications has grown from 36% to 57% [1]. A large number of applications now contain more open-source code than proprietary code [2]. Since open-source software is growing in popularity, its quality characteristics are critical to companies. According to the security firm Synopsys, 84% of codebases contain at least one open-source vulnerability, and 48% contain high-risk vulnerabilities, which have known exploits or are classified as allowing remote code execution [3].

Open-source software presents a number of key challenges, including controlling long-term maintenance, managing evolution costs, and ensuring acceptable levels of quality. This research was an exploratory study of open-source software quality. The study objective sought to answer the question, “How risky is open-source software?” The use of open-source software does not pose risks that are fundamentally different from those presented by proprietary or self-developed software. However, acquiring and using open-source software necessitates unique risk management practices.

2. Methodology

In order to study open-source software, forty open-source applications (see Table 5) were downloaded from GitHub. The applications span enterprise resource planning, accounting, scheduling, and gaming. The sample contained applications written in Java, C++, and C, totaling over four million lines of code. The applications were parsed, and static analysis was conducted using McCabe IQ. McCabe software security, quality, testing, release, and configuration management solutions, including McCabe IQ, have been used to analyze the security, quality, and testing of business-critical software [4]. For each application:

1) A profile was built with software metrics for the application, risk component, outlier component, and extreme complex component.

2) Software analytics were applied to generate descriptive statistical modeling.

3) A risk scorecard using supervised induction was used for data mining to identify a risk classification for each application.
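The profiling step above can be sketched in a few lines of Python. This is an illustrative reimplementation, not McCabe IQ itself; the function name and the assumption that module-level v(g) values are already available are hypothetical:

```python
from statistics import mean, stdev

def build_profile(vg_values):
    """Partition module-level v(g) values into the study's components.

    Illustrative sketch only: McCabe IQ computes these profiles directly
    from parsed source; here we assume v(g) per module is already known.
    """
    mu, sigma = mean(vg_values), stdev(vg_values)
    return {
        "application": vg_values,
        "risk": [v for v in vg_values if v > 10],                 # v(g) > 10
        "outlier": [v for v in vg_values if v > mu + 3 * sigma],  # 3-sigma rule
        "extreme": [v for v in vg_values if v > 50],              # untestable code
    }

# Hypothetical module complexities for a small application
profile = build_profile([1, 2, 2, 3, 4, 12, 18, 55])
```

Descriptive statistics (step 2) and the risk scorecard (step 3) would then be computed over each of these components.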

Table 1 contains the family of McCabe software metrics that were used to conduct risk analysis. McCabe’s cyclomatic complexity, v(g), is a traditional software metric associated with software quality, risk, and testability. For the risk component, a software module with v(g) > 10 is considered too complex and unreliable [5]. Essential complexity, ev(g), is another traditional software metric associated with software quality. It measures the structuredness of software code, that is, how well the module’s logic conforms to the single-entry, single-exit constructs of structured programming. In McCabe IQ, ev(g) > 3 is considered too complex and less maintainable [5]. Table 1 also describes two additional sets of McCabe metrics. There are three design metrics: S0, S1, and iv(g). Design complexity, S0, measures the size or magnitude of a design. A larger design is considered complex and implies high levels of integration in the design. Those integration levels are measured by integration complexity, S1, and module design complexity, iv(g). Module design complexity measures low-level integration, which is the integration test requirement between a superordinate module and its called immediate subordinates. Integration complexity measures the test requirement of a superordinate module and its multi-level, subtree subordinates. When integration testing requirements increase, the risk associated with the design grows [6].

Table 1. McCabe software metrics.

Table 1 also includes an extended set of McCabe data metrics: local, public global, and parameter data complexities. These software metrics represent extensions of cyclomatic complexity into a module’s data component. Local data complexity, ldv(g), measures the use of local data. High use of local data is a positive design concept and indicates less risk for a module since the data is not shared with other modules. Public global data complexity, pgdv(g), measures the use of global data. In contrast to local data, high use of public global data is a negative design concept since the data is shared among numerous other modules. Parameter data complexity, pdv(g), is the third data software metric. While parameter data complexity is a type of global data use, it is less risky due to the explicit nature of its use. Parameter data is explicitly stated in a call between a superordinate module and its subordinate modules. This approach is less risky since it is known which subordinate modules use the parameter data. In general, as the magnitude of McCabe metrics increases, the quality of the software module decreases, and the risk associated with the software module increases. The exceptions to this pattern are ldv(g) and pdv(g). As the use of data is restricted to a module or its use is explicitly stated, the quality of the software module increases, and the risk associated with the software module decreases.
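To ground these definitions, cyclomatic complexity can be approximated by counting decision points in a module. The snippet below is a deliberately simplified proxy (a real analyzer such as McCabe IQ derives v(g) from the control-flow graph rather than from keyword matching):

```python
import re

# v(g) ~= number of decision points + 1 (rough keyword-counting proxy;
# the graph-based definition is v(g) = e - n + 2 over edges and nodes).
DECISION_PATTERN = r"\b(?:if|for|while|case|catch)\b|&&|\|\|"

def approx_cyclomatic(source: str) -> int:
    """Approximate v(g) for a C-like source fragment."""
    return len(re.findall(DECISION_PATTERN, source)) + 1

c_function = """
int clamp(int x, int lo, int hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}
"""
```

Two `if` statements yield an approximate v(g) of 3, well under the v(g) > 10 risk threshold.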

For the outlier component, the study utilized Deming’s statistical quality control concepts. In addition to McCabe’s criterion, v(g) > 10, the outlier component utilized 3σ to determine outliers in the sample. These modules would be subject to Deming’s plan-do-study-act cycle [7]. To reduce the risk of complex modules, a software engineer plans to find high-risk modules (plan), identifies them (do), studies their logic (study), and refactors them (act) to reduce decision logic complexity and risk. The basis for the extreme complex component is McCabe’s categorization of cyclomatic complexity presented to the Department of Homeland Security [8]:

· 1 - 10: simple procedure, little risk

· 11 - 20: more complex, moderate risk

· 21 - 50: complex, high risk

· >50: untestable code, very high risk
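These bands translate directly into a lookup. A minimal sketch (the function name is illustrative):

```python
def risk_category(vg: int) -> str:
    """Map cyclomatic complexity to McCabe's risk bands listed above."""
    if vg <= 10:
        return "simple procedure, little risk"
    if vg <= 20:
        return "more complex, moderate risk"
    if vg <= 50:
        return "complex, high risk"
    return "untestable code, very high risk"
```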

3. Findings

The first task in this study was to parse the applications and conduct static metric analysis using McCabe IQ. Forty applications were downloaded from GitHub for this task. For each application, a profile was built with software metrics for the total application and for the risk (v(g) > 10), outlier (v(g) > 3σ), and extreme (v(g) > 50) components. Table 2 contains a profile for an application named Git. Git is written in the C language and contains 193,694 lines of executable code. It is a version control application.

Table 2. Git profile.

Review the software metrics for Git in Table 2. The column labeled Profile (accented in green) represents the total application; the column labeled Risk (accented in orange) represents the portion of modules that are high risk with v(g) > 10; the column labeled Outlier (accented in blue) represents the portion of modules that are considered statistical outliers, with a v(g) more than 3σ above the application’s average v(g). There are 639 modules, or 9% of the application, in the risk profile (v(g) > 10). There are 123 modules in the outlier profile, or 1.7% of the application. An examination of the other McCabe metrics reveals how the risk of the Git application is assessed. The average v(g) for the application is 2.2, which is below the threshold of poor-quality software, v(g) > 10. However, in the risk and outlier profiles, the average v(g) increases to 25.6 and 51.6, respectively. These components of Git exhibit low quality and high risk for testability. Examining the ev(g) measurements, Git’s application profile shows a low magnitude of 3.2. In the risk and outlier profiles, the average ev(g) increases to 13.0 and 24.4, respectively. These measurements of coding quality reveal low quality and high risk. A similar pattern is observed for the remaining McCabe metrics: module design, local data, public global data, and parameter data complexities significantly increase in magnitude (low quality) in the risk and outlier profiles.

Regarding this pattern, the question to be explored is “How can this pattern be used to represent quality and a risk score for an application?” To do so, a classification algorithm, or supervised induction, was applied for data mining to build a risk scorecard for each application’s risk component (v(g) > 10). Table 3 contains the risk classification criteria applied to the sample applications. Risk classification utilizes a descriptive business analytics approach. The intent is to know what is happening with an application and to understand the underlying trends and causes of its quality level and risks.

Quartiles and the interquartile range help identify spread within a subset of data. A quartile is a quarter of the data points in a data set. Quartiles are determined by first sorting the data and then splitting the sorted data into four disjoint smaller data sets. In Table 3, risk classification criteria are assigned by quartile, representing low risk, moderate low risk, moderate high risk, and high risk classifications. The first attribute used for risk classification is %nrisk. This attribute is the percentage of the application that falls into the risk range (v(g) > 10). The higher the percentage, the greater the portion of the application that falls within the risk domain, and the poorer the quality and higher the risk.

Table 3. Risk classification criteria.

The second attribute (the second row in the table) used for risk classification is µvrisk. Higher v(g) values are associated with poorer testability and higher risk. µvrisk is divided into equal groups of 10 units each, with the last group being open-ended. The third row in the table specifies how the remaining McCabe metrics map to quality and risk. Transformations of S0, S1, ev(g), iv(g), ldv(g), pgdv(g), and pdv(g) are defined so that equal quartile separation is achieved for groups between 0 and 100%. In general, the values of these attributes imply the following:

· %S0: When %S0 increases, quality decreases and risk increases.

· %S1: When %S1 increases, quality decreases and risk increases.

· ev density: When ev/v increases, quality decreases and risk increases.

· iv density: When iv/v increases, quality decreases and risk increases.

· ldv density: When ldv/v increases, quality increases and risk decreases.

· pgdv density: When pgdv/v increases, quality decreases and risk increases.

· pdv density: When pdv/v increases, quality increases and risk decreases.
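The quartile-based assignment described above can be sketched as follows, assuming each application has already been reduced to a single normalized risk attribute (the function name is illustrative; the labels mirror Table 3’s four classes):

```python
from statistics import quantiles

RISK_LABELS = ["low risk", "moderate low risk",
               "moderate high risk", "high risk"]

def classify_by_quartile(scores):
    """Label each score by the quartile of the sample it falls into."""
    q1, q2, q3 = quantiles(scores, n=4)  # the three quartile cut points

    def label(s):
        if s <= q1:
            return RISK_LABELS[0]
        if s <= q2:
            return RISK_LABELS[1]
        if s <= q3:
            return RISK_LABELS[2]
        return RISK_LABELS[3]

    return [label(s) for s in scores]

labels = classify_by_quartile([10, 20, 30, 40, 50, 60, 70, 80])
```

With this evenly spread sample, each class receives a quarter of the applications; skewed real-world scores would populate the classes unevenly.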

The final table row defines the risk scorecard quantity for risk classification using nine attributes. Table 4 illustrates the risk classification algorithm applied to Git.

Note that in the table, ldv density and pdv density are subtracted from one. This transformation is applied so that the magnitude of these attributes aligns with the other attributes: after the transformation, the higher the value, the poorer the quality and the higher the risk. When any of the attributes increases, the quality of the application declines. Also note that avg v is weighted twice. Weighting v reinforces the published research finding that v(g) > 10 is associated with low quality and poor testability of software.
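The exact aggregation formula is not given in this excerpt, so the sketch below makes an explicit assumption: the scorecard is the mean of the nine normalized attributes, with avg v counted twice and the two protective densities inverted, as the text describes. All attribute names and the example values are illustrative:

```python
def risk_score(attrs: dict) -> float:
    """Aggregate the nine scorecard attributes into one risk quantity.

    Assumption: simple mean of normalized (0-1) attributes; mu_v_risk is
    weighted twice, and the ldv/pdv densities are inverted so that a
    higher value always means poorer quality and higher risk.
    """
    terms = [
        attrs["pct_nrisk"],
        attrs["mu_v_risk"], attrs["mu_v_risk"],  # avg v weighted twice
        attrs["pct_S0"],
        attrs["pct_S1"],
        attrs["ev_density"],
        attrs["iv_density"],
        1 - attrs["ldv_density"],   # local data use lowers risk
        attrs["pgdv_density"],
        1 - attrs["pdv_density"],   # parameter data use lowers risk
    ]
    return sum(terms) / len(terms)

# Hypothetical normalized attribute values for one application
example = {"pct_nrisk": 0.2, "mu_v_risk": 0.4, "pct_S0": 0.1, "pct_S1": 0.1,
           "ev_density": 0.3, "iv_density": 0.5, "ldv_density": 0.8,
           "pgdv_density": 0.2, "pdv_density": 0.6}
score = risk_score(example)
```

The resulting quantity would then be mapped to one of the four quartile-based risk classes.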

Table 4. Risk classification example.

4. Analysis

The above risk classification algorithm was applied to the sample applications, and Table 5 contains the results. Four of the applications are classified as moderate high risk, twenty-eight as moderate low risk, and eight as low risk. No applications are classified as high risk. Of the eight applications classified as low risk, seven did not contain a “risk profile,” meaning there are no modules with v(g) > 10 in these applications. If an application has no “risk profile,” it follows that it has no “Outlier” or “v > 50” profiles. These applications are not included in Table 5.

Table 5. Open-Source application risk classification.

5. Conclusions

This exploratory study sought to assess the quality of open-source software. Forty applications were downloaded, and static software metric analysis was conducted on each application. Using McCabe software metrics, a risk scorecard algorithm was applied to assign a risk classification to each software application. In summary, the risk classification distribution for the sample is as follows:

· High risk: 0%

· Moderate high risk: 10%

· Moderate low risk: 70%

· Low risk: 20%

In conclusion, the study showed that open-source software exhibits a high degree of quality: 90% of the applications are classified as moderate low risk or low risk. The implication is that the use of open-source software in commercial business software is a compelling, high-quality approach for the commercial development and deployment of applications.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Gehman, C. (2019) How to Use Open Source Code in Proprietary Software.
https://www.perforce.com/blog/vcs/using-open-source-code-in-proprietary-software
[2] Zorz, Z. (2018) The Percentage of Open-Source Code in Proprietary Apps Is Rising. Slashdot.
https://news.slashdot.org/story/18/05/22/1727216/the-percentage-of-open-source-code-in-proprietary-apps-is-rising
[3] McKay, T. (2023) Open-Source Vulnerabilities Wide Spread in Codebases, Report Finds, IT Brew.
https://www.itbrew.com/stories/2023/03/20/open-source-vulnerabilities-widespread-in-codebases-report-finds
[4] McCabe Software (2023)
http://mccabe.com/
[5] McCabe, T.J. (1976) A Complexity Measure. IEEE Transactions on Software Engineering, SE-2, 308-320.
https://doi.org/10.1109/TSE.1976.233837
[6] McCabe, T.J. and Butler, C.W. (1989) Design Complexity Measurement and Testing. Communications of the ACM, 32, 1415-1425.
https://doi.org/10.1145/76380.76382
[7] Henshall, A. (2020) How to Use the Deming Cycle for Continuous Quality Improvement.
https://www.process.st/deming-cycle/
[8] Wikipedia (2023) Cyclomatic Complexity.
https://en.wikipedia.org/wiki/Cyclomatic_complexity

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.