Cyclomatic Complexity

The number of unique decision paths through a function.

Interpretation

The attribute that cyclomatic complexity is expected to quantify is, as the name suggests, complexity. A file that is complex to comprehend, change, and/or test is likely to have a high value for the cyclomatic complexity metric.

Evidence

Cyclomatic Complexity has been empirically-validated to be associated with historical vulnerabilities in software in the following peer-reviewed research studies:

  1. To Fear or Not to Fear That is the Question: Code Characteristics of a Vulnerable Function with an Existing Exploit [2]

The empirical evidence overwhelming supports the notion that a source code file with high cyclomatic complexity is more likely to contain a security vulnerability.

Implications

The security implication(s) of a file having high cyclomatic complexity could be one or more of the following:

  • Complex file is likely to be harder to comprehend, change, and/or test and, as a result, may increase the potential for latent vulnerabilities.

Mitigations

The theoretical mitigation to lowering the cyclomatic complexity of a file is to have source code files only include simpler constructs with limited use of statements that alter control flow. However, the theoretical mitigation is not practical because modern software is inherently complex. Therefore, the risk of latent vulnerabilities in a file with high cyclomatic complexity could be mitigated using one or more of the following suggestions:

  • Refactor functions in the file to simplify the constructs used leveraging common design patterns when appropriate.

Implementation

In our implementation of the metric, we use SciTools Understand™ to collect the cyclomatic complexity metric from functions. The metric is aggregated at the file level by computing the sum of the cyclomatic complexity of all functions in a file.

The source code of the implementation of the metric will be made available on GitHub. If you need to collect the metric from your project, the implementation will also be made available as a container image on Docker Hub.

Languages

The metric implementation is limited to projects written in C/C++, C#, Ada, Basic, Fortran, Java, Jovial, Pascal, PL/M, Python, VHDL, Cobol, Web.

Example(s)

In this section, we present examples of the metric collected from popular open-source software projects.

Chromium

In this subsection, we present examples of the metric collected from the Chromium, the open-source project behind the Google Chrome web browser.

The metric examples presented here were collected at 6b9bf768231f commit to the master branch of the Chromium source code repository.

Summary

Chromium Cyclomatic Complexity Distribution
Figure 1.1
Chromium Cyclomatic Complexity Discriminatory
Figure 1.2

Shown in Figure 1.1 is the distribution of the metric collected from source code files in the Chromium project. Shown in Figure 1.2 is the comparison of the distribution of the metric collected from source code files in the Chromium project that were not historically vulnerable and those that were.

Thresholds

The thresholds of the metric in the Chromium project determined using the approach prescribed by Alves et al. [1] is shown in the table below.

Metric Range value < 142 142 ≤ value < 266 266 ≤ value < 785 785 ≤ value
Risk Level Low Medium High Critical

Risky Files

The thresholds are used to classify source code files into appropriate risk levels. Shown below are the top and bottom three source code files from the Chromium project in each of the three non-trivial risk levels.

Path Cyclomatic Complexity Percentile
ui/base/models/tree_node_model.h 142 70.0727
third_party/blink/renderer/core/layout/ng/ng_length_utils.cc 142 70.0727
media/filters/video_decoder_stream_unittest.cc 142 70.0727
...
third_party/sqlite/sqlite-src-3280000/mptest/mptest.c 264 79.8662
third_party/sqlite/patched/mptest/mptest.c 264 79.8662
chrome/browser/apps/guest_view/web_view_interactive_browsertest.cc 265 79.9014

Path Cyclomatic Complexity Percentile
third_party/sqlite/patched/ext/misc/zipfile.c 266 80.0281
third_party/sqlite/sqlite-src-3280000/ext/misc/zipfile.c 266 80.0281
base/debug/activity_tracker.h 266 80.0281
...
third_party/sqlite/sqlite-src-3280000/ext/fts3/fts3_write.c 775 89.8071
third_party/sqlite/patched/ext/fts3/fts3_write.c 775 89.8071
content/browser/service_worker/service_worker_storage_unittest.cc 781 89.8675

Path Cyclomatic Complexity Percentile
third_party/sqlite/sqlite-src-3280000/src/build.c 785 90.0678
third_party/sqlite/patched/src/build.c 785 90.0678
chrome/browser/push_messaging/push_messaging_browsertest.cc 795 90.1219
...
third_party/wtl/include/atlwince.h 3,618 96.6227
third_party/libxml/src/testapi.c 4,348 97.6808
third_party/sqlite/amalgamation/sqlite3.c 16,500 100

OpenBSD

In this subsection, we present examples of the metric collected from the UNIX-like operating system developed by the OpenBSD project.

The metric examples presented here were collected at dbdab68da3b commit to the master branch of the OpenBSD source code repository.

Summary

OpenBSD Cyclomatic Complexity Distribution
Figure 2.1
OpenBSD Cyclomatic Complexity Discriminatory
Figure 2.2

Shown in Figure 2.1 is the distribution of the metric collected from source code files in the OpenBSD project. Shown in Figure 2.2 is the comparison of the distribution of the metric collected from source code files in the OpenBSD project that were not historically vulnerable and those that were.

Thresholds

The thresholds of the metric in the OpenBSD project determined using the approach prescribed by Alves et al. [1] is shown in the table below.

Metric Range value < 436 436 ≤ value < 736 736 ≤ value < 1,256 1,256 ≤ value
Risk Level Low Medium High Critical

Risky Files

The thresholds are used to classify source code files into appropriate risk levels. Shown below are the top and bottom three source code files from the OpenBSD project in each of the three non-trivial risk levels.

Path Cyclomatic Complexity Percentile
gnu/gcc/gcc/config/xtensa/xtensa.c 436 70.0688
gnu/llvm/tools/clang/lib/ARCMigrate/ObjCMT.cpp 436 70.0688
gnu/usr.bin/binutils-2.17/gas/config/tc-maxq.c 436 70.0688
...
gnu/usr.bin/perl/perlio.c 730 79.8944
gnu/usr.bin/gcc/gcc/varasm.c 731 79.9376
usr.sbin/unbound/util/config_file.c 735 79.9583

Path Cyclomatic Complexity Percentile
gnu/usr.bin/binutils/gdb/mdebugread.c 736 80.0582
gnu/usr.bin/binutils-2.17/opcodes/i386-dis.c 736 80.0582
gnu/usr.bin/binutils/bfd/xcofflink.c 739 80.1587
...
usr.sbin/bind/lib/dns/resolver.c 1,239 89.8243
gnu/gcc/libstdc++-v3/include/bits/locale_facets.h 1,250 89.8436
gnu/usr.bin/binutils/bfd/elf64-ppc.c 1,254 89.9305

Path Cyclomatic Complexity Percentile
gnu/usr.bin/binutils/bfd/elflink.c 1,256 90.0125
gnu/usr.bin/gcc/gcc/config/sh/sh.c 1,275 90.0909
gnu/usr.bin/binutils/bfd/elfxx-mips.c 1,277 90.2346
...
gnu/usr.bin/gcc/gcc/testsuite/gcc.dg/20020425-1.c 11,002 99.9899
gnu/usr.bin/gcc/gcc/testsuite/gcc.dg/c99-intconst-1.c 32,113 100
gnu/llvm/tools/clang/INPUTS/c99-intconst-1.c 32,113 100

Reference(s)

[1] Tiago L. Alves, Christiaan Ypma, and Joost Visser. 2010. Deriving Metric Thresholds From Benchmark Data. In Proceedings of the 26th International Conference on Software Maintenance (ICSM '10). 1-10. https://doi.org/10.1109/ICSM.2010.5609747

[2] Awad Younis, Yashwant Malaiya, Charles Anderson, and Indrajit Ray. 2016. To Fear or Not to Fear That is the Question: Code Characteristics of a Vulnerable Function with an Existing Exploit. In Proceedings of the 6th ACM Conference on Data and Application Security and Privacy (CODASPY '16). New York, NY, USA, 97–104. https://doi.org/10.1145/2857705.2857750