Third-party application programming interfaces (APIs), libraries, and frameworks are a fact of life for modern software developers. They are usually complex, rapidly evolving, and sometimes poorly documented. According to industry estimates, open-source components can represent up to 90% of the code in the average application. Of particular concern, API usage errors are a common source of security and reliability vulnerabilities and are often difficult to detect.
To understand what a proper API call looks like, it’s necessary to understand the most common usage patterns. Traditional static analysis techniques usually don’t see enough code to determine what error-free usage looks like. However, analyzing the vast amount of code available in the open-source community provides the quantity of examples needed.
Big Code Analysis
SWAP Detector, an open-source static analysis tool recently released by GrammaTech, applies Big Data techniques to source code, an approach we call “Big Code” analysis. It mines the Fedora RPM open-source repository to establish a baseline of correct API usage. This allowed us to develop error-detection capabilities that exceed the scalability and accuracy of conventional approaches to program analysis.
SWAP Detector enables developers and DevOps teams to identify errors caused by swapped function arguments, including errors that may already be present in deployed code. SWAP Detector takes as input information about a call site and, optionally, information about the declaration of the function being called. If it detects a potential swapped-argument error at that call site, it outputs an appropriate warning message and a score for the warning.
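To make the error class concrete, here is a minimal sketch of a swapped-argument bug. The function `copy_region` and its parameters are hypothetical and chosen only for illustration; they are not taken from SWAP Detector or from any real library.

```python
def copy_region(src, dst, length):
    """Copy the first `length` items of `src` into the start of `dst`."""
    dst[:length] = src[:length]


# Intended call: each argument lines up with the parameter of the same role.
source = [1, 2, 3]
dest = [0, 0, 0]
copy_region(source, dest, 3)

# Swapped call: both arguments are lists, so nothing fails at run time,
# but the data flows in the wrong direction and the source is overwritten.
source2 = [1, 2, 3]
dest2 = [0, 0, 0]
copy_region(dest2, source2, 3)
```

Because both calls are well typed and run without error, this kind of mistake tends to slip past compilers and many conventional checks, which is why a dedicated detector is useful.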
The SWAP Detector interface integrates with a variety of static analysis tools: open-source tools such as Clang Static Analyzer, Clang-Tidy, and PyLint, and commercial tools such as CodeSonar. Although initially focused on C/C++ programs, SWAP Detector is applicable to programs in other languages. An example of a SWAP Detector warning in CodeSonar is shown below:
A SWAP Detector error imported into CodeSonar
Reducing False Positives
SWAP Detector layers multiple error-detection techniques to increase accuracy. First, it compares the argument names used at a call site with the parameter names used in the corresponding declaration. Second, it uses “Big Code” techniques: it collects statistics on “known good” API-usage patterns from a large corpus of code and flags usages that are statistically anomalous as potential errors. To improve the precision of the reported warnings, SWAP Detector applies false-positive reduction strategies to the output of both techniques, leading to a good yield of real errors.
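The two layers described above can be sketched roughly as follows. This is an illustration under stated assumptions, not SWAP Detector's actual algorithm: the function names, the `margin` and `ratio` thresholds, and the use of Python's `difflib.SequenceMatcher` as the name-similarity measure are all choices made for this example.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_swapped_pairs(arg_names, param_names, margin=0.3):
    """Name-based layer: flag argument pairs that match the declared
    parameter names better when swapped than when left in place."""
    warnings = []
    count = min(len(arg_names), len(param_names))
    for i, j in combinations(range(count), 2):
        in_place = (similarity(arg_names[i], param_names[i])
                    + similarity(arg_names[j], param_names[j]))
        swapped = (similarity(arg_names[i], param_names[j])
                   + similarity(arg_names[j], param_names[i]))
        score = (swapped - in_place) / 2  # hypothetical scoring rule
        if score >= margin:
            warnings.append((i, j, round(score, 2)))
    return warnings


def position_counts(corpus_calls):
    """Statistical layer, step 1: count how often each argument name
    appears at each position across known-good calls to one function."""
    counts = Counter()
    for arg_names in corpus_calls:
        for pos, name in enumerate(arg_names):
            counts[(name, pos)] += 1
    return counts


def is_anomalous(arg_names, i, j, counts, ratio=5):
    """Statistical layer, step 2: flag positions i and j when the corpus
    sees these names far more often in the swapped positions."""
    as_written = counts[(arg_names[i], i)] + counts[(arg_names[j], j)]
    if_swapped = counts[(arg_names[i], j)] + counts[(arg_names[j], i)]
    return if_swapped >= ratio * max(as_written, 1)


# Hypothetical declaration copy(src, dst, len), called as copy(dst, src, n).
suspicious = find_swapped_pairs(["dst", "src", "n"], ["src", "dst", "len"])
clean = find_swapped_pairs(["src", "dst", "n"], ["src", "dst", "len"])

# Hypothetical corpus of ten known-good calls to the same function.
counts = position_counts([["src", "dst", "n"]] * 10)
flagged = is_anomalous(["dst", "src", "n"], 0, 1, counts)
```

In this sketch the name-based layer reports the pair at positions 0 and 1 for the swapped call and nothing for the correct one, and the corpus-based check agrees. A real tool also has to cope with arguments that are expressions rather than plain names, and with declarations that are unavailable, which is where the statistical layer and the false-positive filtering matter most.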
SWAP Detector is available on GitHub under the MIT License. For more detail on the underlying research, see our paper, “Out of Sight, Out of Place: Detecting and Assessing Swapped Arguments,” published at the IEEE SCAM conference.
SWAP Detector was developed based on research sponsored by the U.S. Department of Homeland Security (DHS) Science & Technology Directorate (contract numbers HHSP233201600062C, 70RSAT19C00000056). The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DHS.