Stress-testing the Linux kernel
Automating software testing allows you to run the same tests over a period of time, ensuring that you are really comparing apples to apples and oranges to oranges. In this article, Linux Test Project team members share their methodology and rationale, as well as the scripts and tools they use to stress-test the Linux® kernel.
Testing the stability of a Linux kernel release requires a clear, documented statement of why the release is stable or unstable. Yet no documented and proven system-wide stress test currently exists that can test the stability of the Linux kernel in its entirety. This article provides a method for creating a system-wide Linux stress test and proving the legitimacy of the results. Different Linux developers, users, and distributions use their own methods for testing kernel stability. However, information about the basis for their decisions on which tests to run, the kernel code covered, and the stress levels attained is unpublished, which greatly reduces the value of the results.
Using lab machines and tests available for Linux from the Linux Test Project test suite, we developed a combination of tests, based on system resource utilization statistics, to adequately stress the system. We analyzed this combination test to determine which sections of the Linux kernel are exercised during test execution. Afterwards, we modified the combination test to achieve the highest percentage of code coverage while maintaining the desired high level of system stress. The final result is a stress test that covers enough of the Linux kernel to be useful for stability statements, and that has the system usage and kernel code coverage data to support it.
The four steps to this combination test method are: test selection, system resource utilization evaluation, kernel code coverage analysis, and final stress test evaluation.
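To make the second step more concrete, here is a minimal sketch of one way to measure CPU utilization while a candidate test runs. It is an illustration only, not the tooling used by the authors: it assumes a Linux system that exposes the aggregate `cpu` line in `/proc/stat`, and the function names (`parse_cpu_line`, `cpu_utilization`, `read_cpu_line`) are hypothetical. Memory, I/O, and network utilization would need their own data sources.

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line with at least 4 counters")
    # Use at most the first 8 counters:
    # user nice system idle iowait irq softirq steal.
    # Guest counters, when present, are already included in user and nice.
    values = [int(v) for v in fields[1:9]]
    # Treat both idle and iowait (if the kernel reports it) as non-busy time.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    total = sum(values)
    return total - idle, total


def cpu_utilization(before, after):
    """Percentage of CPU time spent busy between two 'cpu' line snapshots."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which holds the aggregate counters."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    # Sample twice, one second apart (interval chosen arbitrarily for the sketch).
    first = read_cpu_line()
    time.sleep(1)
    second = read_cpu_line()
    print(f"CPU busy: {cpu_utilization(first, second):.1f}%")
```

Sampling in this way before and during a test run gives a rough indication of whether a given test, or combination of tests, keeps the processor busy enough to count as stress.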
