SMEP: What is It and How to Beat It on Linux
On May 16, 2011, Fenghua Yu submitted a series of patches to the upstream Linux kernel implementing support for a new Intel CPU feature: Supervisor Mode Execution Protection (SMEP). This feature is enabled by toggling a bit in the cr4 register, and the result is the CPU will generate a fault whenever ring0 attempts to execute code from a page marked with the user bit.
First, some background on why this feature is useful. Like most mainstream operating systems, the vanilla Linux kernel does not leverage x86 segmentation, instead defining flat segment descriptors with limits encompassing the entire 4gb address space. Additionally, each process has the kernel’s page table entries replicated, resulting in the kernel address space being mapped in the upper 1gb of every user process. Both of these decisions are for performance reasons: reloading segment selectors at every trap and kernel-to-user (or vice versa) copy operation introduces a non-negligible (but not necessarily unacceptable) performance hit, and having completely separate user and kernel address spaces would necessitate a TLB flush on every trap, which is even more expensive.
The result of this is that the kernel is free to incorrectly access data residing in userspace, as well as execute code in the user region. In addition to enabling the exploitation of many bugs that rely on the kernel incorrectly using user data, this allows kernel exploits to simply map a suitable payload in userspace and divert kernel execution to that payload.
The PaX project solves this problem in a general way with a feature called PAX_UDEREF. When this feature is enabled, PaX leverages segmentation to isolate user and kernel addresses, such that a fault will be generated when the kernel incorrectly accesses user data or code. Unfortunately, due to the performance hit associated with reloading segment registers and the fact that this touches mission-critical code, it’s unlikely that this solution would be accepted into the upstream Linux kernel.