This is v3 of this series. Again, it took me a while to complete the updates as support for virtual-8086 mode required extra rework. The two previous submissions can be found here [1] and here [2].
User-Mode Instruction Prevention (UMIP) is a security feature present in new Intel processors. If enabled, it prevents the execution of certain instructions when the Current Privilege Level (CPL) is greater than 0. Without UMIP, user-space applications running at CPL > 0 can use these instructions to read system-wide settings such as the locations of the global and interrupt descriptor tables, the machine status word, and the segment selectors of the current task state segment and the local descriptor table.
These are the instructions covered by UMIP:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register
If any of these instructions is executed with CPL > 0, a general protection exception is issued when UMIP is enabled.
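As an illustration only (this is not code from the series), the rule above can be sketched in C. The opcode encodings follow the Intel SDM: SLDT and STR are 0F 00 /0 and /1, while SGDT, SIDT and SMSW are 0F 01 /0, /1 and /4. The enum and the function names umip_classify() and umip_would_fault() are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, used only in this sketch. */
enum umip_insn { UMIP_NONE, UMIP_SGDT, UMIP_SIDT, UMIP_SLDT, UMIP_SMSW, UMIP_STR };

/* CR4.UMIP is bit 11 of CR4. */
#define CR4_UMIP (UINT64_C(1) << 11)

/*
 * Classify an instruction from the opcode byte that follows 0x0f and
 * from its ModRM byte. The reg field of ModRM selects the instruction.
 */
static enum umip_insn umip_classify(uint8_t opcode2, uint8_t modrm)
{
	uint8_t mod = modrm >> 6;
	uint8_t reg = (modrm >> 3) & 7;

	if (opcode2 == 0x00) {
		if (reg == 0)
			return UMIP_SLDT;
		if (reg == 1)
			return UMIP_STR;
	} else if (opcode2 == 0x01) {
		if (reg == 4)
			return UMIP_SMSW;
		/* SGDT and SIDT only exist with a memory operand (mod != 3). */
		if (mod != 3) {
			if (reg == 0)
				return UMIP_SGDT;
			if (reg == 1)
				return UMIP_SIDT;
		}
	}
	return UMIP_NONE;
}

/* A general protection fault is raised only if all three conditions hold. */
static bool umip_would_fault(enum umip_insn insn, unsigned int cpl, uint64_t cr4)
{
	return insn != UMIP_NONE && cpl > 0 && (cr4 & CR4_UMIP) != 0;
}
```

For example, umip_classify(0x01, 0xe0) identifies SMSW with a register operand, and umip_would_fault() reports a fault for it at CPL 3 only when CR4.UMIP is set.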
There is a caveat, however. Certain applications rely on some of these instructions to function. Examples are applications that run under WineHQ[3]. For instance, these applications rely on SIDT returning a non-accessible memory location[4]. During the discussions, it was proposed that the fault could be relayed to user space, which would then perform the emulation in user mode. However, this would break existing applications until, for instance, they update to a new WineHQ version. Moreover, this approach would require UMIP to be disabled by default. The consensus in this forum is to always enable it.
This patchset initially treated tasks running in virtual-8086 mode as a special case. However, I received clarification that DOSEMU[5] does not support applications that use these instructions. It relies on WineHQ for this[6]. Furthermore, the applications for which the concern was raised run in protected mode [4].
This version keeps UMIP enabled at all times and by default. If a general protection fault caused by the instructions protected by UMIP is detected, such fault will be fixed-up by returning dummy values as follows:
* SGDT and SIDT return hard-coded dummy values as the base of the global descriptor and interrupt descriptor tables. These hard-coded values are located within a memory map hole in x86_64. For x86_32, the same values are used but truncated to 4 bytes. This is also the case for virtual-8086 mode tasks, in which the base is truncated to 3 bytes. In all cases, the limit of the table is set to 0.
* STR and SLDT return 0 as the segment selector. This looks appropriate since we are providing a dummy value as the base address of the global descriptor table.
* SMSW returns the value with which the CR0 register is programmed in head_32/64.S at boot time. That is, the following bits are enabled: CR0.0 for Protection Enable, CR0.1 for Monitor Coprocessor, CR0.4 for Extension Type, which will always be 1 in recent processors with UMIP; CR0.5 for Numeric Error, CR0.16 for Write Protect, CR0.18 for Alignment Mask. Additionally, in x86_64, CR0.31 for Paging is set.
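A minimal sketch of how these dummy values could be derived, assuming the bit positions listed above. DUMMY_TABLE_BASE is a hypothetical placeholder and not the constant used in the series; the enum and function names are likewise made up for illustration.

```c
#include <stdint.h>

/* Hypothetical task modes for this sketch. */
enum task_mode { MODE_VM86, MODE_PROT32, MODE_LONG64 };

/* CR0 bits as listed in the text above. */
#define CR0_PE (UINT64_C(1) << 0)	/* Protection Enable */
#define CR0_MP (UINT64_C(1) << 1)	/* Monitor Coprocessor */
#define CR0_ET (UINT64_C(1) << 4)	/* Extension Type */
#define CR0_NE (UINT64_C(1) << 5)	/* Numeric Error */
#define CR0_WP (UINT64_C(1) << 16)	/* Write Protect */
#define CR0_AM (UINT64_C(1) << 18)	/* Alignment Mask */
#define CR0_PG (UINT64_C(1) << 31)	/* Paging */

/* Placeholder only: NOT the value hard-coded in the patches. */
#define DUMMY_TABLE_BASE UINT64_C(0xfffffffffffe0000)

/* Value returned by an emulated SMSW. */
static uint64_t dummy_cr0(enum task_mode mode)
{
	uint64_t cr0 = CR0_PE | CR0_MP | CR0_ET | CR0_NE | CR0_WP | CR0_AM;

	if (mode == MODE_LONG64)
		cr0 |= CR0_PG;
	return cr0;
}

/* Base returned by an emulated SGDT/SIDT, truncated to 3, 4 or 8 bytes. */
static uint64_t dummy_table_base(enum task_mode mode)
{
	switch (mode) {
	case MODE_VM86:
		return DUMMY_TABLE_BASE & UINT64_C(0xffffff);
	case MODE_PROT32:
		return DUMMY_TABLE_BASE & UINT64_C(0xffffffff);
	default:
		return DUMMY_TABLE_BASE;
	}
}

/* The table limit is 0 in all cases; STR and SLDT return selector 0. */
static uint16_t dummy_table_limit(void) { return 0; }
static uint16_t dummy_selector(void) { return 0; }
```

With these bit positions, dummy_cr0() evaluates to 0x50033 without Paging and to 0x80050033 for x86_64.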
The proposed emulation code handles faults that happen in both protected and virtual-8086 mode.
I found the Intel MPX (Memory Protection Extensions) code very useful for parsing opcodes and the memory locations contained in the general purpose registers when used as operands. I moved some of this code into a separate library file that both MPX and UMIP can use, which avoids code duplication. While here, I fixed two small bugs that I found in the MPX implementation. This new library was also extended to handle 16-bit address encodings such as those found in virtual-8086 mode tasks.
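To give an idea of what resolving a 16-bit address encoding involves, here is a standalone sketch. It is not the code in insn-kernel.c. The struct regs16 and the functions ea16() and vm86_linear() are hypothetical names, and the caller is assumed to pass a displacement that is already sign-extended to 16 bits.

```c
#include <stdint.h>

/* Only the registers that 16-bit ModRM addressing can reference. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * Compute the 16-bit effective address selected by a ModRM byte.
 * Returns 0 on success, or -1 if ModRM names a register operand
 * (mod == 3), in which case there is no memory address to compute.
 */
static int ea16(uint8_t modrm, const struct regs16 *r, int16_t disp, uint16_t *ea)
{
	uint8_t mod = modrm >> 6;
	uint8_t rm = modrm & 7;
	uint16_t base = 0;

	if (mod == 3)
		return -1;

	switch (rm) {
	case 0: base = (uint16_t)(r->bx + r->si); break;
	case 1: base = (uint16_t)(r->bx + r->di); break;
	case 2: base = (uint16_t)(r->bp + r->si); break;
	case 3: base = (uint16_t)(r->bp + r->di); break;
	case 4: base = r->si; break;
	case 5: base = r->di; break;
	/* mod == 0 with rm == 6 means a bare 16-bit displacement. */
	case 6: base = (mod == 0) ? 0 : r->bp; break;
	case 7: base = r->bx; break;
	}

	/* With mod == 0, only rm == 6 carries a displacement. */
	if (mod == 0 && rm != 6)
		disp = 0;

	/* The effective address wraps around at 64 KiB. */
	*ea = (uint16_t)(base + (uint16_t)disp);
	return 0;
}

/* Real-mode style translation: the segment is shifted left by 4 bits. */
static uint32_t vm86_linear(uint16_t seg, uint16_t off)
{
	return ((uint32_t)seg << 4) + off;
}
```

For instance, with BX = 0x1000 and SI = 0x0010, ModRM 0x40 and a displacement of 4 resolve to offset 0x1014, and vm86_linear(0x1234, 0x0010) gives 0x12350.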
A set of representative selftests for virtual-8086 mode tasks is included as part of this series. The tested cases cover addresses given as displacements, addresses held in registers, and registers used as operands.
Extensive tests were performed to exercise the page fault that is emulated when the memory to which the instruction result is written is not accessible[7], to test almost all combinations of operands (ModRM, SIB, REX prefix and displacements) in protected mode[8] and to test almost all the combinations of operands in virtual-8086 mode[9]. If there is interest, this code could also be submitted for the kernel selftests.
[1]. https://lwn.net/Articles/705877/
[2]. https://lkml.org/lkml/2016/12/23/265
[3]. https://www.winehq.org/
[4]. https://www.winehq.org/pipermail/wine-devel/2016-November/115320.html
[5]. http://www.dosemu.org/
[6]. http://marc.info/?l=linux-kernel&m=147876798717927&w=2
[7]. https://github.com/01org/luv-yocto/blob/rneri/umip/meta-luv/recipes-core/umi...
[8]. https://github.com/01org/luv-yocto/blob/rneri/umip/meta-luv/recipes-core/umi...
[9]. https://github.com/01org/luv-yocto/blob/rneri/umip/meta-luv/recipes-kernel/l...
Thanks and BR,
Ricardo
Changes since V2:
* Added new utility functions to decode the memory addresses contained in registers when the 16-bit addressing encodings are used. This includes code to obtain and compute memory addresses using segment selectors for real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086 tasks.
* Added selftests for virtual-8086 mode that contain representative use cases: address represented as a displacement, address in registers and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instructions now return the value with which the CR0 register is programmed in head_32/64.S. This is: PE | MP | ET | NE | WP | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as arch/x86/lib/insn-kernel.c. It also has its own header. This helps keep the kernel and objtool instruction decoders in sync. Also, the new insn-kernel.c contains utility functions that are only relevant in a kernel context.
* Removed printed warnings for errors that occur when decoding instructions with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX functions.
* Now user_64bit_mode(regs) is used instead of test_thread_flag(TIF_IA32) to determine if the task is 32-bit or 64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory comments in the emulation and instruction decoding code. This includes a comment noting that copy_from_user could fail if there is a memory protection key in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed via a header file. For clarity, this function was added in a separate patch.
Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when entering virtual-8086 mode, UMIP remains enabled all the time. General protection faults that occur are fixed-up by returning dummy values as detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals when running in virtual-8086 mode.
* Reused code from MPX to decode instructions operands. For this purpose code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.
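For reference, the number 514 passed to clearcpuid= follows the kernel's feature numbering of word * 32 + bit. The sketch below shows only that arithmetic. The helper names are hypothetical, and the mapping of word 16 to CPUID leaf 7 (subleaf 0) ECX, where UMIP is bit 2, is stated from my reading of cpufeatures.h and should be checked against the tree in use.

```c
/* Feature numbers pack a capability word and a bit: word * 32 + bit. */
#define FEATURE_BITS_PER_WORD 32u

/* Hypothetical helpers, for illustration only. */
static unsigned int feature_number(unsigned int word, unsigned int bit)
{
	return word * FEATURE_BITS_PER_WORD + bit;
}

static unsigned int feature_word(unsigned int feature)
{
	return feature / FEATURE_BITS_PER_WORD;
}

static unsigned int feature_bit(unsigned int feature)
{
	return feature % FEATURE_BITS_PER_WORD;
}
```

Here feature_number(16, 2) yields 514, and feature_word(514) and feature_bit(514) give back 16 and 2.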
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Liang Z Li <liang.z.li@intel.com>
Cc: x86@kernel.org
Cc: linux-msdos@vger.kernel.org
Ricardo Neri (10):
  x86/mpx: Do not use SIB index if index points to R/ESP
  x86/mpx: Fail decoding when SIB baseR/EBP is and no displacement is used
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-kernel
  x86/insn-kernel: Add a function to obtain register offset in ModRM
  x86/insn-kernel: Add support to resolve 16-bit addressing encodings
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86: Add emulation code for UMIP instructions
  x86/traps: Fixup general protection faults caused by UMIP
  x86: Enable User-Mode Instruction Prevention
  selftests/x86: Add tests for User-Mode Instruction Prevention
 arch/x86/Kconfig                              |  10 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/disabled-features.h      |   8 +-
 arch/x86/include/asm/insn-kernel.h            |  17 ++
 arch/x86/include/asm/umip.h                   |  15 ++
 arch/x86/include/uapi/asm/processor-flags.h   |   2 +
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/cpu/common.c                  |  16 +-
 arch/x86/kernel/traps.c                       |   4 +
 arch/x86/kernel/umip.c                        | 251 +++++++++++++++++++
 arch/x86/lib/Makefile                         |   2 +-
 arch/x86/lib/insn-kernel.c                    | 344 ++++++++++++++++++++++++++
 arch/x86/mm/mpx.c                             | 120 +--------
 tools/testing/selftests/x86/entry_from_vm86.c |  39 ++-
 14 files changed, 708 insertions(+), 122 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-kernel.h
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c
 create mode 100644 arch/x86/lib/insn-kernel.c