On Wed, 2017-06-07 at 18:28 +0200, Borislav Petkov wrote:
On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
Tasks running in virtual-8086 mode or in protected mode with code segment descriptors that specify 16-bit default address sizes via the D bit will use 16-bit addressing form encodings as described in the Intel 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section 2.1.5. 16-bit addressing encodings differ in several ways from the 32-bit/64-bit addressing form encodings: ModRM.rm points to different registers and, in some cases, effective addresses are indicated by the addition of the value of two registers. Also, there is no support for SIB bytes. Thus, a separate function is needed to parse this form of addressing.
A couple of functions are introduced. get_reg_offset_16() obtains the offset from the base of pt_regs of the registers indicated by the ModRM byte of the address encoding. get_addr_ref_16() computes the linear address indicated by the instructions using the value of the registers given by ModRM as well as the base address of the segment.
Cc: Dave Hansen <dave.hansen(a)linux.intel.com> Cc: Adam Buchbinder <adam.buchbinder(a)gmail.com> Cc: Colin Ian King <colin.king(a)canonical.com> Cc: Lorenzo Stoakes <lstoakes(a)gmail.com> Cc: Qiaowei Ren <qiaowei.ren(a)intel.com> Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com> Cc: Masami Hiramatsu <mhiramat(a)kernel.org> Cc: Adrian Hunter <adrian.hunter(a)intel.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Thomas Garnier <thgarnie(a)google.com> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Borislav Petkov <bp(a)suse.de> Cc: Dmitry Vyukov <dvyukov(a)google.com> Cc: Ravi V. Shankar <ravi.v.shankar(a)intel.com> Cc: x86(a)kernel.org Signed-off-by: Ricardo Neri <ricardo.neri-calderon(a)linux.intel.com> --- arch/x86/lib/insn-eval.c | 155 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 155 insertions(+)
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c index 9822061..928a662 100644 --- a/arch/x86/lib/insn-eval.c +++ b/arch/x86/lib/insn-eval.c @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs, }
/** + * get_reg_offset_16 - Obtain offset of register indicated by instruction
Please end function names with parentheses.
I will correct.
+ * @insn: Instruction structure containing ModRM and SiB bytes
s/SiB/SIB/g
I will correct.
+ * @regs: Structure with register values as seen when entering kernel mode + * @offs1: Offset of the first operand register + * @offs2: Offset of the second opeand register, if applicable. + * + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte + * within insn. This function is to be used with 16-bit address encodings. The + * offs1 and offs2 will be written with the offset of the two registers + * indicated by the instruction. In cases where any of the registers is not + * referenced by the instruction, the value will be set to -EDOM. + * + * Return: 0 on success, -EINVAL on failure. + */ +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs, + int *offs1, int *offs2) +{ + /* 16-bit addressing can use one or two registers */ + static const int regoff1[] = { + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, bx), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), + offsetof(struct pt_regs, bp), + offsetof(struct pt_regs, bx), + }; + + static const int regoff2[] = { + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), + offsetof(struct pt_regs, si), + offsetof(struct pt_regs, di), + -EDOM, + -EDOM, + -EDOM, + -EDOM, + };
You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in the SDM, right?
Yes.
Please add a comment pointing to it here because it is not trivial to map that code to the documentation.
Sure, I will add a comment pointing to this table.
+ + if (!offs1 || !offs2) + return -EINVAL; + + /* operand is a register, use the generic function */ + if (X86_MODRM_MOD(insn->modrm.value) == 3) { + *offs1 = insn_get_modrm_rm_off(insn, regs); + *offs2 = -EDOM; + return 0; + } + + *offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)]; + *offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)]; + + /* + * If no displacement is indicated in the mod part of the ModRM byte,
s/"no "//
+ * (mod part is 0) and the r/m part of the same byte is 6, no register + * is used caculate the operand address. An r/m part of 6 means that + * the second register offset is already invalid.
Perhaps my comment was misleading. When ModRM.mod is 0, no displacement is used except for ModRM.mod = 0 and ModRM.rm 110b. In this case we have displacement-only addressing. I will reword the comment to reflect this fact.
+ */ + if ((X86_MODRM_MOD(insn->modrm.value) == 0) && + (X86_MODRM_RM(insn->modrm.value) == 6)) + *offs1 = -EDOM; + + return 0; +} + +/** * get_desc() - Obtain address of segment descriptor * @sel: Segment selector * @@ -689,6 +756,94 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs) }
/** + * get_addr_ref_16() - Obtain the 16-bit address referred by instruction + * @insn: Instruction structure containing ModRM byte and displacement + * @regs: Structure with register values as seen when entering kernel mode + * + * This function is to be used with 16-bit address encodings. Obtain the memory + * address referred by the instruction's ModRM bytes and displacement. Also, the + * segment used as base is determined by either any segment override prefixes in + * insn or the default segment of the registers involved in the address + * computation. In protected mode, segment limits are enforced. + * + * Return: linear address referenced by instruction and registers on success. + * -1L on failure. + */ +static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs *regs) +{ + unsigned long linear_addr, seg_base_addr, seg_limit; + short eff_addr, addr1 = 0, addr2 = 0; + int addr_offset1, addr_offset2; + int ret; + + insn_get_modrm(insn); + insn_get_displacement(insn); + + /* + * If operand is a register, the layout is the same as in + * 32-bit and 64-bit addressing. + */ + if (X86_MODRM_MOD(insn->modrm.value) == 3) { + addr_offset1 = get_reg_offset(insn, regs, REG_TYPE_RM); + if (addr_offset1 < 0) + goto out_err;
<---- newline here.
Will add newline.
+ eff_addr = regs_get_register(regs, addr_offset1); + seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1); + if (seg_base_addr == -1L) + goto out_err;
ditto.
Will add newline.
+ seg_limit = get_seg_limit(regs, insn, addr_offset1); + } else { + ret = get_reg_offset_16(insn, regs, &addr_offset1, + &addr_offset2); + if (ret < 0) + goto out_err;
ditto.
Will add newline.
+ /* + * Don't fail on invalid offset values. They might be invalid + * because they cannot be used for this particular value of + * the ModRM. Instead, use them in the computation only if + * they contain a valid value. + */ + if (addr_offset1 != -EDOM) + addr1 = 0xffff & regs_get_register(regs, addr_offset1); + if (addr_offset2 != -EDOM) + addr2 = 0xffff & regs_get_register(regs, addr_offset2); + eff_addr = addr1 + addr2;
ditto.
Will add newline.
Space those codelines out, we want to be able to read that code again at some point :-)))
Sure! I have gone through all this code adding newlines as necessary.
+ /* + * The first register is in the operand implies the SS or DS + * segment selectors, the second register in the operand can + * only imply DS. Thus, use the first register to obtain + * the segment selector. + */ + seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1); + if (seg_base_addr == -1L) + goto out_err; + seg_limit = get_seg_limit(regs, insn, addr_offset1); + + eff_addr += (insn->displacement.value & 0xffff); + } + + linear_addr = (unsigned long)(eff_addr & 0xffff); + + /* + * Make sure the effective address is within the limits of the + * segment. In long mode, the limit is -1L. Thus, the second part
Long mode in a 16-bit handling function?
Yes, this is not correct. However, it is true for virtual-8086 mode. I will update the comment accordingly. Thanks and BR, Ricardo