On Tue, Jul 25, 2017 at 04:48:13PM -0700, Ricardo Neri wrote:
I meant to say the 4 most significant bytes. In this case, the 64-bit address 0xffffffffffff1234 would lie in kernel memory, while 0xffff1234 would correctly be in user space memory.
That explanation is better.
Yes, perhaps the check above is not needed. I included that check as part of my argument validation. In a 64-bit kernel, this function could be called with a val whose most significant bytes are non-zero.
So say that in the comment so that it is obvious *why*.
I have looked into this closely and as far as I can see, the 4 least significant bytes will wrap around when using 64-bit signed numbers as they would when using 32-bit signed numbers. For instance, for two positive numbers we have:
7fff:ffff + 7000:0000 = efff:ffff.
The addition above overflows.
Yes, MSB changes.
When sign-extended to 64-bit numbers we would have:
0000:0000:7fff:ffff + 0000:0000:7000:0000 = 0000:0000:efff:ffff.
The addition above does not overflow. However, the 4 least significant bytes overflow as we expect.
No they don't - you are simply using 64-bit regs:
   0x00005555555546b8 <+8>:     movq   $0x7fffffff,-0x8(%rbp)
   0x00005555555546c0 <+16>:    movq   $0x70000000,-0x10(%rbp)
   0x00005555555546c8 <+24>:    mov    -0x8(%rbp),%rdx
   0x00005555555546cc <+28>:    mov    -0x10(%rbp),%rax
=> 0x00005555555546d0 <+32>:    add    %rdx,%rax

rax            0xefffffff       4026531839
rbx            0x0              0
rcx            0x0              0
rdx            0x7fffffff       2147483647
...
eflags 0x206 [ PF IF ]
(OF flag is not set).
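To make the difference visible outside of gdb, here is a minimal userspace sketch. It assumes GCC or Clang, since it relies on the __builtin_add_overflow() builtin, and the helper names add_overflows_s32() and add_overflows_s64() are made up for illustration; they are not part of the patch. The builtin reports signed overflow for the width of the result type, which is the condition OF reflects for an add of that operand size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * Each returns true if a + b does not fit in the signed result type;
 * *sum always receives the wrapped (truncated) result.
 */
static bool add_overflows_s32(int32_t a, int32_t b, int32_t *sum)
{
	return __builtin_add_overflow(a, b, sum);
}

static bool add_overflows_s64(int64_t a, int64_t b, int64_t *sum)
{
	return __builtin_add_overflow(a, b, sum);
}
```

With a = 0x7fffffff and b = 0x70000000, the 32-bit helper reports an overflow and leaves the bit pattern 0xefffffff in *sum, while the 64-bit helper reports no overflow and stores the plain value 0xefffffff.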
We can clamp the 4 most significant bytes.
For two's complement negative numbers we can have:
ffff:ffff + 8000:0000 = 7fff:ffff with a carry flag.
The addition above overflows.
Yes.
When sign-extending to 64-bit numbers we would have:
ffff:ffff:ffff:ffff + ffff:ffff:8000:0000 = ffff:ffff:7fff:ffff with a carry flag.
The addition above does not overflow. However, the 4 least significant bytes overflowed and wrapped around as they would when using 32-bit signed numbers.
Right. Ok.
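The point about the low 4 bytes can be checked with a small userspace sketch. The helper names add32_wrap() and add64_low32() are hypothetical and only serve the illustration. The additions are done on unsigned types because unsigned wraparound is well-defined in C, whereas signed overflow is not.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit add that wraps modulo 2^32. */
static uint32_t add32_wrap(int32_t a, int32_t b)
{
	return (uint32_t)a + (uint32_t)b;
}

/*
 * Hypothetical helper: sign-extend both operands to 64 bits, add,
 * then keep only the 4 least significant bytes of the result.
 */
static uint32_t add64_low32(int32_t a, int32_t b)
{
	uint64_t sum = (uint64_t)(int64_t)a + (uint64_t)(int64_t)b;

	return (uint32_t)sum;
}
```

For the two cases discussed here, 0x7fffffff + 0x70000000 and 0xffffffff + 0x80000000, both helpers return the same value: 0xefffffff and 0x7fffffff respectively.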
And come to think of it now, I'm wondering whether it would be better/easier/simpler/more straightforward to do the 32-bit operations with 32-bit types and separate 32-bit functions and have the hardware do that for you.
This way you can save yourself all that ugly and possibly error-prone casting back and forth and have the code much more readable too.
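Something along these lines, as a rough sketch only. The names eff_addr32() and eff_addr64() are hypothetical and do not exist in the patch, and userspace <stdint.h> types stand in for the kernel's u32/s32/u64/s64:

```c
#include <stdint.h>

/*
 * Hypothetical 32-bit variant: all arithmetic is done in 32-bit types,
 * so the result wraps at 4 GiB without any masking or casting of a
 * wider intermediate value.
 */
static uint32_t eff_addr32(uint32_t base, int32_t disp)
{
	return base + (uint32_t)disp;
}

/* Hypothetical 64-bit variant: same computation in 64-bit types. */
static uint64_t eff_addr64(uint64_t base, int64_t disp)
{
	return base + (uint64_t)disp;
}
```

With base = 0xffffffff and disp = 1, the 32-bit variant wraps to 0 while the 64-bit variant yields 0x100000000, so the caller picks the behavior by picking the function instead of clamping afterwards.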
Hmmm.