Dan,
I found your message very unclear.
The patch adds support for OpenMP programs like this:
And then you start talking about vcomp_fork without telling us where it comes from and what it should do.
So I'll guess from the names.
- vcomp_for_static/sections_init sound like startup code initialisation of BSS and DATA segments.
- fork sounds like running code with thread-local copies to BSS and DATA
  I guess vcomp eats up one register or parameter to keep a pointer to the thread-local storage.
p_vcomp_fork(0, 1, _test_vcomp_fork_worker1, &ncalls);
Your code does not explain what the first parameter is.
I believe that va_list etc. is not going to lead you anywhere.
First, va_* is only one side. E.g. include/wine/test.h uses va_list and __builtin_ms_va_list
You need the other side too, namely, given a va_* structure, call a function X with the parameters given in that structure.
The GNU ffcall library http://www.gnu.org/software/libffcall/ distinguishes both. One half is called vacall, the other one avcall :-)
Your tests show that you need 2 different va_lists:
p_vcomp_fork(0, 5, _test_vcomp_fork_worker5, 1, 2, 3, 4, 5);
1. On the receiver side, va_start would create a va_list for the parameter set (0, 5, function pointer, 1, 2, 3, 4, 5)
2. But now you need to call something with the apparent parameter set (1, 2, 3, 4, 5) (I say apparent because you don't explain the vcomp execution model. Is a hidden parameter added somewhere? What's that first parameter 0 to fork?)
Obviously, if you don't know the internals of a va_list, you'll not be able to transform one structure into the other.
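To make the "two sides" concrete, here is a minimal C sketch of one way around the problem. It assumes every variadic argument is a pointer (which the later reply in this thread says is the case) and that there is a small fixed upper bound on their number. The receiving side drains the va_list into a plain array; the calling side dispatches on the count. All names (sketch_fork, sketch_call, sketch_add2, SKETCH_MAX_ARGS) are hypothetical, and this is not the code from the patch.

```c
#include <assert.h>
#include <stdarg.h>

#define SKETCH_MAX_ARGS 5            /* hypothetical upper bound */

typedef void (*sketch_fn)(void);     /* generic function pointer type */

/* Calling side: invoke fn with the nargs pointers stored in a[].
 * fn must really be a function taking exactly nargs 'void *' parameters. */
int sketch_call(sketch_fn fn, int nargs, void **a)
{
    switch (nargs) {
    case 0: fn(); return 0;
    case 1: ((void (*)(void *))fn)(a[0]); return 0;
    case 2: ((void (*)(void *, void *))fn)(a[0], a[1]); return 0;
    case 3: ((void (*)(void *, void *, void *))fn)(a[0], a[1], a[2]);
            return 0;
    case 4: ((void (*)(void *, void *, void *, void *))fn)
                (a[0], a[1], a[2], a[3]);
            return 0;
    case 5: ((void (*)(void *, void *, void *, void *, void *))fn)
                (a[0], a[1], a[2], a[3], a[4]);
            return 0;
    default: return -1;              /* more arguments than we can forward */
    }
}

/* Receiving side: pull nargs pointers out of the va_list into an array,
 * so no knowledge of the va_list layout is needed. ifval is ignored here. */
int sketch_fork(int ifval, int nargs, sketch_fn fn, ...)
{
    void *args[SKETCH_MAX_ARGS];
    va_list ap;
    int i;

    (void)ifval;
    if (nargs < 0 || nargs > SKETCH_MAX_ARGS)
        return -1;
    va_start(ap, fn);
    for (i = 0; i < nargs; i++)
        args[i] = va_arg(ap, void *);
    va_end(ap);
    return sketch_call(fn, nargs, args);
}

/* Example worker: adds *b into *a. */
void sketch_add2(void *a, void *b)
{
    *(int *)a += *(int *)b;
}

/* Small demonstration: forwards two pointers and returns the updated x. */
int sketch_demo(void)
{
    int x = 1, y = 2;
    int rc = sketch_fork(0, 2, (sketch_fn)sketch_add2,
                         (void *)&x, (void *)&y);
    assert(rc == 0);
    return x;
}
```

The price of staying in C is the fixed maximum and the cast per arity; an unbounded argument count would still need assembly or something like avcall.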
So bad, now what is actually needed?
Think assembly.
Using your test example:
+ p_vcomp_fork(0, 5, _test_worker5, 1, 2, 3, 4, 5);
_vcomp_fork finds data:
1. on the stack
2. in registers
3. in FP registers
The C stack layout is (topmost first): 0, then 5, &_test_worker5, 1, 2, etc.
What you obviously need to do is, supposing the first parameters are duplicated in registers:
- Save registers and possibly FP registers into a structure
- Remember the stack pointer and number of elements,
  -- hereby creating a va_args structure whose layout you know.
- CreateThread?? etc.
- Restore registers
- Shift register parameters by 3 items: (assuming 5 such registers)
    register-for-param-1 (1) <- register-for-param-4
    register-for-param-2 (2) <- register-for-param-5
- Copy from stack into registers
    register-for-param-3 (3) <- from stack
    register-for-param-4 (4) <- from stack
    register-for-param-5 (5) <- from stack
- POP 0 and 5
- POP &_test_worker5 into scratch register
- JMPTO _test_worker5 via scratch register
This assumes there's nothing like CreateThread so you can run from the original stack. If there were, you'd need to copy the 5 elements from the stack to the new one (which is presumably why vcomp_fork receives their number as parameter).
Does this help?
Jörg Höhle
On Fri, Oct 5, 2012 at 7:27 AM, Joerg-Cyril.Hoehle@t-systems.com wrote:
> I found your message very unclear.
> The patch adds support for OpenMP programs like this:
> And then you start talking about vcomp_fork without telling us where it comes from and what it should do.
Good point - it's unfair to expect people to run Visual C and look at its .cod / .asm listing files as my message suggests. I'll document the vcomp execution model better in my next draft.
> So I'll guess from the names.
> - vcomp_for_static/sections_init sound like startup code initialisation of BSS and DATA segments.
No, those are called at the start of a new parallel section by every participating thread. They set up thread-local stuff.
> - fork sounds like running code with thread-local copies to BSS and DATA
>   I guess vcomp eats up one register or parameter to keep a pointer to the thread-local storage.
It means 'Run this helper function on all cores, and pass it these parameters, which are all pointers to local variables'.
> p_vcomp_fork(0, 1, _test_vcomp_fork_worker1, &ncalls);
> Your code does not explain what the first parameter is.
It's a boolean saying whether to actually run in parallel, or just in the current thread.
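As a rough illustration of that contract, here is a small C sketch. It is simplified to a single pointer argument, and the "team" is simulated by a plain loop instead of real threads, so it only shows who gets called how often. The names sketch_fork1, sketch_region, sketch_count_calls and SKETCH_TEAM_SIZE are made up for this example and are not part of vcomp or of the patch.

```c
#include <assert.h>

#define SKETCH_TEAM_SIZE 4   /* hypothetical: pretend there are 4 cores */

/* The helper a compiler would outline from a parallel region.
 * It receives a pointer to a local variable of the caller. */
void sketch_region(int *ncalls)
{
    (*ncalls)++;
}

/* Stand-in for the fork call, reduced to one pointer argument.
 * ifval == 0: run the helper once, in the current thread.
 * ifval != 0: run it once per team member. A real runtime would start
 * threads here; this sketch just loops, which is an assumption made to
 * keep the example portable. */
void sketch_fork1(int ifval, void (*helper)(int *), int *arg)
{
    int members = ifval ? SKETCH_TEAM_SIZE : 1;
    int i;

    assert(helper && arg);
    for (i = 0; i < members; i++)
        helper(arg);
}

/* What the caller of a parallel region roughly looks like. */
int sketch_count_calls(int ifval)
{
    int ncalls = 0;                       /* local shared with the helper */
    sketch_fork1(ifval, sketch_region, &ncalls);
    return ncalls;
}
```

With real threads the increment in sketch_region would need to be atomic; the sequential loop hides that.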
> I believe that va_list etc. is not going to lead you anywhere.
I'm going to give Maarten's suggestion a shot. If it works, the only assembly left will be a nearly verbatim copy of code in oleaut32/typelib.c
> Does this help?
Yes, your questions are very helpful. I'll run the next draft by both you and Maarten if you don't mind.
- Dan
On Fri, Oct 05, 2012 at 04:27:24PM +0200, Joerg-Cyril.Hoehle@t-systems.com wrote:
> So bad, now what is actually needed?
> Think assembly.
> Using your test example:
> + p_vcomp_fork(0, 5, _test_worker5, 1, 2, 3, 4, 5);
> _vcomp_fork finds data:
> 1. on the stack
> 2. in registers
> 3. in FP registers
> The C stack layout is (topmost first): 0, then 5, &_test_worker5, 1, 2, etc.
That sort of stack layout is likely to be valid for i386, but not for amd64. 64-bit (amd64) Windows uses a different calling convention from (almost) everyone else. But I don't think you can convert a partially consumed va_list back into an argument list (e.g. to delete an initial argument) without 'cheating'.
On Linux (and probably Windows) the first int/ptr args are passed in integer registers and the first few FP args are passed in FP registers. When the registers run out, values are stacked.
This means that these two calls are equivalent:
    printf("int %d, fp %f\n", 2, 3.15159);
    printf("int %d, fp %f\n", 3.15159, 2);
The va_arg() processing has to remember which registers have already been processed in order to know where to find the next argument.
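A small, well-defined C sketch of that bookkeeping: the callee has to name the type at every va_arg() step, and the va_list carries the state that says where the next int or the next double lives. The function sketch_sum and its type string ('i' for int, 'd' for double) are invented for this illustration. Unlike the swapped printf calls above, each va_arg type here matches what was actually passed.

```c
#include <assert.h>
#include <stdarg.h>

/* Sum a mixed list of ints and doubles.
 * types is a hypothetical descriptor string: 'i' = int, 'd' = double.
 * Any other character is skipped without consuming an argument. */
double sketch_sum(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    assert(types);
    va_start(ap, types);
    for (; *types; types++) {
        if (*types == 'i')
            total += va_arg(ap, int);      /* next integer-class argument */
        else if (*types == 'd')
            total += va_arg(ap, double);   /* next FP-class argument */
    }
    va_end(ap);
    return total;
}
```

For example, sketch_sum("idid", 1, 0.5, 2, 0.25) interleaves the two classes; on an ABI with separate integer and FP argument registers, the va_list has to track both sets independently to get this right.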
David
New draft at https://testbot.winehq.org/JobDetails.pl?Key=22035 (the testbot seems stuck... Maarten, does it need a kick?)
Thanks to Maarten for getting me to try C varargs again; the first assembly function is now gone.
I've also added comments that explain the vcomp execution model; I hope that addresses Joerg's questions.