Hi
I've been working on a test and improvements for CloneMesh as part of GSoC 2011. I would be grateful if anyone could comment on my work. There are five test cases in the attached patch:
0. Basic mesh cloning. Declaration has position and normal, and the new declaration is the same as the original. 1. Same as test 0, but with 16-bit indices. 2. Shows that the vertex buffer can be widened so that it has room for a new vertex component. The declaration has position and normal, and the new declaration has texture coordinates after position and normal. 3. Same as test 2, but the new declaration has the texture coordinates in between position and normal. 4. Shows that a vertex component can be converted into another type if the new declaration specifies another type. The declaration has position and normal. The normal is a FLOAT2 and in the new declaration it is instead FLOAT16_2.
Test 0 and 1 tests what is already known to work in the current implementation. Test 2, 3 and 4 do not work in the current implementation because the original and new declarations are not the same.
I've also improved the current CloneMesh implementation so that it also passes test 2, 3, and 4. The patch is not complete because it does not support converting all the possible types. I expect to implement conversion for the remaining types during this week.
Again any comments or ideas are welcome.
Thanks, Michael Mc Donnell
2011/8/8 Michael Mc Donnell michael@mcdonnell.dk:
Hi
I've been working on a test and improvements for CloneMesh as part of GSoC 2011. I would be grateful if anyone could comment on my work. There are five test cases in the attached patch:
0. Basic mesh cloning. Declaration has position and normal, and the new declaration is the same as the original. 1. Same as test 0, but with 16-bit indices. 2. Shows that the vertex buffer can be widened so that it has room for a new vertex component. The declaration has position and normal, and the new declaration has texture coordinates after position and normal. 3. Same as test 2, but the new declaration has the texture coordinates in between position and normal. 4. Shows that a vertex component can be converted into another type if the new declaration specifies another type. The declaration has position and normal. The normal is a FLOAT2 and in the new declaration it is instead FLOAT16_2.
Test 0 and 1 tests what is already known to work in the current implementation. Test 2, 3 and 4 do not work in the current implementation because the original and new declarations are not the same.
I've also improved the current CloneMesh implementation so that it also passes test 2, 3, and 4. The patch is not complete because it does not support converting all the possible types. I expect to implement conversion for the remaining types during this week.
Again any comments or ideas are welcome.
Thanks, Michael Mc Donnell
Hello Michael, I have some comments:
+const UINT d3dx_decltype_size[D3DDECLTYPE_UNUSED] = +{
I don't really like sizing the vector like that. Just remove the D3DDECLTYPE_UNUSED between the square brackets, I'd say. Otherwise, you could use the D3DMAXDECLTYPE define.
+ /* D3DDECLTYPE_FLOAT1 */ 1 * 4, ...
You may want to replace the 4 with a sizeof(float) for better clarity. Same for the others.
+ if (orig_declaration.Method == declaration[i].Method + && orig_declaration.Usage == declaration[i].Usage + && orig_declaration.UsageIndex == declaration[i].UsageIndex)
Does the Method field really need to be compared? Maybe add a test if that is the case. You may also e.g. test if in a POSITION, NORMAL, TEXCOORD(0) -> POSITION, NORMAL, TEXCOORD(1) CloneMesh (or even the other way around) the texture coordinates are actually lost or not.
+ for (i = 0; orig_declaration[i].Stream != 0xff; i++) { + if (memcmp(&orig_declaration[i], &declaration[i], sizeof(*declaration))) { + same_declaration = FALSE; + break; + } + } + + if (vertex_size != orig_vertex_size) + same_declaration = FALSE;
Not really a big thing, but you could do the vertex size check first and then compare the fields only if same_declaration is still TRUE.
The tests look good to me, except some nits like the ok() in check_vertex_components in the D3DDECLTYPE_FLOAT1 case, splitting long lines and such.
On Tue, Aug 9, 2011 at 3:17 PM, Matteo Bruni matteo.mystral@gmail.com wrote:
2011/8/8 Michael Mc Donnell michael@mcdonnell.dk:
Hi
I've been working on a test and improvements for CloneMesh as part of GSoC 2011. I would be grateful if anyone could comment on my work. There are five test cases in the attached patch:
0. Basic mesh cloning. Declaration has position and normal, and the new declaration is the same as the original. 1. Same as test 0, but with 16-bit indices. 2. Shows that the vertex buffer can be widened so that it has room for a new vertex component. The declaration has position and normal, and the new declaration has texture coordinates after position and normal. 3. Same as test 2, but the new declaration has the texture coordinates in between position and normal. 4. Shows that a vertex component can be converted into another type if the new declaration specifies another type. The declaration has position and normal. The normal is a FLOAT2 and in the new declaration it is instead FLOAT16_2.
Test 0 and 1 tests what is already known to work in the current implementation. Test 2, 3 and 4 do not work in the current implementation because the original and new declarations are not the same.
I've also improved the current CloneMesh implementation so that it also passes test 2, 3, and 4. The patch is not complete because it does not support converting all the possible types. I expect to implement conversion for the remaining types during this week.
Again any comments or ideas are welcome.
Thanks, Michael Mc Donnell
Hello Michael, I have some comments:
+const UINT d3dx_decltype_size[D3DDECLTYPE_UNUSED] = +{
I don't really like sizing the vector like that. Just remove the D3DDECLTYPE_UNUSED between the square brackets, I'd say. Otherwise, you could use the D3DMAXDECLTYPE define.
- /* D3DDECLTYPE_FLOAT1 */ 1 * 4,
...
You may want to replace the 4 with a sizeof(float) for better clarity. Same for the others.
Yeah that does look better, I'll do that. It was code that I moved up higher in the source file. I hope it's ok to change it too.
- if (orig_declaration.Method == declaration[i].Method
- && orig_declaration.Usage == declaration[i].Usage
- && orig_declaration.UsageIndex == declaration[i].UsageIndex)
Does the Method field really need to be compared? Maybe add a test if that is the case.
Good catch. It doesn't do that comparison on Windows. I've added test 18 to show that and I've removed the comparison.
You may also e.g. test if in a POSITION, NORMAL, TEXCOORD(0) -> POSITION, NORMAL, TEXCOORD(1) CloneMesh (or even the other way around) the texture coordinates are actually lost or not.
I've added test 19 and 20 to show that the data is lost.
- for (i = 0; orig_declaration[i].Stream != 0xff; i++) {
- if (memcmp(&orig_declaration[i], &declaration[i],
sizeof(*declaration))) {
- same_declaration = FALSE;
- break;
- }
- }
- if (vertex_size != orig_vertex_size)
- same_declaration = FALSE;
Not really a big thing, but you could do the vertex size check first and then compare the fields only if same_declaration is still TRUE.
I just noticed that there is also a bug there. It should have been the declaration length and not the vertex size. I've extracted it out into the function declaration_equals for the sake of clarity.
The tests look good to me, except some nits like the ok() in check_vertex_components in the D3DDECLTYPE_FLOAT1 case, splitting long lines and such.
I've fixed the ok(). What do you mean by splitting long lines. Should I split them at 80 characters?
I've also implemented the rest of the FLOAT2 conversions except for UDEC3 and DEC3N which seemed to cause a memory corruption on Windows. I'm going to work next on the remaining conversions.
Thank you very much for reviewing the patch Matteo.
Cheers, Michael Mc Donnell
On Wed, Aug 10, 2011 at 7:14 PM, Michael Mc Donnell michael@mcdonnell.dk wrote:
I'm going to work next on the remaining conversions.
I just found three bugs in the FLOAT2 conversion (src->x instead of src->y) while implementing FLOAT1. I've fixed those bugs in this patch.
2011/8/11 Michael Mc Donnell michael@mcdonnell.dk:
dst_ptr[0] = src->x < 0.0f ? (SHORT)ceilf(src->x * SHRT_MAX + 0.5f) :(SHORT)floorf(src->x * SHRT_MAX + 0.5f);
You can use roundf() instead. Actually, notice that maybe what you actually need for correct rounding is rintf() (which essentially matches what D3DXFloat32To16Array does). You should probably test this (even on your own, not necessarily adding those to the testsuite, I'd say).
Still on the conversion: going on with this "strategy" is going to require a ton of code to handle every src->dst format combination. You may look into doing that in two steps: first converting src to float4, then float4 into dst. I haven't looked into it, so there may be some detail making this impractical, but still it's worth a try, I guess.
On Thu, Aug 11, 2011 at 3:01 PM, Matteo Bruni matteo.mystral@gmail.com wrote:
2011/8/11 Michael Mc Donnell michael@mcdonnell.dk:
- dst_ptr[0] = src->x < 0.0f ? (SHORT)ceilf(src->x * SHRT_MAX + 0.5f) :(SHORT)floorf(src->x * SHRT_MAX + 0.5f);
You can use roundf() instead. Actually, notice that maybe what you actually need for correct rounding is rintf() (which essentially matches what D3DXFloat32To16Array does). You should probably test this (even on your own, not necessarily adding those to the testsuite, I'd say).
Ok I'll look into that. I also noticed that I had a bug in the UBYTE4 conversion. I'll modify the tests to catch those cases.
Still on the conversion: going on with this "strategy" is going to require a ton of code to handle every src->dst format combination. You may look into doing that in two steps: first converting src to float4, then float4 into dst. I haven't looked into it, so there may be some detail making this impractical, but still it's worth a try, I guess.
That's a good idea. I had started doing the other conversions, but noticed that it was a lot of code. I'll try your idea.
Thanks!
On Thu, Aug 11, 2011 at 4:07 PM, Michael Mc Donnell michael@mcdonnell.dk wrote:
On Thu, Aug 11, 2011 at 3:01 PM, Matteo Bruni matteo.mystral@gmail.com wrote:
2011/8/11 Michael Mc Donnell michael@mcdonnell.dk:
- dst_ptr[0] = src->x < 0.0f ? (SHORT)ceilf(src->x * SHRT_MAX + 0.5f) :(SHORT)floorf(src->x * SHRT_MAX + 0.5f);
You can use roundf() instead. Actually, notice that maybe what you actually need for correct rounding is rintf() (which essentially matches what D3DXFloat32To16Array does). You should probably test this (even on your own, not necessarily adding those to the testsuite, I'd say).
Ok I'll look into that. I also noticed that I had a bug in the UBYTE4 conversion. I'll modify the tests to catch those cases.
Is it ok to use roundf and rintf? They're both C99 functions [1].
On Fri, Aug 12, 2011 at 12:59 PM, Michael Mc Donnell michael@mcdonnell.dkwrote:
Is it ok to use roundf and rintf? They're both C99 functions.
Hello,
As far as I know C99 is not allowed. However, you can emulate round by doing:
floorf(val + 0.5f)
According to [1] floorf is C99 (only floor is C89), but including math.h in wine actually gives you msvcrt's math.h which is a bit different and doesn't even have round nor rint. Gdiplus uses floorf [2] so it should be OK.
Octavian
[1] http://linux.die.net/man/3/floorf [2] http://source.winehq.org/git/wine.git/?a=search&h=HEAD&st=grep&s...
On Fri, Aug 12, 2011 at 1:06 PM, Octavian Voicu octavian.voicu@gmail.com wrote:
On Fri, Aug 12, 2011 at 12:59 PM, Michael Mc Donnell michael@mcdonnell.dk wrote:
Is it ok to use roundf and rintf? They're both C99 functions.
Hello, As far as I know C99 is not allowed. However, you can emulate round by doing: floorf(val + 0.5f) According to [1] floorf is C99 (only floor is C89), but including math.h in wine actually gives you msvcrt's math.h which is a bit different and doesn't even have round nor rint. Gdiplus uses floorf [2] so it should be OK. Octavian [1] http://linux.die.net/man/3/floorf [2] http://source.winehq.org/git/wine.git/?a=search&h=HEAD&st=grep&s...
Thanks Octavian. I was already using floorf(val + 0.5f), so I'll keep using that.
2011/8/12 Octavian Voicu octavian.voicu@gmail.com:
On Fri, Aug 12, 2011 at 12:59 PM, Michael Mc Donnell michael@mcdonnell.dk wrote:
Is it ok to use roundf and rintf? They're both C99 functions.
Hello, As far as I know C99 is not allowed. However, you can emulate round by doing: floorf(val + 0.5f) According to [1] floorf is C99 (only floor is C89), but including math.h in wine actually gives you msvcrt's math.h which is a bit different and doesn't even have round nor rint. Gdiplus uses floorf [2] so it should be OK.
No, we are still using "native" math library in wine dlls (actually, linking wine dlls to msvcrt is usually frowned upon), so those floorf calls go directly to libm. Anyway, if you look into wine msvcrt's code, floorf doesn't do anything else than directly calling native floorf (and without any #ifdef guard, as far as I can see), so it wouldn't really make a difference.
That said, I don't know what is allowed and what's not. Requiring rintf or roundf support doesn't seem really asking more than what is needed now, but you could argue that we shouldn't use floorf in wine either.
BTW, as you are actually rounding to convert a float to an integer type (as opposed to needing a rounded value still in a floating point variable), you can also avoid using floorf() altogether. So, e.g., instead of + dst_ptr[0] = src->x < 0.0f ? (SHORT)ceilf(src->x * SHRT_MAX + 0.5f) :(SHORT)floorf(src->x * SHRT_MAX + 0.5f); something like + dst_ptr[0] = src->x < 0.0f ? (SHORT)(src->x * SHRT_MAX - 0.5f) :(SHORT)(src->x * SHRT_MAX + 0.5f); still works, since C floating point to integer conversion rules mandate for truncation.
Octavian [1] http://linux.die.net/man/3/floorf [2] http://source.winehq.org/git/wine.git/?a=search&h=HEAD&st=grep&s...
On 12 August 2011 13:06, Octavian Voicu octavian.voicu@gmail.com wrote:
As far as I know C99 is not allowed. However, you can emulate round by doing: floorf(val + 0.5f)
It probably doesn't matter here, but note that that isn't the same as roundf() for values below zero.
I've changed the implementation, like suggested by Matteo and Stefan, to first convert to FLOAT4 and then to the respective types. I've implemented conversion from FLOAT1-4 to the other types, except for UDEC3 and DEC3N.
I've also changed the tests quite a bit to catch a lot of edge cases with rounding, overflow and clamping. I ended up using a simple rounding method which gives the correct behavior:
+static INT simple_round(FLOAT value) +{ + int res = (INT)(value + 0.5f); + + return res; +}
I'll continue to add more tests for the other types and implement the remaining conversions.
I've implemented conversion of the remaining types, except for UDEC3 and DEC3N, and split the patch into two. I'll rebase and send it to wine-patches once my D3DXWeldVertices patch is accepted (there is some duplication between the two).
This will be my last code for a while because the GSoC 2011 is coming to an end. I'd like to thank all of you for your reviews and comments. I would not have been able to get this far without your help.
Thanks, Michael Mc Donnell
Hi,
I noticed one small issue: On Saturday 13 August 2011 12:22:03 Michael Mc Donnell wrote:
+static HRESULT convert_vertex_buffer(ID3DXMesh *mesh_dst, ID3DXMesh
*mesh_src)
...
- hr = mesh_dst->lpVtbl->LockVertexBuffer(mesh_dst, D3DLOCK_DISCARD,
(void**)&vb_dst); On paper, D3DLOCK_DISCARD is only valid on D3DUSAGE_DYNAMIC buffers. Wine and Windows <= WinVista don't enforce this. Win7 does, and it broke some apps that way.
I noticed a few more D3DLOCK_DISCARD locks in the mesh.c code.
On Thu, Aug 18, 2011 at 8:05 PM, Stefan Dösinger stefandoesinger@gmx.at wrote:
Hi,
I noticed one small issue: On Saturday 13 August 2011 12:22:03 Michael Mc Donnell wrote:
+static HRESULT convert_vertex_buffer(ID3DXMesh *mesh_dst, ID3DXMesh
*mesh_src)
...
- hr = mesh_dst->lpVtbl->LockVertexBuffer(mesh_dst, D3DLOCK_DISCARD,
(void**)&vb_dst); On paper, D3DLOCK_DISCARD is only valid on D3DUSAGE_DYNAMIC buffers. Wine and Windows <= WinVista don't enforce this. Win7 does, and it broke some apps that way.
Ok I see that in dlls/d3d9/tests/buffers.c. So it will fail to lock on Win7 if D3DLOCK_DISCARD is specified and the vertex buffer wasn't created with D3DUSAGE_DYNAMIC?
Am I correct in D3DLOCK_DISCARD makes a copy of the vertex buffer so that the video card can keep reading the old one, and the old vertex buffer memory is released after the video card releases its lock [1]?
In that case it does not make sense to use D3DLOCK_DISCARD at all in CloneMesh, as the video card has not begun to use the new vertex buffer, and it will just add memory overhead. The easiest fix would be to just pass 0 instead of D3DLOCK_DISCARD. I've done that in the updated patch.
I noticed a few more D3DLOCK_DISCARD locks in the mesh.c code.
I guess the other methods should check the mesh options for D3DXMESH_DYNAMIC, D3DXMESH_VB_DYNAMIC, and D3DXMESH_IB_DYNAMIC [2] before using D3DLOCK_DISCARD? I'll write patches for those.
[1] http://msdn.microsoft.com/en-us/library/bb147263(v=vs.85).aspx#Using_Dynamic... [2] http://msdn.microsoft.com/en-us/library/bb205370(v=vs.85).aspx
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Am 19.08.2011 um 14:04 schrieb Michael Mc Donnell:
Ok I see that in dlls/d3d9/tests/buffers.c. So it will fail to lock on Win7 if D3DLOCK_DISCARD is specified and the vertex buffer wasn't created with D3DUSAGE_DYNAMIC?
Yes, unless Microsoft changed this again.
Am I correct in D3DLOCK_DISCARD makes a copy of the vertex buffer so that the video card can keep reading the old one, and the old vertex buffer memory is released after the video card releases its lock [1]?
Sort of. It doesn't copy the old data, it just gives you a fresh, uninitialized block of memory. So if you read from the pointer you get from a DISCARD lock you'll get garbage.
In that case it does not make sense to use D3DLOCK_DISCARD at all in CloneMesh, as the video card has not begun to use the new vertex buffer, and it will just add memory overhead. The easiest fix would be to just pass 0 instead of D3DLOCK_DISCARD. I've done that in the updated patch.
Right, the buffer is newly created in CloneMesh. Just not using DISCARD is the best fix then.
I noticed a few more D3DLOCK_DISCARD locks in the mesh.c code.
I guess the other methods should check the mesh options for D3DXMESH_DYNAMIC, D3DXMESH_VB_DYNAMIC, and D3DXMESH_IB_DYNAMIC [2] before using D3DLOCK_DISCARD? I'll write patches for those.
Most likely yes. I haven't looked at the details. If the buffer is new in those situations as well just don't pass any flags, otherwise pass DISCARD if you lock a dynamic buffer and are going to rewrite the entire buffer contents.