On 19 August 2013 10:52, Stefan Dösinger stefan@codeweavers.com wrote:
+struct wined3d_heap_memory *wined3d_alloc_resource_mem(UINT size) +{
- struct wined3d_heap_memory *ret;
- ret = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, sizeof(*ret));
- if (!ret)
return NULL;
- ret->alloc_ptr = HeapAlloc(GetProcessHeap(), 0, size + RESOURCE_ALIGNMENT);
- if (!ret->alloc_ptr)
- {
HeapFree(GetProcessHeap(), 0, ret);
return NULL;
- }
- ret->ptr = (BYTE *)(((ULONG_PTR)ret->alloc_ptr
+ (RESOURCE_ALIGNMENT - 1)) & ~(RESOURCE_ALIGNMENT - 1));
- return ret;
+}
I don't think this is a good idea. How about something like the following? The corresponding free function would be trivial.
void *wined3d_memalign(SIZE_T alignment, SIZE_T size) { SIZE_T align; void *mem; void **p;
align = max(sizeof(*p), alignment - 1); if (!(mem = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size + align))) return NULL;
p = (void **)(((ULONG_PTR)mem + align) & ~(alignment - 1)) - 1; *p = mem;
return ++p; }
(Untested, there's probably some room for improvement.)
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Am 2013-08-19 12:29, schrieb Henri Verbeet:
I don't think this is a good idea. How about something like the following? The corresponding free function would be trivial.
I guess this works for the purposes of this patchset. However, in my command stream patches I am using a structure similar to this to cache allocated blocks of memory to handle WINED3D_MAP_DISCARD buffer maps. In that case the structure had additional size information and a struct list entry.
I guess in theory any kind of payload could be added to your implementation, but growing this larger than RESOURCE_ALIGNMENT(=16) bytes moots the point I guess. I'm not yet sure though if my cache is an actual improvement over just freeing and allocating new blocks of memory.
The other unknown is that some ddraw games write past the allocated surface memory. The extra 16 bytes took care of this, and before we aligned the memory we just allocated 4 extra bytes. I don't know which games are affected by this (it was from a comment in old ddraw), and I don't know if the problem only happened if the memory block wasn't properly aligned. Your code and mine differ when HeapAlloc happens to return an aligned pointer.
On 19 August 2013 13:38, Stefan Dösinger stefan@codeweavers.com wrote:
I guess in theory any kind of payload could be added to your implementation, but growing this larger than RESOURCE_ALIGNMENT(=16) bytes moots the point I guess.
The point is mostly that I don't think we want the wined3d_heap_memory structure to need a separate allocation. I suppose you could also just embed it inside the wined3d_resource structure.
properly aligned. Your code and mine differ when HeapAlloc happens to return an aligned pointer.
That's actually a bug in what I posted.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Am 2013-08-19 13:53, schrieb Henri Verbeet:
On 19 August 2013 13:38, Stefan Dösinger stefan@codeweavers.com wrote:
I guess in theory any kind of payload could be added to your implementation, but growing this larger than RESOURCE_ALIGNMENT(=16) bytes moots the point I guess.
The point is mostly that I don't think we want the wined3d_heap_memory structure to need a separate allocation. I suppose you could also just embed it inside the wined3d_resource structure.
Storing the pointer inside the wined3d_resource structure won't work for the command stream, but I'll ignore those considerations until I send actual CS patches and know for sure that my cache is an actual improvement.
properly aligned. Your code and mine differ when HeapAlloc happens to return an aligned pointer.
That's actually a bug in what I posted.
What's your suggested fix? Allocating size + align + alignment, and adding alignment to mem if mem happens to be already aligned?
On 19 August 2013 14:20, Stefan Dösinger stefandoesinger@gmail.com wrote:
Am 2013-08-19 13:53, schrieb Henri Verbeet:
That's actually a bug in what I posted.
What's your suggested fix? Allocating size + align + alignment, and adding alignment to mem if mem happens to be already aligned?
I think it needs to be "align = max(sizeof(*p), alignment);", but that's actually for the aligned + 1 case.
I think we can always add some extra padding in the caller if it turns out to be needed, but I'm tempted to not worry too much about it, since it could very well be that the original comment was actually about alignment, and a regression test should give pretty obvious results if this breaks something.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Am 2013-08-19 14:34, schrieb Henri Verbeet:
On 19 August 2013 14:20, Stefan Dösinger stefandoesinger@gmail.com wrote:
Am 2013-08-19 13:53, schrieb Henri Verbeet:
That's actually a bug in what I posted.
What's your suggested fix? Allocating size + align + alignment, and adding alignment to mem if mem happens to be already aligned?
I think it needs to be "align = max(sizeof(*p), alignment);", but that's actually for the aligned + 1 case.
With some inspiration from _aligned_offset_malloc in msvcrt I came up with this:
void *wined3d_alloc_resource_mem(SIZE_T size) { void **p; SIZE_T align = RESOURCE_ALIGNMENT - 1 + sizeof(*p); void *mem;
if (!(mem = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size + align))) return NULL;
p = (void **)(((ULONG_PTR)mem + align) & ~(RESOURCE_ALIGNMENT - 1)) - 1; *p = mem;
return ++p; }
void wined3d_free_resource_mem(void *mem) { void **p = mem;
if (!mem) return;
HeapFree(GetProcessHeap(), 0, *(--p)); }
I'm not entirely sure about the SIZE_T align = RESOURCE_ALIGNMENT - 1 + sizeof(*p); line though. Msvcrt doesn't have the - 1, and allocating size + RESOURCE_ALIGNMENT bytes might be enough. msvcrt allocates the equivalent of size + RESOURCE_ALIGNMENT + sizeof(void *).
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Am 2013-08-19 17:08, schrieb Stefan Dösinger:
I'm not entirely sure about the SIZE_T align = RESOURCE_ALIGNMENT - 1 + sizeof(*p); line though. Msvcrt doesn't have the - 1, and allocating size + RESOURCE_ALIGNMENT bytes might be enough. msvcrt allocates the equivalent of size + RESOURCE_ALIGNMENT + sizeof(void *).
After going through all the possibilities I concluded that size + alignment + sizeof(void *) - 1 is the smallest allocation size. The worst case is a pointer of the form n * alignment - sizeof(void *) - 1. In this case we need sizeof(void *) - 1 bytes to get to the next aligned address and then alignment bytes to store our real alloc pointer.
Not sure why msvcrt allocates one extra byte. Probably because allocations below 4 or 8 bytes don't matter anyway.