On Mon Oct 13 21:04:16 2025 +0000, Gerald Pfeifer wrote:
> I'm afraid something's not fully done yet.
> On a FreeBSD system with hwloc installed in a non-default PREFIX
> configure runs as follows:
> checking for hwloc.h... yes
> checking for hwloc_topology_init in -lhwloc... yes
> For the build then to fail as follows:
> dlls/ntdll/unix/system.c: In function ‘traverse_hwloc_topology’:
> dlls/ntdll/unix/system.c:1406:10: error: ‘HWLOC_OBJ_L1CACHE’ undeclared
> (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
> 1406 | case HWLOC_OBJ_L1CACHE:
> | ^~~~~~~~~~~~~~~~~
> | HWLOC_OBJ_CACHE
> dlls/ntdll/unix/system.c:1406:10: note: each undeclared identifier is
> reported only once for each function it appears in
> dlls/ntdll/unix/system.c:1407:10: error: ‘HWLOC_OBJ_L1ICACHE’ undeclared
> (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
> 1407 | case HWLOC_OBJ_L1ICACHE:
> | ^~~~~~~~~~~~~~~~~~
> | HWLOC_OBJ_CACHE
> dlls/ntdll/unix/system.c:1410:10: error: ‘HWLOC_OBJ_L2CACHE’ undeclared
> (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
> Looking into all of the include files installed, HWLOC_OBJ_L does not
> show up anywhere.
> Now https://www.open-mpi.org/projects/hwloc/doc/v2.3.0/a00360.php has
> the following:
> HWLOC_OBJ_CACHE replaced
> Instead of a single HWLOC_OBJ_CACHE, there are now 8 types
> HWLOC_OBJ_L1CACHE, ..., HWLOC_OBJ_L5CACHE,
> HWLOC_OBJ_L1ICACHE, ..., HWLOC_OBJ_L3ICACHE.
> So it looks as if:
> (1) configure needs to be tightened?
> (2) I should be using FreeBSD's devel/hwloc2 instead of devel/hwloc.
> <oops> Still, configure should catch this.
Yes, `devel/hwloc2` is what you need. The original seems mostly obsolete at this point, but it would be good to have configure check for v2.
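As a rough illustration of what a stricter check could key on: `hwloc.h` defines `HWLOC_API_VERSION`, which is `0x00020000` for the 2.0 API, where the per-level cache types replaced `HWLOC_OBJ_CACHE`. The sketch below is only an assumption about how such a gate might look, not what configure does today; the macro and helper names are hypothetical.

```c
/* Hypothetical helper: decide whether an hwloc API version provides the
 * per-level cache object types (HWLOC_OBJ_L1CACHE and friends).
 * In a real configure test the value would come from HWLOC_API_VERSION
 * in <hwloc.h>, for example:
 *
 *     #include <hwloc.h>
 *     #if HWLOC_API_VERSION < 0x00020000
 *     #error hwloc 2.x is required
 *     #endif
 */
#define HWLOC_V2_API_VERSION 0x00020000u  /* assumed name; value of the 2.0 API */

static int hwloc_api_has_level_cache_types(unsigned int api_version)
{
    /* Anything older than 2.0 only has the single HWLOC_OBJ_CACHE type. */
    return api_version >= HWLOC_V2_API_VERSION;
}
```

With a check along these lines, an hwloc 1.11 install (API version `0x00010b00`) would be rejected at configure time instead of failing during the build.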
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/7339#note_118433
I'm afraid something's not fully done yet.
On a FreeBSD system with hwloc installed in a non-default PREFIX configure runs as follows:
checking for hwloc.h... yes
checking for hwloc_topology_init in -lhwloc... yes
For the build then to fail as follows:
dlls/ntdll/unix/system.c: In function ‘traverse_hwloc_topology’:
dlls/ntdll/unix/system.c:1406:10: error: ‘HWLOC_OBJ_L1CACHE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
1406 | case HWLOC_OBJ_L1CACHE:
| ^~~~~~~~~~~~~~~~~
| HWLOC_OBJ_CACHE
dlls/ntdll/unix/system.c:1406:10: note: each undeclared identifier is reported only once for each function it appears in
dlls/ntdll/unix/system.c:1407:10: error: ‘HWLOC_OBJ_L1ICACHE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
1407 | case HWLOC_OBJ_L1ICACHE:
| ^~~~~~~~~~~~~~~~~~
| HWLOC_OBJ_CACHE
dlls/ntdll/unix/system.c:1410:10: error: ‘HWLOC_OBJ_L2CACHE’ undeclared (first use in this function); did you mean ‘HWLOC_OBJ_CACHE’?
Looking into all of the include files installed, HWLOC_OBJ_L does not show up anywhere.
Now https://www.open-mpi.org/projects/hwloc/doc/v2.3.0/a00360.php has the following:
HWLOC_OBJ_CACHE replaced
Instead of a single HWLOC_OBJ_CACHE, there are now 8 types HWLOC_OBJ_L1CACHE, ..., HWLOC_OBJ_L5CACHE,
HWLOC_OBJ_L1ICACHE, ..., HWLOC_OBJ_L3ICACHE.
So it looks as if:
(1) configure needs to be tightened?
(2) And is FreeBSD's devel/hwloc really sufficient? Or is there something newer you used for development and testing?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/7339#note_118432
--
v2: mshtml: Use the common object implementation for HTMLAttributeCollection
mshtml: Use the common object implementation for HTMLStyleSheetsCollection
mshtml: Use the common object implementation for HTMLRectCollection
mshtml: Use the common object implementation for HTMLSelectElement enumerator.
mshtml: Use the common object implementation for HTMLFormElement enumerator.
mshtml: Use the common object implementation for HTMLDOMChildrenCollection
mshtml: Use a common object implementation for HTMLElementCollection
https://gitlab.winehq.org/wine/wine/-/merge_requests/9136
This uses the Mach COW mechanism to implement writewatch functionality.
Below is the same micro-benchmark @gofman used in his [UFFD MR](https://gitlab.winehq.org/wine/wine/-/merge_requests/7871).
```
Parameters:
- number of concurrent threads;
- number of pages;
- delay between reading / resetting write watches (ms);
- random (1) or sequential (0) page write access;
- reset with WRITE_WATCH_FLAG_RESET in GetWriteWatch (1) or in a separate ResetWriteWatch call (0).
Result is in the form of <average write to page time, ns> / <average GetWriteWatch() time, µs>
Parameters       Windows       Mach COW        Fallback
6 1080 3 1 1     897 / 80      371 / 12634     66202 / 186
6 1080 3 1 0     855 / 87      369 / 12637     66766 / 187
8 8192 3 1 1     6526 / 268    627 / 113263    111053 / 485
8 8192 3 1 0     1197 / 509    623 / 113810    122921 / 489
8 8192 1 1 1     1227 / 412    636 / 118930    150628 / 388
8 8192 1 1 0     5721 / 144    631 / 120538    146392 / 384
8 64 1 1 1       572 / 7       490 / 1078      1000 / 89
8 64 1 1 0       530 / 13      500 / 1075      1167 / 77
```
All measurements were taken on the same M2 Max machine: the Windows column is Windows 11 on ARM in a VM running the x64 binary under emulation, and the other two columns are Wine running through Rosetta with and without this MR.
Unlike UFFD, which is always better than the fallback and comparable to Windows performance, here a good average write-to-page time is traded for a bad average `GetWriteWatch()` time (in roughly equal ratios).
However, in real-world applications (like the FFXIV + Dalamud mod framework/loader use case) the startup time from a cold start is reduced from about 25.5s to 23.6s with this change. That includes loading a modern dotnet 9 runtime into the game process and initializing a complex mod collection, with fairly high GC pressure.
This is probably because the `GetWriteWatch()` calls the GC makes mostly happen concurrently, whereas in Wine's fallback implementation running threads are interrupted and often wait on the global virtual lock while the segfault is handled, and parallel accesses to write-watched memory and other VM operations are blocked.
Another advantage is that `VPROT_WRITEWATCH` can then be used for other purposes in the future. Also, Rosetta is sometimes a bit finicky about reported protections with the current implementation, but it has always behaved as expected so far in my testing with the new one.
On native ARM64 the `VM_PROT_COPY`/`SM_COW` mechanism also works as expected with native 16k pages (not that this matters much at the moment).
`GetWriteWatch()` with the reset flag also does not need to be transactional (unlike UFFD), since only marked pages are reset here and not the entire range.
--
v2: ntdll: Use Mach COW for write watches support on macOS.
https://gitlab.winehq.org/wine/wine/-/merge_requests/9090
On Mon Oct 13 09:20:43 2025 +0000, Hans Leidekker wrote:
> If you make this an array of struct metadata_stream and add the name to
> struct metadata_stream you don't need the streams array in assembly_parse_headers().
But then wouldn't we need to always iterate through the array with `strcmp` to get a particular stream (for instance, in `assembly_get_heap_size`, `assembly_get_{string, blob, guid}`)? The spec doesn't say that streams need to appear in a certain order, so we wouldn't be able to use fixed indices to this array.
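To make the concern concrete, here is a minimal sketch of what a by-name lookup might look like if the name lived in `struct metadata_stream`. The field layout and the `find_stream` helper are hypothetical and not taken from the MR; only the stream names (`#Strings`, `#Blob`, `#GUID`) come from the metadata format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the real struct in the MR may differ. */
struct metadata_stream
{
    const char  *name;    /* e.g. "#Strings", "#Blob", "#GUID" */
    unsigned int offset;  /* offset of the stream in the metadata */
    unsigned int size;    /* size of the stream in bytes */
};

/* Linear search by name, needed because streams may appear in any order. */
static const struct metadata_stream *find_stream(const struct metadata_stream *streams,
                                                 size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (!strcmp(streams[i].name, name)) return &streams[i];
    return NULL;
}
```

Every accessor would then pay for a `find_stream` call, whereas resolving the names once at parse time lets the accessors use a fixed slot per stream.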
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/9147#note_118375