Hello,
I would like to implement part of the logic within dcomp.dll in order to properly support applications that depend on it.
I found that Zhiyi Zhang has implemented some data structures and interface stubs:
https://gitlab.winehq.org/zhiyi/wine/-/commits/directcomposition?ref_type=he...
Among these, the only method with actual logic appears to be IDCompositionDevice::Commit.
However, after reviewing the implementation, I suspect there may be an issue with the current approach.
It seems that Zhiyi Zhang starts a separate thread for each IDCompositionDevice, and this thread runs at a fixed refresh rate.
But this approach appears to introduce a problem: from the window's (HWND) perspective, there are now *two threads* concurrently updating its device context.
This results in a frame sequence similar to the following (note: the frame intervals are illustrative only):
gdi_frame | dcomp_frame | gdi_frame | gdi_frame | dcomp_frame
In this model, GDI-generated frames and DComp-generated frames are interleaved.
Is my understanding correct?
If so, I believe this method may lead to *inaccurate rendering results*, although I am not entirely certain about the exact consequences (as I am not an expert in this area).
Therefore, I have decided to pursue an approach that produces a *standardized frame stream* like the following:
gdi_mix_dcomp_frame | gdi_mix_dcomp_frame | gdi_mix_dcomp_frame
In other words, I plan to merge GDI and DComp content before presenting each frame, ensuring consistency in the final output.
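To make the idea concrete, here is a minimal sketch of a single present path. Everything in it is hypothetical (the `window_frame_state` struct, the function names, the tiny 4-pixel frame); it is not Wine code. The merge rule is deliberately trivial (a non-zero DComp pixel wins over the GDI pixel) and locking is left out, so a real version would need a per-window lock around these calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_PIXELS 4  /* tiny frame so the example stays readable */

/* Hypothetical per-window state: each producer owns one layer. */
struct window_frame_state
{
    uint32_t gdi_layer[FRAME_PIXELS];    /* latest GDI content */
    uint32_t dcomp_layer[FRAME_PIXELS];  /* latest DComp content, 0 = nothing drawn */
    uint32_t presented[FRAME_PIXELS];    /* what was last sent to the screen */
    unsigned int present_count;
};

/* The only function that writes `presented`. Real code would hold a
 * per-window lock here so the GDI and DComp threads are serialized. */
void present_merged_frame(struct window_frame_state *state)
{
    int i;

    assert(state);
    for (i = 0; i < FRAME_PIXELS; i++)
        state->presented[i] = state->dcomp_layer[i] ? state->dcomp_layer[i]
                                                    : state->gdi_layer[i];
    state->present_count++;
}

/* GDI flush path: update the GDI layer, then present a merged frame. */
void gdi_update(struct window_frame_state *state, const uint32_t *pixels)
{
    assert(state && pixels);
    memcpy(state->gdi_layer, pixels, sizeof(state->gdi_layer));
    present_merged_frame(state);
}

/* DComp commit path: same pattern, different layer. */
void dcomp_update(struct window_frame_state *state, const uint32_t *pixels)
{
    assert(state && pixels);
    memcpy(state->dcomp_layer, pixels, sizeof(state->dcomp_layer));
    present_merged_frame(state);
}
```

With this shape, a GDI repaint that arrives after a DComp commit still presents the DComp content, because every presented frame is rebuilt from both layers.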
I’ve reviewed the normal GDI frame refresh process, and it seems to follow this path:
--> flush_window_surfaces --> window_surface_flush --> x11drv_surface_flush
I’m planning to insert the following logic inside x11drv_surface_flush:
    if (is_hwnd_bind_to_dcomp(hwnd)) {
        struct Surface *final = get_comp_surface(hwnd);
        blend_dcomp_surface_over_gdi(dest_surface, final);
    }
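As a rough sketch of what the blend step itself might do, assuming both surfaces live in system memory, have the same size, and use 32-bit 0xAARRGGBB pixels with premultiplied alpha. The `struct surface` layout below is hypothetical and is not Wine's `window_surface`; only the function name is taken from the pseudocode above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side surface: 0xAARRGGBB pixels, premultiplied alpha. */
struct surface
{
    int width, height;
    uint32_t *pixels;   /* width * height entries, row-major */
};

/* Blend one premultiplied source channel over one destination channel. */
uint8_t over_channel(uint8_t src, uint8_t dst, uint8_t src_alpha)
{
    /* dst * (255 - alpha) / 255, rounded to nearest */
    unsigned int scaled = (dst * (255u - src_alpha) + 127u) / 255u;
    unsigned int sum = src + scaled;

    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* "Source over" composite: dest = src + dest * (1 - src_alpha). */
void blend_dcomp_surface_over_gdi(struct surface *dest, const struct surface *src)
{
    size_t i, count;

    assert(dest && src);
    assert(dest->width == src->width && dest->height == src->height);

    count = (size_t)dest->width * (size_t)dest->height;
    for (i = 0; i < count; i++)
    {
        uint32_t s = src->pixels[i], d = dest->pixels[i];
        uint8_t alpha = (uint8_t)(s >> 24);
        uint32_t out = 0;
        int shift;

        /* Apply the same formula to B, G, R and A. */
        for (shift = 0; shift < 32; shift += 8)
            out |= (uint32_t)over_channel((uint8_t)(s >> shift), (uint8_t)(d >> shift), alpha) << shift;
        dest->pixels[i] = out;
    }
}
```

An opaque DComp pixel replaces the GDI pixel, and a fully transparent one leaves it untouched, so windows without DComp content would look the same as before.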
However, there’s still a question regarding *how to access the DComp data* inside x11drv_surface_flush.
One option is to do the composition inside IDCompositionDevice::Commit() like this:
    void IDCompositionDevice_Commit() {
        composite_proc();
        ...
        ...
        push_frame_to_x11drv();
    }
Another option is to *directly call into DComp APIs from within x11drv* to retrieve the surface data on demand.
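Both options boil down to a hand-off point keyed by window. Here is a minimal sketch of such a slot table, assuming it would live in some layer that both sides can reach. All names (`window_id`, `comp_frame`, `publish_comp_frame`, `get_comp_frame`) are made up for illustration, and locking and pixel ownership are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t window_id;    /* stand-in for HWND */

struct comp_frame
{
    uint64_t sequence;          /* increases with every publish */
    const uint32_t *pixels;     /* owned by the publisher in this sketch */
    int width, height;
};

#define MAX_BOUND_WINDOWS 8

static struct { window_id hwnd; struct comp_frame frame; int used; } slots[MAX_BOUND_WINDOWS];

/* Producer side (what Commit would call): replace the latest frame for hwnd.
 * Returns 0 on success, -1 if the table is full. */
int publish_comp_frame(window_id hwnd, const uint32_t *pixels, int width, int height)
{
    int i, free_slot = -1;

    assert(pixels && width > 0 && height > 0);
    for (i = 0; i < MAX_BOUND_WINDOWS; i++)
    {
        if (slots[i].used && slots[i].hwnd == hwnd) break;
        if (!slots[i].used && free_slot < 0) free_slot = i;
    }
    if (i == MAX_BOUND_WINDOWS)
    {
        /* First publish for this window: claim a free slot. */
        if (free_slot < 0) return -1;
        i = free_slot;
        slots[i].used = 1;
        slots[i].hwnd = hwnd;
        slots[i].frame.sequence = 0;
    }
    slots[i].frame.pixels = pixels;
    slots[i].frame.width = width;
    slots[i].frame.height = height;
    slots[i].frame.sequence++;
    return 0;
}

/* Consumer side (what the flush path would call): returns NULL when the
 * window has no DComp content, so the plain GDI path stays unchanged. */
const struct comp_frame *get_comp_frame(window_id hwnd)
{
    int i;

    for (i = 0; i < MAX_BOUND_WINDOWS; i++)
        if (slots[i].used && slots[i].hwnd == hwnd) return &slots[i].frame;
    return NULL;
}
```

The sequence number lets the consumer tell whether a new frame arrived since the last flush without either side calling into the other directly.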
However, *both approaches feel somewhat unnatural*, and from what I can tell, Windows seems to establish some kind of *internal channel* to pass the composed content to the compositor.
Is there a better or more canonical way to do this?
Any suggestions or guidance would be greatly appreciated.
Sorry, I forgot to add the title:
How to Implement the Logic for Mixing DirectComposition with GDI?
Hi!
This will likely require some work within win32u, to change the way GDI is being drawn and presented and to implement the compositing engine there, where the code can access the internal surfaces.
There are plenty of NtDComposition exports from win32u, which will need to be tested, to understand how they are supposed to be combined. Then they would be used from dcomp.dll, which is probably a higher level API.
Then fwiw, although I'm not looking at dcomp specifically, I am currently making some large changes in win32u to facilitate all this, especially with VK/GL/GDI interop.
I believe the code is currently far from being ready to implement any kind of compositing, and we will need more changes to be able to manipulate GPU surfaces[^1] directly.
More specifically we currently lack any way to export these surfaces from their individual APIs and processes, and import them into a different API or process to composite together.
Incidentally, this is also very much related to D3D shared resources, which is the Windows side representation of these GPU surfaces, and which I expect dcomp.dll can make use of too and would thus require.
So overall, and sorry to say it, but I think it's a bit too soon to have a try at an actual implementation (or rather, there will be a lot of conflicting changes going on upstream). That said, writing tests for dcomp.dll and for these NtDComposition functions, to figure out how they interact, would be interesting nevertheless.
Cheers,
[^1]: dmabuf on Linux or IOSurface on macOS
On Tue, Jun 3, 2025 at 5:27 PM Rémi Bernon rbernon@codeweavers.com wrote:
Hi!
This will likely require some work within win32u, to change the way GDI is being drawn and presented and to implement the compositing engine there, where the code can access the internal surfaces.
There are plenty of NtDComposition exports from win32u, which will need to be tested, to understand how they are supposed to be combined. Then they would be used from dcomp.dll, which is probably a higher level API.
Thank you very much! I never realized there were so many DirectComposition-related interfaces in win32u. I’ll need to analyze them further — this will probably require a lot of reverse engineering and testing.
Then fwiw, although I'm not looking at dcomp specifically, I am currently making some large changes in win32u to facilitate all this, especially with VK/GL/GDI interop.
I believe the code is currently far from being ready to implement any kind of compositing, and we will need more changes to be able to manipulate GPU surfaces[^1] directly.
*For the first version, I plan to focus on making sure the interfaces are usable and don’t crash, even if there are visual imperfections — that’s acceptable for now.*
*However, I want the overall architecture to be as complete as possible, even though that might not be entirely realistic, since Wine’s current design doesn’t account for multi-process composition.*
*For anything unnecessary, I’ll just keep it as a stub and only implement the most essential parts.*
More specifically we currently lack any way to export these surfaces from their individual APIs and processes, and import them into a different API or process to composite together.
*I plan to ignore multi-process window composition for now and focus only on handling everything within a single process.*
*Additionally, to simplify things, I’ll probably minimize GPU-related handling as much as possible.*
Incidentally, this is also very much related to D3D shared resources, which is the Windows side representation of these GPU surfaces, and which I expect dcomp.dll can make use of too and would thus require.
So overall, and sorry to say it, but I think it's a bit too soon to have a try at an actual implementation (or rather, there will be a lot of conflicting changes going on upstream). That said, writing tests for dcomp.dll and for these NtDComposition functions, to figure out how they interact, would be interesting nevertheless.
I really appreciate your perspective — thank you for pointing out these potential risks.
Cheers,
[^1]: dmabuf on Linux or IOSurface on macOS
Rémi Bernon rbernon@codeweavers.com
On 6/3/25 17:02, zhengxianwei wrote:
Hello,
I would like to implement part of the logic within |dcomp.dll| in order to properly support applications that depend on it.
I found that Zhiyi Zhang has implemented some data structures and interface stubs:
https://gitlab.winehq.org/zhiyi/wine/-/commits/directcomposition?ref_type=heads
Among these, the only method with actual logic appears to be |IDCompositionDevice::Commit|.
However, after reviewing the implementation, I suspect there may be an issue with the current approach.
It seems that Zhiyi Zhang starts a separate thread for each |IDCompositionDevice|, and this thread runs at a fixed refresh rate.
But this approach appears to introduce a problem: from the window's (|HWND|) perspective, there are now *two threads* concurrently updating its device context.
This results in a frame sequence similar to the following (note: the frame intervals are illustrative only):
gdi_frame | dcomp_frame | gdi_frame | gdi_frame | dcomp_frame
In this model, GDI-generated frames and DComp-generated frames are interleaved.
Is my understanding correct?
Hi,
This is indeed possible. However, this only happens when the window is using both GDI and d2d/dcomp to render the window. Normally, it's either GDI or d2d and then composited by dcomp so usually this doesn't happen.
If so, I believe this method may lead to *inaccurate rendering results*, although I am not entirely certain about the exact consequences (as I am not an expert in this area).
Therefore, I have decided to pursue an approach that produces a *standardized frame stream* like the following:
gdi_mix_dcomp_frame | gdi_mix_dcomp_frame | gdi_mix_dcomp_frame
In other words, I plan to merge GDI and DComp content before presenting each frame, ensuring consistency in the final output.
I’ve reviewed the normal GDI frame refresh process, and it seems to follow this path:
--> flush_window_surfaces --> window_surface_flush --> x11drv_surface_flush
I’m planning to insert the following logic inside |x11drv_surface_flush|:
    if (is_hwnd_bind_to_dcomp(hwnd)) {
        struct Surface *final = get_comp_surface(hwnd);
        blend_dcomp_surface_over_gdi(dest_surface, final);
    }
However, there’s still a question regarding *how to access the DComp data* inside |x11drv_surface_flush|.
One option is to do the composition inside |IDCompositionDevice::Commit()| like this:
    void IDCompositionDevice_Commit() {
        composite_proc();
        ...
        ...
        push_frame_to_x11drv();
    }
Another option is to *directly call into DComp APIs from within |x11drv|* to retrieve the surface data on demand.
However, *both approaches feel somewhat unnatural*, and from what I can tell, Windows seems to establish some kind of *internal channel* to pass the composed content to the compositor.
Is there a better or more canonical way to do this?
My dcomp branch is from two years ago and very experimental. The original goal was just to get rendering with the dcomp API working. On Windows, dcomp probably has access to all window content and presents the composited window content during vblank. Because Wine doesn't have a compositor at the moment and doesn't support vblank, I used an approach that sort of works but is by no means correct. For your problem, I guess just do whatever works for you, because the current dcomp architecture is not the same as on Windows.
Thanks, Zhiyi
Any suggestions or guidance would be greatly appreciated.
On Tue, Jun 3, 2025 at 5:33 PM Zhiyi Zhang zzhang@codeweavers.com wrote:
Hi,
This is indeed possible. However, this only happens when the window is using both GDI and d2d/dcomp to render the window. Normally, it's either GDI or d2d and then composited by dcomp so usually this doesn't happen.
In a normal Windows system, GDI output is processed by DirectComposition (DComp).
However, in Wine—at least based on your code—I feel you haven’t enforced the serial relationship between GDI behavior and DComp behavior.
You simply started a thread composite_proc, which, at fixed intervals, pulls a surface from the visual’s swapchain and copies it to the HWND's DC.
But if the HWND itself performs a repaint, such as in response to a WM_PAINT message, that GDI repaint behavior bypasses your logic entirely and flushes directly to the screen.
Here's a diagram illustrating what I believe the correct behavior should be under Windows:
```
gdi
    \
     dcomp -> dwm.exe
    /
d2d
```
But in your current implementation, it behaves more like this:
```
gdi ------------------> dwm.exe  (flush_window_surface)
    ↘                 ↗ (bitblt)
      dcomp  (composite_thread_proc)
    ↗
d2d
```
It’s just that GDI’s own repaint behavior seems to be rarely triggered, so your DComp logic appears to work mostly fine in simpler cases.
That’s just my understanding; I haven’t done any thorough debugging of your code.