We deliberately don't do that: freeing the TEB before the thread has finished causes hard-to-debug crashes.
There should only ever be one extra allocated TEB, so that should be acceptable. What problem are you seeing exactly?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2814#note_33746
From what I can tell, the recent work on SampleBias/SampleLevel did all of the work for us; we just need to take the same framework and add a case for the `sample_l` instruction.
Tested with some shaders from the native Linux version of Little Racers STREET.
--
v14: vkd3d-shader/tpf: Add support for emitting sample_l instructions
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/188