This is what I have figured out and found out about how Thread Local Storage (that is, variables declared with __declspec(thread)) works in Win32 from the perspective of the exe file (as opposed to the perspective of the kernel).
I am posting this to the ReactOS list since ReactOS needs support for this in its kernel code. I am posting it to the WINE list since WINE should also support this from a kernel point of view, and should ideally support __declspec(thread) in WineGCC at some point, if that is possible. I am posting it to the MingW list since MingW needs __declspec(thread) support.
First, TLS in Visual C++ relies on:
1. the __declspec(thread) keyword and some special handling by the compiler when it accesses TLS variables
2. the .tls segment in the PE file
3. the IMAGE_TLS_DIRECTORY32 structure (pointed at by a field in the PE header and stored in the read-only data segment) and the things it points at, specifically the TlsStart, TlsEnd and TlsIndex variables (TlsIndex is stored in the read/write data segment)
4. the tlssup.obj file (which is inside the Visual C++ runtime library .lib files such as libc.lib and msvcrt.lib)
5. a linker feature that combines segments in a special way if they are named right: .tls is written first, then anything labeled .tls$, then .tls$zzz, and all of them are combined into one segment labeled .tls
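To make item 3 a bit more concrete, here is a rough sketch of the 32-bit TLS directory layout. The field names follow IMAGE_TLS_DIRECTORY32 as declared in winnt.h, but the struct name tls_directory32 and the helper tls_template_size are made up for this sketch, and fixed-width integers stand in for DWORD so it compiles anywhere. The notes on which variable each field points at are just my reading of the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of IMAGE_TLS_DIRECTORY32 (hypothetical name, portable types).
 * All address fields are 32-bit virtual addresses, not file offsets. */
typedef struct tls_directory32 {
    uint32_t StartAddressOfRawData; /* start of the TLS data (__tls_start) */
    uint32_t EndAddressOfRawData;   /* end of the TLS data (__tls_end) */
    uint32_t AddressOfIndex;        /* address of the index variable (__tls_index) */
    uint32_t AddressOfCallBacks;    /* address of the callback pointer list */
    uint32_t SizeOfZeroFill;
    uint32_t Characteristics;
} tls_directory32;

/* Six 32-bit fields, no padding expected. */
static_assert(sizeof(tls_directory32) == 24, "unexpected TLS directory size");

/* Hypothetical helper: size in bytes of the data between start and end. */
static uint32_t tls_template_size(const tls_directory32 *dir)
{
    assert(dir != NULL);
    return dir->EndAddressOfRawData - dir->StartAddressOfRawData;
}
```

Everything between the start and end addresses is what each thread ends up with its own copy of, which is why the start and end markers have to bracket all of the user's variables.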
When you declare a variable as __declspec(thread), the compiler accesses it in a special way. It takes the current value of the TlsIndex variable. Then it takes the value stored at offset 0x2C in the TEB, which is the pointer to the TLS data area for that thread. It then reads the pointer stored at TlsPointer + TlsIndex * 4, which gives the address of this module's block of thread-local data for the current thread. This is the address it uses when it reads and writes Thread Local Storage variables. The first variable seems to be stored at (this address) + 4, the next at + 8 and so on; at (this address) itself is the __tls_start variable, as explained below.
Also, when you declare a __declspec(thread) variable, it is put into the .tls$ segment of the obj file.
Most of the magic happens inside the linker. First, let's explain tlssup.obj. It contains an item called __tls_used that resides in the read-only data segment and becomes the TLS directory pointed at by the PE header. __tls_used points at __tls_start (in the .tls segment), __tls_end (in the .tls$zzz segment) and __tls_index (in the read/write data segment). There are also ___xl_a and ___xl_z; ___xl_a is pointed to by the thread callbacks pointer.
So, what we have in the resulting exe file is:
the __tls_used variable in the read-only data segment
the __tls_index variable in the read/write data segment
the .tls segment containing:
    __tls_start
    //all the user declared __declspec(thread) variables
    __tls_end
___xl_a, followed by pointers to the thread callback functions, followed by ___xl_z
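The access sequence described above can be modelled in portable C, which may help anyone trying to teach a compiler to emit it. This is only a sketch: model_thread, tls_block_base and tls_var_address are hypothetical names, the tls_array member stands in for the value found at TEB offset 2C, and the slot size here is whatever a pointer is on the host rather than the 4 bytes used in Win32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-thread state. In a real Win32 thread,
 * tls_array would be the value read from offset 0x2C of the TEB. */
typedef struct model_thread {
    void **tls_array; /* one slot per module that uses TLS */
} model_thread;

/* Load the slot selected by the module's index (the "TlsPointer +
 * TlsIndex * 4" step on Win32) to get that module's data block. */
static void *tls_block_base(const model_thread *thread, uint32_t tls_index)
{
    assert(thread != NULL && thread->tls_array != NULL);
    return thread->tls_array[tls_index];
}

/* Address of one thread-local variable: block base plus the variable's
 * offset from the start of the .tls segment. */
static void *tls_var_address(const model_thread *thread, uint32_t tls_index,
                             size_t offset)
{
    return (unsigned char *)tls_block_base(thread, tls_index) + offset;
}
```

Two threads with the same index and offset end up at different addresses because each thread has its own array and its own block, which is the whole point of the scheme.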
So, basically, to implement TLS in compilers, we need to:
1. make __declspec(thread) point at the thread local storage and get the compiler to access these variables correctly
2. make the variables all end up in the correct place in the exe file, with the TLS directory pointing at the start and end of the list
3. write the needed support code for the RTL (as needed)
4. make the callbacks (if needed) get generated and output to the exe file properly
5. make sure that the PE header points to the TLS directory
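Point 2 comes down to getting the segment ordering right, so that the start marker lands before the user's variables and the end marker lands after them. As a toy model only, the ordering described earlier (.tls, then .tls$, then .tls$zzz) is what you get from a plain bytewise sort of the segment names. The function names below are hypothetical, and using strcmp is my assumption for the sketch, not a claim about how any real linker is written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: order segment names bytewise, so that ".tls" sorts
 * before ".tls$", which sorts before ".tls$zzz". */
static int compare_section_names(const void *a, const void *b)
{
    const char *const *lhs = (const char *const *)a;
    const char *const *rhs = (const char *const *)b;
    return strcmp(*lhs, *rhs);
}

/* Hypothetical helper: put the TLS contributions from all obj files into
 * the order in which they would be merged into the final .tls segment. */
static void sort_tls_contributions(const char **names, size_t count)
{
    if (count == 0) {
        return;
    }
    assert(names != NULL);
    qsort(names, count, sizeof names[0], compare_section_names);
}
```

With that ordering, a runtime object only has to put its start marker in .tls and its end marker in .tls$zzz, and everything the compiler emits into .tls$ falls in between.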
If anyone has anything to add or wants more details, do reply :)
As for kernel-mode support in ReactOS or whatever, that seems easy enough to implement, anyone wanna have a go?