Thread-Local Storage (TLS) is one of those Windows concepts that sounds abstract until you hit a problem that’s hard to solve cleanly any other way.
In this post, I’ll explain what TLS is, where you’ve already been using it (whether you noticed or not), and then I’ll show a practical use case: avoiding a nasty recursion problem when hooking an allocation API and trying to log allocations.
What Thread-Local Storage Is
TLS is about storing information per thread, while keeping access uniform.
That means I can write code that says “get my TLS value,” and it will always return the value for the current thread. Another thread running the same code will read its own value, from its own storage, using the exact same access pattern.
The whole point is that each thread gets its own data, and threads don’t stomp on each other.
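Here’s a minimal sketch of that idea in portable C++, using the language’s thread_local keyword rather than the Windows TLS API. The names t_counter, bump_my_counter, and bump_on_new_thread are hypothetical, made up for this illustration.

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread counter: every thread gets its own copy, starting at 0.
thread_local int t_counter = 0;

// Same code for every thread: "get my value", bump it, return it.
int bump_my_counter() {
    return ++t_counter;
}

// Calls bump_my_counter() `times` times on a fresh thread and returns the
// last value that thread observed.
int bump_on_new_thread(int times) {
    int last = 0;
    std::thread worker([&] {
        for (int i = 0; i < times; ++i) {
            last = bump_my_counter();
        }
    });
    worker.join();
    return last;
}
```

Calling bump_my_counter() on the main thread, then bump_on_new_thread(5), then bump_my_counter() again yields 1, 5, and 2: the worker’s increments never touch the main thread’s copy.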
On Windows, you can think of TLS as an array of “slots” per thread:
- Every thread has an array of slots.
- An application can allocate a slot index (process-wide), and then every thread has a value for that slot (thread-specific).
- The values are pointer-sized (8 bytes on x64). If you need more data, store a pointer to a structure you allocate elsewhere.
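Windows guarantees at least 64 TLS slots per process (the TLS_MINIMUM_AVAILABLE constant), and in practice you can go higher: the documented maximum is 1,088 (the original 64 plus 1,024 expansion slots).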
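The slot model above can be sketched roughly like this. TlsAlloc, TlsSetValue, and TlsGetValue are the real Windows calls; everything else (ThreadInfo, g_slot, set_info, get_info, slots_are_per_thread) is a hypothetical name for this illustration. Error handling is left out (TlsAlloc can fail and return TLS_OUT_OF_INDEXES), and the non-Windows branch uses pthread keys only so the sketch also compiles off Windows.

```cpp
#include <cassert>
#include <thread>

#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>
#endif

// Hypothetical per-thread record. It is bigger than one pointer-sized slot,
// so the slot stores a pointer to it.
struct ThreadInfo {
    int id;
    long allocations;
};

#ifdef _WIN32
// One slot index for the whole process (never freed in this sketch).
static DWORD g_slot = TlsAlloc();
static void set_info(ThreadInfo* p) { TlsSetValue(g_slot, p); }
static ThreadInfo* get_info() {
    return static_cast<ThreadInfo*>(TlsGetValue(g_slot));
}
#else
// Assumption: pthread keys stand in for TLS slots outside Windows.
static pthread_key_t make_key() {
    pthread_key_t key;
    pthread_key_create(&key, nullptr);
    return key;
}
static pthread_key_t g_slot = make_key();
static void set_info(ThreadInfo* p) { pthread_setspecific(g_slot, p); }
static ThreadInfo* get_info() {
    return static_cast<ThreadInfo*>(pthread_getspecific(g_slot));
}
#endif

// Returns true if a fresh thread sees an empty slot, can store its own
// pointer under the same index, and leaves the caller's value untouched.
bool slots_are_per_thread() {
    ThreadInfo mine{1, 0};
    set_info(&mine);                      // value for the calling thread

    bool worker_ok = false;
    std::thread worker([&] {
        bool started_empty = (get_info() == nullptr);  // new threads start with NULL
        ThreadInfo theirs{2, 0};
        set_info(&theirs);                // same index, different thread
        worker_ok = started_empty && get_info()->id == 2;
        set_info(nullptr);                // don't leave a dangling pointer
    });
    worker.join();

    bool caller_ok = (get_info() == &mine);
    set_info(nullptr);
    return worker_ok && caller_ok;
}
```

The index is shared by every thread in the process, but the pointer stored under it is not, which is why the worker starts out with an empty slot even though the caller has already stored something.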
Classic TLS You’re Already Using
You’ve seen TLS patterns in both the C runtime and the Windows API.
errno And Other CRT “Globals”
In old (and current) C code, errno looks like a global variable. If it really were one, a multi-threaded process would have a data race waiting to happen: one thread calls (say) fopen while another thread does some I/O, and suddenly your “last error” isn’t your last error anymore.
So modern CRT implementations make errno effectively thread-local. In practice, it’s accessed through a function (in Visual Studio you’ll see it as a macro calling an internal function), and that function returns the per-thread value.
The same idea applies to other classic CRT functions that need per-thread state, like strtok (which keeps internal state between calls). That state can’t safely be process-global in a multi-threaded world, so it ends up being thread-local.
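One way to see this for yourself, sketched under the assumption of a CRT with thread-local errno: because errno expands to something that resolves to the current thread’s storage, taking its address on two live threads should give two different addresses. The helper name errno_address_on_new_thread is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <thread>

// Returns the address that `errno` resolves to on a fresh thread.
// The pointer is only used for comparison, never dereferenced,
// because the worker's storage is gone once the thread exits.
int* errno_address_on_new_thread() {
    int* address = nullptr;
    std::thread worker([&] {
        errno = EDOM;       // writes this thread's errno only
        address = &errno;   // same expression, this thread's storage
    });
    worker.join();
    return address;
}
```

Comparing the result with &errno on the calling thread shows they differ, while &errno taken twice on the same thread stays the same.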
GetLastError And The TEB
On the Windows side, GetLastError is another obvious example: it can’t be a single global variable either.
The value is stored per thread, and you can see it in the TEB (Thread Environment Block) in a debugger. Each thread has its own TEB, so each thread has its own “last error” value.
This is TLS in spirit: uniform access (“give me my last error”) with per-thread storage under the hood.
Why This Matters For TrainSec Students
TLS is not just “a programming convenience.” It’s a mechanism you’ll run into when you’re:
- debugging multi-threaded behavior
- interpreting per-thread state like last error values
- building hooks / instrumentation without breaking the process
- reasoning about how libraries keep thread-specific context
The key mental model is: uniform access, thread-specific storage. Once that clicks, a lot of Windows behavior becomes easier to explain—and certain “impossible” problems (like passing state into a hooked function with no extra parameters) become straightforward.
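To make that last point concrete, here is a hypothetical sketch of a per-thread “already inside the hook” flag. It is not real hooking code: hooked_alloc and log_allocation are made-up stand-ins, std::malloc plays the role of the original API, and the logger deliberately allocates through the hook to reproduce the recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical per-thread state: no extra parameter needed to reach it.
thread_local bool t_in_hook = false;          // true while this thread is inside the hook
thread_local int t_logged = 0;                // allocations this thread has logged
thread_local std::size_t t_last_size = 0;     // size of the last logged allocation

void log_allocation(std::size_t size);

// Stand-in for the hooked allocation API.
void* hooked_alloc(std::size_t size) {
    if (t_in_hook) {
        // Nested call made by the logger on this thread: skip logging.
        return std::malloc(size);
    }
    t_in_hook = true;
    log_allocation(size);
    t_in_hook = false;
    return std::malloc(size);
}

// Stand-in for a logger that needs memory itself and therefore
// re-enters the hooked API.
void log_allocation(std::size_t size) {
    void* scratch = hooked_alloc(64);   // would recurse forever without the flag
    t_last_size = size;
    ++t_logged;
    std::free(scratch);
}
```

Because the flag is thread-local, one thread being inside the logger does not suppress logging on any other thread, which a single global flag would do.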
Keep Learning with TrainSec
This content is part of the free TrainSec Knowledge Library, where students can deepen their understanding of Windows internals, malware analysis, and reverse engineering. Subscribe for free and continue learning with us:
https://trainsec.net/library