I would love to kill the key cache entirely, but it's not clear how much of a performance regression that would actually cause, or which real applications the cache was meant to speed up when it was added 12+ years ago. Maintaining the cache has certainly created a lot of code complexity since then, but this is the minimum change I need to fix the app I'm looking at.
Without this change, key state becomes inconsistent between windowproc threads and pooled worker threads, because a key-down event only updates the cache of the single thread that receives it and does not invalidate the caches of any other threads.
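
To make the failure mode concrete, here is a toy model of the inconsistency described above. It is only a sketch and not the actual code touched by this change: the class names, the `epoch` counter, and the `check_epoch` switch are all hypothetical, and each `ThreadKeyCache` object simply stands in for one thread's private cache.

```python
# Toy model of per-thread key-state caches; every name here is hypothetical.

class SharedKeyState:
    """Authoritative key state plus a counter bumped on every change."""

    def __init__(self):
        self.keys = {}   # virtual-key code -> True if the key is down
        self.epoch = 0   # incremented on each key event

    def set_key(self, vkey, down):
        self.keys[vkey] = down
        self.epoch += 1  # tells every cache that its copy may be stale


class ThreadKeyCache:
    """One instance stands in for one thread's private cache."""

    def __init__(self, shared, check_epoch):
        self.shared = shared
        self.check_epoch = check_epoch  # False reproduces the stale-cache bug
        self.keys = dict(shared.keys)
        self.epoch = shared.epoch

    def on_key_event(self, vkey, down):
        # The receiving thread updates the shared state, then resyncs itself.
        self.shared.set_key(vkey, down)
        self.keys = dict(self.shared.keys)
        self.epoch = self.shared.epoch

    def get_key(self, vkey):
        # With invalidation enabled, refresh when the shared epoch has moved.
        if self.check_epoch and self.epoch != self.shared.epoch:
            self.keys = dict(self.shared.keys)
            self.epoch = self.shared.epoch
        return self.keys.get(vkey, False)


def demo(check_epoch):
    """Return (window thread view, worker thread view) of one key press."""
    shared = SharedKeyState()
    window_thread = ThreadKeyCache(shared, check_epoch)
    worker_thread = ThreadKeyCache(shared, check_epoch)
    window_thread.on_key_event(0x41, True)  # key-down arrives on one thread
    return window_thread.get_key(0x41), worker_thread.get_key(0x41)
```

With `check_epoch=False` the worker's cache never learns about the key press, so the two threads disagree; with `check_epoch=True` the stale cache is refreshed on the next read and both threads see the key as down.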