I have no strong objections to this patch.
Regarding dynamic flags in ddraw4: test_map_synchronisation() in the ddraw4 tests shows that DDLOCK_NOOVERWRITE is ignored. Short of fancy heuristics, I don't see how we can make GPU buffers fast with the usual map-draw-map-draw pattern.
I guess we could test whether D3DVBCAPS_WRITEONLY gives us write-combined memory, but if it doesn't, that doesn't prove much.
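
To illustrate the map-draw-map-draw concern, here is a toy model in C. It is only a sketch and not ddraw or wined3d code: the toy_* names, the cost constants, and the single "busy until" timestamp are all made up for illustration. It assumes that a map which cannot rely on a no-overwrite promise has to wait for every draw already queued against the buffer.

```c
/* Toy model only; every name and constant here is hypothetical. */
struct toy_buffer
{
    unsigned int gpu_busy_until; /* Time at which the last queued draw finishes. */
};

#define TOY_DRAW_COST 10 /* Time units the modelled GPU needs per draw. */
#define TOY_FILL_COST 1  /* Time units the modelled CPU needs to fill the buffer. */

/* Map the buffer at time "now" and return the time at which the map completes.
 * If the no-overwrite hint is honoured, the map returns immediately. Otherwise
 * it waits for pending draws and adds the wait to "*stall". */
static unsigned int toy_map(const struct toy_buffer *buffer, unsigned int now,
        int honour_nooverwrite, unsigned int *stall)
{
    if (!honour_nooverwrite && buffer->gpu_busy_until > now)
    {
        *stall += buffer->gpu_busy_until - now;
        now = buffer->gpu_busy_until;
    }
    return now;
}

/* Queue a draw at time "now"; it starts once earlier draws have finished. */
static void toy_draw(struct toy_buffer *buffer, unsigned int now)
{
    unsigned int start = buffer->gpu_busy_until > now ? buffer->gpu_busy_until : now;

    buffer->gpu_busy_until = start + TOY_DRAW_COST;
}

/* Run "iterations" rounds of map, fill, draw and return the total CPU stall. */
unsigned int toy_run(unsigned int iterations, int honour_nooverwrite)
{
    struct toy_buffer buffer = {0};
    unsigned int now = 0, stall = 0, i;

    for (i = 0; i < iterations; ++i)
    {
        now = toy_map(&buffer, now, honour_nooverwrite, &stall);
        now += TOY_FILL_COST;
        toy_draw(&buffer, now);
    }
    return stall;
}
```

In this model toy_run(4, 0) returns 30, i.e. every map after the first waits for a full draw, while toy_run(4, 1) returns 0. Real drivers are of course more complicated, but it shows why ignoring the hint serialises the CPU against the GPU.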