The only other option I'm aware of is to wait for the shader-based fixed-function pipeline replacement, which might include an implementation of the D3DRS_VERTEXBLEND render state.
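For context, here is a rough CPU-side sketch of what the fixed-function vertex blend stage computes for positions: with n explicit weights there are n + 1 world matrices, and the last weight is implied as 1 minus the sum of the others. The type and function names (vec3, mat4, transform_point, blend_position) are made up for illustration and are not taken from any existing code; a real replacement would do the equivalent in a vertex program and would also need to handle normals.

```c
#include <stddef.h>

/* Hypothetical helper types for illustration only. */
typedef struct { float x, y, z; } vec3;
typedef struct { float m[4][4]; } mat4; /* row-vector convention, as in Direct3D */

/* Transform a point (w = 1) by a 4x4 matrix: out = v * M. */
static vec3 transform_point(vec3 v, const mat4 *M)
{
    vec3 r;
    r.x = v.x * M->m[0][0] + v.y * M->m[1][0] + v.z * M->m[2][0] + M->m[3][0];
    r.y = v.x * M->m[0][1] + v.y * M->m[1][1] + v.z * M->m[2][1] + M->m[3][1];
    r.z = v.x * M->m[0][2] + v.y * M->m[1][2] + v.z * M->m[2][2] + M->m[3][2];
    return r;
}

/* Blend a position using n explicit weights and n + 1 matrices.
 * The final weight is implicit: 1 - (sum of the n supplied weights). */
static vec3 blend_position(vec3 v, const float *weights, size_t n, const mat4 *mats)
{
    vec3 out = { 0.0f, 0.0f, 0.0f };
    float last = 1.0f; /* ends up as the implicit final weight */
    size_t i;

    for (i = 0; i <= n; i++) {
        float w = (i < n) ? weights[i] : last;
        vec3 t = transform_point(v, &mats[i]);

        out.x += w * t.x;
        out.y += w * t.y;
        out.z += w * t.z;

        if (i < n)
            last -= weights[i];
    }
    return out;
}
```

With one weight (the D3DVBF_1WEIGHTS case) this reduces to a linear interpolation between the vertex transformed by the first and by the second world matrix.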
I worked on a pipeline replacement using GL_ARB_vertex_program a few months ago but hit some roadblocks, so I never finished it. The main problem was that the performance was awful. Even in apps like 3DMark 2000, which are hit rather badly by drawStridedSlow, it resulted in a net slowdown. I am pretty sure my code can be optimized (the lighting code is *awful*), but it certainly complicates things if I have to be really picky about performance from the start to avoid causing regressions.