and those currently fail, because that case isn't handled at all in the FVF loading code. It treats this as a multi-stream case, when in fact only stream 0 is used. Jason Green has a demo, dx9_hdr_texture_loader, which is broken by exactly this.
How about getting rid of the FVF in the device entirely and converting the FVF to a vertex declaration in SetFVF? This would make the drawprim code simpler (only vertex declarations to consider) and avoid inconsistency. In GetFVF we would convert the declaration back to an FVF (and probably cache the result, but not use that for rendering).
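
To make the idea concrete, here is a rough sketch of what the two conversions might look like. It is not Wine code: every sk_* name, the struct layout, and the element types are hypothetical, and only a small subset of FVF bits (position, normal, diffuse, 2D texture coordinates) is covered. The flag values are modeled on the corresponding D3DFVF_* bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the D3D FVF bits. */
#define SK_FVF_XYZ            0x002u
#define SK_FVF_NORMAL         0x010u
#define SK_FVF_DIFFUSE        0x040u
#define SK_FVF_TEXCOUNT_MASK  0xf00u
#define SK_FVF_TEXCOUNT_SHIFT 8
#define SK_MAX_ELEMENTS       18   /* 3 fixed elements + up to 15 texcoords */

enum sk_usage { SK_USAGE_POSITION, SK_USAGE_NORMAL, SK_USAGE_COLOR, SK_USAGE_TEXCOORD };
enum sk_type  { SK_TYPE_FLOAT2, SK_TYPE_FLOAT3, SK_TYPE_COLOR };

/* Hypothetical vertex element, loosely shaped like a declaration entry. */
struct sk_element {
    unsigned stream;       /* always 0 for anything that came from an FVF */
    unsigned offset;       /* byte offset within the vertex */
    enum sk_type type;
    enum sk_usage usage;
    unsigned usage_index;
};

static unsigned sk_type_size(enum sk_type type)
{
    switch (type) {
        case SK_TYPE_FLOAT2: return 8;
        case SK_TYPE_FLOAT3: return 12;
        case SK_TYPE_COLOR:  return 4;
    }
    return 0;
}

/* Append one element on stream 0 and advance the running byte offset. */
static size_t sk_append(struct sk_element *out, size_t count, unsigned *offset,
                        enum sk_type type, enum sk_usage usage, unsigned usage_index)
{
    assert(count < SK_MAX_ELEMENTS);
    out[count].stream = 0;
    out[count].offset = *offset;
    out[count].type = type;
    out[count].usage = usage;
    out[count].usage_index = usage_index;
    *offset += sk_type_size(type);
    return count + 1;
}

/* What SetFVF could do: expand the FVF into elements, all on stream 0.
 * 'out' must have room for SK_MAX_ELEMENTS entries. Returns the count. */
size_t sk_fvf_to_decl(unsigned fvf, struct sk_element *out)
{
    size_t count = 0;
    unsigned offset = 0, i, tex_count;

    assert(out != NULL);
    if (fvf & SK_FVF_XYZ)
        count = sk_append(out, count, &offset, SK_TYPE_FLOAT3, SK_USAGE_POSITION, 0);
    if (fvf & SK_FVF_NORMAL)
        count = sk_append(out, count, &offset, SK_TYPE_FLOAT3, SK_USAGE_NORMAL, 0);
    if (fvf & SK_FVF_DIFFUSE)
        count = sk_append(out, count, &offset, SK_TYPE_COLOR, SK_USAGE_COLOR, 0);

    tex_count = (fvf & SK_FVF_TEXCOUNT_MASK) >> SK_FVF_TEXCOUNT_SHIFT;
    for (i = 0; i < tex_count; ++i)
        count = sk_append(out, count, &offset, SK_TYPE_FLOAT2, SK_USAGE_TEXCOORD, i);

    return count;
}

/* What GetFVF could do: fold a declaration back into an FVF.
 * Returns 0 if the declaration uses any stream other than 0. */
unsigned sk_decl_to_fvf(const struct sk_element *decl, size_t count)
{
    unsigned fvf = 0, tex_count = 0;
    size_t i;

    for (i = 0; i < count; ++i) {
        if (decl[i].stream != 0)
            return 0;
        switch (decl[i].usage) {
            case SK_USAGE_POSITION: fvf |= SK_FVF_XYZ; break;
            case SK_USAGE_NORMAL:   fvf |= SK_FVF_NORMAL; break;
            case SK_USAGE_COLOR:    fvf |= SK_FVF_DIFFUSE; break;
            case SK_USAGE_TEXCOORD: ++tex_count; break;
        }
    }
    return fvf | (tex_count << SK_FVF_TEXCOUNT_SHIFT);
}
```

With something along these lines, the draw code would only ever look at the element array, and the FVF value would exist purely as something GetFVF hands back to the application. A real version would also need to handle the remaining FVF bits (XYZRHW, blend weights, specular, per-coordinate texture sizes) and check types and ordering before claiming a declaration maps to an FVF.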