Hi Jacek,
On 15/10/2021 22:27, Jacek Caban wrote:
Hi Gabriel,
The main idea behind this patch looks good, but I think some more consideration would be nice to make it fit better the bigger picture.
Right now, we use IDispatch[Ex] all over jscript (that's how it's stored in jsval_t, for example) and we often needs three code paths: for IDispatch, IDispatchEx and jsdisp_t. With your patches, that we'd have four code paths, even if it's often hidden behind helpers. While we ultimately need separated code paths, I feel like we could do it architecturally better and it would be good to consider that. One way to achieve that is to abstract JS objects enough so that we'd operate on a single type everywhere outside the code dealing with the difference (dispex.c, more or less).
Note that pure external IDispatch and IDispatchEx could be represented by jsdisp_t restricted to have only proxy properties. IDispatch -> jsdisp_t mapping would be a bit tricky in this case, but some sort of rb tree should do. If we had that, we could replace all IDispatch usage with jsdisp_t, including jsval_t. jsdisp_t (or a new jsobj_t type) could then internally sort of differences between those, maybe with an internal vtbl.
I have to admit I'm a bit worried about this. Please bear with me, I'll try to be brief, but there's just a lot to talk about.
Initially, I tried something like that, trying to fit everything into jsdisp. Unfortunately, I gave up on that path because I realized it would not make the code much cleaner, if anything it would make it worse in some places—simply put, the existing jscript tests blew up, because the current implementation, despite all the code paths, is completely correct from behavioral perspective.
That's because external Dispatch objects do *not* behave like JS objects at all. They're "special". In contrast, the proxies that this patch implements for mshtml objects *do* behave almost like JS objects, which is why it works and makes sense to have them in jsdisp.
mshtml proxies have prototypes, have props, and stuff like extensible/frozen, they *do not* work with Enumerator, just like JS objects don't, and so on. Basically they make use of all the fields in jsdisp_t, more or less, and so can use the jsdisp code path just fine (with few additional paths for proxy-specific things or props).
In contrast, external Dispatch objects *do not* have these. They do not have prototypes, they don't have props stored on jscript side (in fact, that leads to a ton of test failures, just that alone), have no concept of jscript extensibility, they also *work* with Enumerator if they expose _NewEnum, unlike both mshtml proxies and standard JS objects, and have no builtin props or any other data on jscript side.
Excluding dispex.c, the different code paths now happen all throughout the code, but many of those in engine.c, object.c etc will still be needed simply to act differently because dispatch objects are *very* different from JS objects.
To me, it doesn't make much sense to store them in jsdisp, when they'll (have to) literally ignore every single field in the jsdisp and behavior, except for the class in builtin_info perhaps. Any code that touches jsdisp fields will be *wrong* for dispatch objects and will require a different code path anyway. Although instead of checking for "to_jsdisp" being NULL or not, we'd probably check for a JSCLASS_DISPATCH class, but I don't see that as an improvement since it's still a check into a different (and necessary) code path.
In fact, I'd argue it adds more potential complications to the code. First, as you mentioned, we'll need some sort of rbtree to map dispatches. This adds complexity to the code, but also runtime overhead, not just for lookups, but also for the rbtree entry, which uses 4 pointer-sized fields already for every object (5 if you count the pointer to the Dispatch).
Then we'll have to add *more* special case handling in things like variant_to_jsval, to treat dispatch objects and map them, but also in jsval_to_variant, because external code that *expects* the original dispatch pointer (and this is tested) and not something wrapped in a jsdisp dispatch, otherwise we'd break the existing tests!
This patch already does this for proxies, but (1) it doesn't require rbtrees and (2) the only reason it does it in jsval_to_variant is because mshtml builtin functions expect passed parameters to have specific IIDs in some cases, so they need to have the original dispatch passed to them. Having them wrapped in a JS object dispatch doesn't work in this case, but that's it. If there's a way to avoid that, I can drop the need to convert them back to their actual inner dispatch.
Overall I think it's not worth simply because dispatch objects are just very different from JS objects, behavior-wise, and would require most of the different code paths anyway. In fact, I believe that the way it's currently done is pretty straightforward and correct, because the dispatch itself is the "lowest common divisor" and makes sense to store that in a jsval for example.
I mean, I could conceivably add another container for all objects, that's a common divisor and avoid the pitfalls described above, but I don't see much point when IDispatch already does this. Sure you need checks for jsdisp in some cases but that's no different than using something like QueryInterface on a private interface for jsdisps.
Or I could add some "ops" or "vtable" to the container and have a ton of methods in there to handle all the code paths that require different logic between them, but is that really much of an improvement though? Issue is that it will have methods dealing with vastly different parts of the code that require different code paths (some with stuff in object.c, some with stuff in function.c, engine.c, etc).
Lastly, I want to give another example on a patch I haven't sent yet, and was planning to send after these are in, but will also be affected by this. In Function_apply we test for array or arguments class, but this is only correct in normal jscript, other modes have more quirks (that some js libraries examine or abuse).
In mshtml before ES5, external dispatch objects are *allowed* as argument to apply() as long as they expose "length" as a prop. However, JS objects that have a "length" prop *are not* allowed. ES5 makes this different, allowing JS objects as well, but not throwing errors when "length" is not found, but giving an undefined array.
So in this case, we'd still have to check there for jsdisp vs Dispatch object, because we have to reject JS objects unless they're array or arguments class, even if they expose "length", which is not the case with external Dispatch objects (in html mode only).
This is yet another example of a different code path needed in a place different from dispex.c, simply due to how native works. How would this even look like if Dispatch objects were merged into jsdisp? Would it not still require a check of some sort and code path?
Overall I'm just worried that it would be very hard not to break the existing (and future) tests without complicating the code even more than it currently is, with checks, considering that dispatch objects do not behave like jsdisps at all, and in fact any code path dealing with jsdisps and their fields will simply be wrong for Dispatch objects.
Another thing that we will need to support is proper prototype chain support for MSHTML objects. Eg. those functions should not really be properties of objects, but rather their prototypes. I think that it will be possible to express by extending something like this extension, but that's something to keep in mind. I'm not expecting you to implement all of that, I'm not even sure about details of it without trying it, but let's make sure we won't make it harder in the future.
Good point, but it shouldn't be hard to deal with; they are made to act like JS objects more or less and have prototypes (though I set them to object prototype, which is not correct I guess, but good enough for a first implementation).
So, to fix this we'd just need to set them to proper prototypes in convert_to_proxy without really touching much else, so it can be easily done, in theory. It would probably need an internal interface GetPrototype method or something, but that's not a problem.
While I'd hope for a bit cleaner solution, there is one more problem with moving all properties to jscript: cycle collector will no longer be able to deal with them. See dispex_traverse. Without cycle collector knowing about object properties, we will leak on cycles (we already don't traverse through builtin jscript objects, so there are leaks already, but this will make it worse).
I don't really understand specifics here, but it goes through dynamic props, so it should work just fine I believe?
The current proxy implementation delegates all the handling to mshtml (because it has specific quirks). The props on jscript side are just "indices" into the mshtml props (dynamic or not). So traversing the dynamic props will yield the same results as before; they're still there.
What does note_cc_edge do? Does it somehow remove the prop? If so, I'll probably have to add a way for mshtml to notify jscript to also remove the prop index, but it's not necessary for correctness because an invalid index (that points to a non-existed DISPID or a deleted dynamic prop) is the same as a deleted one right now, it just takes a slot and some memory. Even if not, it shouldn't be hard using an internal interface here to notify it.
But that depends on what note_cc_edge does.
+typedef struct { + FunctionInstance function; + IDispatchEx *proxy; + DISPID id; +} ProxyFunction;
Binding this to one exact object doesn't seem right. For example document.body.click.call(div_elem) should probably work fine and call on div_elem. Maybe it's better to store such functions as a pair of (IID,DISPID)?
Interesting, I'll have to test for this. I hadn't considered it, although I do test for incompatible objects now (and it throws the proper error, i.e. E_UNEXPECTED). But yeah, they're incompatible, so need to add tests for compatible objects of same type and see what happens.
Thanks, Gabriel