On 10/24/19 3:32 PM, Gabriel Ivăncescu wrote:
Hi Jacek,
On 10/23/19 9:16 PM, Jacek Caban wrote:
Hi Gabriel,
On 10/21/19 1:40 PM, Gabriel Ivăncescu wrote:
The heap stores dynamic global variables added during the execution phase. We'll need to reference them in the TypeInfo, and so we have to reference count the heap itself, and the simplest way is to use the script dispatch object to store it.
It seems to me that you concentrated a bit too much at avoiding copying the data rather than on root of the problem. It's possible to avoid copying in different ways. The thing is that having to represent the collection of global members in different ways for similar tasks probably means that representation of the collection is not right. Actually, we know that current global scope representation is not great. A simple list was nice to have things done and working, but it's surely not optimal. I don't expect you to rewrite it all, but it will be eventually rewritten and it would be good to know that we're changing things in the right direction instead of working on something that will need to be rewritten for future improvements.
Now that I've looked deeper into the vbscript implementation, I actually think the current one is wrong and don't quite understand how it even works, honestly.
It looks to me as the compilation phase that creates the vbscode_t that is then appened to the ctx->code_list adds functions and variables, and they are immediately linked to the context's global_funcs/vars respectively. This happens even *if* the code in question has "pending_exec" so it's not even executed immediately. Shouldn't those funcs/vars only be added to the global context list when they are being executed?
Maybe, I'm not sure. It's not obviously wrong. Testing that would be nice.
Furthermore, the funcs/vars during compilation are added to the vbscode_t's heap and stored there, while vars added during execution are added to the context's heap.
release_script will go through all global_vars to release their variant data, but in destroy_script, the code_list is released (and all vbscode_t heaps) *before* the script is released. Doesn't this mean some variables in the linked list will have already been freed (if they were part of some vbscode_t) and thus we're accessing freed memory when releasing the vars?
Overall I'm just slightly confused as to how it even works right now, so I've no idea how to improve it.
Well, it definitely could use a clean up. Note that before destroy_script() we always set the state to closed in the destructor, so in practice the release_script() is always called before destroy_script() and there is no problem. We should eventually get rid of destroy_script(). If we want script_ctx_t ref-counted, it could be a part of release_script() and we could have separated detach_script() that would be used for closing script engine.
Here's my thought process on how to proceed, but I could be completely wrong.
First, store the global_vars/funcs in the script dispatch object, instead of the context (just like jscript does). Furthermore, when a code in the code_list is being executed, move it to the script dispatch object (for ref counting) and remove it from the context's list. (we'll essentially just have another list in the script dispatch object for code that has been executed).
This will allow to release the script by decreasing its state, releasing the code that was executed already, while *keeping* the code that is still pending, which I believe was the intent of the whole thing. Again, I could be misunderstanding this.
I'm not sure. We don't really need the whole vbscode_t in script dispatch. For the variables, we just need just a name. For function, it's a name, public flag and argument count. The rest could be left in vbscode_t and be freed with it on script engine close. We could allocate that subset of data, for example, in a dedicated heap pool like we currently do for dynamic variables. We could do that in the compiler already.
Secondly, link the compiled funcs/vars to the script dispatch object during the execution phase of that vbscode_t, instead of during compilation as we have now.
If tests confirm that, sure. If you don't want to deal with it, just leave it as is.
Does this seem a good way forward?
With this, the TypeInfo will just grab a reference to the script dispatch (which it has to anyway due making use of the ident_map)
I'd rather get rid of ident_map, see bellow.
and possibly create arrays for vars/funcs like a snapshot, so they have the indices for the TypeInfo at the moment it is retrieved.
Once we have arrays that are don't change existing data, but only grow, then and all we need to capture the state is current element count.
I don't know exactly how it should look like. It's even possible that we will have to do some copies to capture the state after all. If I was looking at it, I'd start with more experimentations. Your tests seem like a good start. It would be interesting to see how things change when you add more global members later. You tested that adding new global variables doesn't change the existing ITypeInfo. How does the new one look like? Does it append new variables after previous ones? What if we added a new "dynamic" (non-explicit) variable? What if we mix it all? How about functions? What do DISPIDs look like? Is it reasonable to try to replicate native DISPIDs? Once we know more answers, we may decide how the collection should look like.
After more fiddling here's some bits of info:
The TypeInfo is frozen at the moment it is retrieved. Whatever happens to the script dispatch after that should not affect it. To simplify the explanation I will talk about "variables" but the same applies to functions as well.
If some variables are added and a new TypeInfo is retrieved, the new TypeInfo will contain those new variables, and their indices will be after the previous ones (i.e. they're appended).
I don't think it's worth replicating DISPIDs, because those are opaque. AFAIK no application should hard-code DISPIDs since they may also change between Windows versions (from experiments it does seem Windows tends to simply order the DISPIDs as they are found in the script, though, but separates vars/funcs).
However, the *indices* of the variables should probably be the same as on Windows. This is because you can obtain variables by index using GetVarDesc (again, same for functions). But, when it adds code, explicit variables are added first, then non-explicit variables. If more code is added, it will be appended to this, so they *can* be mixed. (e.g. first code adds 2 explicit vars, then 1 implicit var, second code adds 1 explicit var and 2 implicit, it will be: e->e->i->e->i->i).
Cool, thanks for testing. It sounds like replacing the existing lists with arrays like that is what we need.
To get rid of ident_map, I'd suggest to change how DISPIDs work. DISPIDs could be just indexes in variable or function arrays, with a flag on one of high bits indicating which one is it.
Thanks,
Jacek