Calling function in PHP is not cheap. One of the reasons for that executor has a lot of things to take care of when calling function – a bunch of globals, execution state, symbol tables, etc., etc. And we do a lot of allocations and reallocations for them. Also since a number of these things live on the stack – on deep recursion the stack is depleted. So I was thinking how could we improve it?
- First step could be to unite all execution-state related variables into single structure. In compile-time we know how many Ts, CVs, etc. we might need, so this is fixed. Size of other structures is known too, so we know overall memory size for every function, and we can automatically allocate execution data on the function start. Which means no reallocs, only one allocation per execution cycle and probably even better memory usage due to the reuse of the memory blocks for frequently called functions.
- Right now some of the execution data is kept in a kind of stack. But we don’t really need it to be stack – as I see, pointer to previous structure is enough. Actually, when we doing backtraces stack even is a kind of problem since we need to figure out each time where we stand and where functions begin and end.
- We do need some kind of stack for function arguments and function-call-in-progress information. However, this stack does not need to be global – we do not use this information beyond current function call (counting functions called while calculating parameters for current function call). Thus, we could just make each function keep its own stack, and since we know the maximum function call depth for the code of any given user function at compile time, this stack can have fixed size too and fit into the structure in (1).
- Once we have all call information inside the single structure, we could rewrite execute() to use loop instead of recursive call, thus dramatically reducing stack requirements and probably speeding up the execution loop. Internal function calls would still use stack, of course – because that’s how C works 🙂
Actually, we might be able to do it before but then we’d have to take care of a lot of different context things which would be very hard to do right. Having it in single structure means we can just switch one pointer and go to different context.
- All various EG’s that deal with execution state would be made to work through one global “current execution state” global pointing to the above mega-structure.
- We still need new symbol table for each call, so symbol table allocation and the related cache stays. However, we might have a good idea how many variables would each function require (size of CVs might be a good estimate) and could initialize the hashtable for this size. Downside would be that this hash won’t be then usable for other functions. So maybe we’d want to group cached tables by size (hash table implementation has only limited number of real sizes anyway). This should reduce number of reallocs when adding variables to the tables.
- Many functions are called repeatedly, but not recursively. Maybe we could reuse once-allocated memory block for each call of the function. The problem of course is to know if the function will be called again and not waste the block if it won’t – so it might be hard to do.
Any other ideas?