So, I've gotten the game fully back to where it was before. Works great. But I have a pretty big project to tackle now.
You see, a LOT of the game functionality is written as Lua script objects attached as script object components to Urho3D Nodes. Which is great and all, but the thing is that the Lua bindings are generated with tolua++. It is a known issue (known to the Urho3D team and to the users on the Urho3D forums) that bindings generated with tolua++ tend to be slow and sub-optimal. It's a thing I've run into before in my own stuff as well.
The issue is in the way tolua handles taking ownership of objects. You can specify that certain objects be "owned" by tolua; that way, when they go out of scope they will be garbage collected. The way tolua.takeownership works, though, is that each call to takeownership performs a full garbage collection cycle. And any time an API method or function bound to Lua returns an object by value, tolua automatically takes ownership of the returned object. Meaning that for every single by-value return, a full garbage collection cycle is performed.
If you think about it, that can turn into an ungodly number of garbage collection cycles per frame. That means that it will spend a lot of time in the gc that it doesn't need to. You can mitigate some of it by explicitly calling new/delete to create your own local objects and temporaries, but there is not much you can do about the full gc pass done for objects returned by value. You can see in this shot...
... how the update portion spends almost half of its time inside LuaCollectGarbage. And that is even after an optimization pass where I reduced the number of locals being taken over by tolua as much as I could. By necessity, there are still just a lot of places where objects are being returned by value. Call GetPosition() on a node? That's a gc cycle. Iterate the returned results of a raycast? Some more gc cycles. It adds up.
So my project for the next few days/weeks (depending on work and stuff) is to begin moving components out of script objects and into dedicated C++ components to try to reduce the gc overhead. It would probably be helpful to switch to AngelScript, since the AngelScript API for Urho3D works better than the Lua API, but that's something I really don't want to do unless I have to. I've already significantly reduced the wasted time by implementing a few of the more widespread script objects with C++ components, so I'll see where I sit in a few days.
I'm talking mostly about nuts-and-bolts stuff here. I will still leave as much of the gameplay and AI type stuff in Lua as I can, but with some potential redesigns to help reduce gc waste even further.
The game is looking amazing! Hopefully, your efforts will pay off. You have made great leaps from the first time I saw your game. Continue the awesome work.