
Optimisations

Posted: Sun Dec 01, 2013 6:00 pm
by FluffyFreak
I've been doing some profiling and digging due to getting fed up with low performance on Intel GPUs and massive stalls even on very high-end machines.

There's a PR in which aims to address some of the issues that are known to affect performance under OpenGL, but I've run into a bit of a wall dealing with the actual stalls.

By "Stalls" I mean the moments when the framerate ceases to be smooth and drops from a constant 60fps for 1 to several seconds before recovering.

I think that I've tracked it down to a single function, but the function in question just leaves me with more questions than answers:
void __cdecl LuaEvent::Emit(void) 1 9782.8490(MCycles) (100%)
What that means is that it gets called just once (1) in the problem frame, but consumed 9782.8490 Million CPU cycles or 100% of the measured frame time.
The rest of the game's processing for that frame didn't use up enough time to register as even 1%.

Now, LuaEvent::Emit is quite a simple function in and of itself but what I need to know is what events it's emitting so that I can track down the real culprit which I suspect is something nasty in the trade script or something similar.
Is there any way to know what events it's going to fire? I could just go through all of the registered Lua functions and add profiling code to them then try and work it out manually but that might take a while.
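
A minimal sketch of the kind of per-event timing I mean, assuming the event name is available as a string at the point where each handler is called. EventTimings and ScopedEventTimer are hypothetical helpers made up for illustration, not anything that exists in the codebase:

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: accumulates wall-clock time per event name so that
// the expensive event stands out after a slow frame.
class EventTimings {
public:
    void Add(const std::string &name, double ms) {
        Entry &e = m_entries[name];
        e.calls += 1;
        e.totalMs += ms;
    }
    // Number of recorded calls for a name, 0 if it was never seen.
    int Calls(const std::string &name) const {
        auto it = m_entries.find(name);
        return it == m_entries.end() ? 0 : it->second.calls;
    }
    // Name with the largest accumulated time, or "" if nothing was recorded.
    std::string Slowest() const {
        std::string worst;
        double worstMs = -1.0;
        for (const auto &kv : m_entries) {
            if (kv.second.totalMs > worstMs) {
                worstMs = kv.second.totalMs;
                worst = kv.first;
            }
        }
        return worst;
    }
    void Dump() const {
        for (const auto &kv : m_entries)
            std::printf("%s: %d calls, %.3f ms\n", kv.first.c_str(),
                        kv.second.calls, kv.second.totalMs);
    }
private:
    struct Entry { int calls = 0; double totalMs = 0.0; };
    std::map<std::string, Entry> m_entries;
};

// RAII timer: records the elapsed time into EventTimings when it goes
// out of scope, so one line at the top of a block is enough.
class ScopedEventTimer {
    using Clock = std::chrono::steady_clock;
public:
    ScopedEventTimer(EventTimings &timings, const std::string &name)
        : m_timings(timings), m_name(name), m_start(Clock::now()) {}
    ~ScopedEventTimer() {
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - m_start;
        m_timings.Add(m_name, elapsed.count());
    }
private:
    EventTimings &m_timings;
    std::string m_name;
    Clock::time_point m_start;
};
```

The idea would be to put `ScopedEventTimer timer(timings, eventName);` around each handler call and call `Dump()` whenever a frame runs long. Whether the name is that easy to get hold of inside the emit code is exactly the part I'm unsure about.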

Andy

Re: Optimisations

Posted: Sun Dec 01, 2013 6:14 pm
by jpab
Could you confirm that with some other method? Probably the easiest thing (if the stall is long enough) is to run the game, and manually break into the game with the debugger when you're in a stall. If LuaEvent::Emit is taking such a high proportion of the stall time then the game will probably be inside it when the debugger pauses execution, and you'll be able to see the call stack and see exactly what's going on. If you have the patience for it, then you can do that several times. Of course this is basically what a sampling profiler does except that you can examine the full call stack and all state for each sample you take, at the cost of having a sample rate that's several orders of magnitude less than what a sampling profiler would give you.

Re: Optimisations

Posted: Sun Dec 01, 2013 6:26 pm
by FluffyFreak
I've tried doing the manual sampling, but I gave up and switched to the profiling approach because I'm just not fast enough :/
Not all of the stalls last as long as a second and my reactions often aren't quick enough because I have no idea when one might come along - which means long periods of waiting then one happens and I miss it.
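
One way around the reaction-time problem might be a watchdog thread that notices when the main loop stops ticking and reacts for me. This is only a sketch: StallWatchdog is a hypothetical class, and the callback is where a debugger break or a log call would go, which I've left out because it is platform specific.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical watchdog: the main loop calls Beat() once per frame, and a
// background thread fires onStall once if no beat arrives within the limit.
class StallWatchdog {
    using Clock = std::chrono::steady_clock;
public:
    StallWatchdog(std::chrono::milliseconds limit, std::function<void()> onStall)
        : m_limit(limit), m_onStall(std::move(onStall)),
          m_lastBeat(NowTicks()), m_running(true),
          m_thread([this] { Run(); }) {}

    ~StallWatchdog() {
        m_running = false;
        m_thread.join();
    }

    // Call from the main loop, once per frame.
    void Beat() { m_lastBeat = NowTicks(); }

private:
    static Clock::rep NowTicks() {
        return Clock::now().time_since_epoch().count();
    }

    void Run() {
        bool reported = false; // fire only once per stall
        while (m_running) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            const Clock::duration sinceBeat(NowTicks() - m_lastBeat.load());
            if (sinceBeat > m_limit) {
                if (!reported) {
                    m_onStall();
                    reported = true;
                }
            } else {
                reported = false;
            }
        }
    }

    const Clock::duration m_limit;
    std::function<void()> m_onStall;
    std::atomic<Clock::rep> m_lastBeat;
    std::atomic<bool> m_running;
    std::thread m_thread; // declared last so it starts after the other members
};
```

If the callback triggers a breakpoint, the debugger should stop every thread, so the main thread's call stack would still show whatever it was doing mid-stall. I haven't tried this in the game itself.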

I stuck a load more profiling code in and it looks like it's a call to "GetNearbySystems". I've got a profile here showing 2 calls to it using 99% of the CPU time for a frame.
Seems to do 44437 calls to "Faction::IsCloserAndContains" in just those two. Maybe it's just simply a bad algorithm or too big a scope or something.
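
If it turns out the same query gets repeated with the same arguments, one option might be to cache the results. A rough sketch under that assumption follows. NearbyCache, SystemId and the (origin, range) key are all made up for illustration and don't match the real GetNearbySystems signature:

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a system is just an integer id here, and a query
// is identified by (origin system id, search range in whole light years).
using SystemId = int;
using QueryKey = std::pair<SystemId, int>;

// Memoises an expensive "nearby systems" query so that repeated calls with
// the same arguments reuse the first result.
class NearbyCache {
public:
    using Query = std::function<std::vector<SystemId>(SystemId, int)>;

    explicit NearbyCache(Query slowQuery) : m_query(std::move(slowQuery)) {}

    const std::vector<SystemId> &Get(SystemId origin, int range) {
        const QueryKey key(origin, range);
        auto it = m_results.find(key);
        if (it == m_results.end()) {
            // First request for this key: run the slow query once.
            it = m_results.emplace(key, m_query(origin, range)).first;
        }
        return it->second;
    }

    // Drop everything, e.g. if the underlying data changes.
    void Clear() { m_results.clear(); }

private:
    Query m_query;
    std::map<QueryKey, std::vector<SystemId>> m_results;
};
```

This only helps if the arguments really do repeat. If every call is unique then the algorithm or the search scope itself is what needs looking at.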