On 5/5/2019 at 11:10 AM, ahmadierfan99 said:
Which Hardware(s) did you run your app on and how important was that?
I ask this because, from your previous replies, I understood that an algorithm's performance and its superiority over other algorithms heavily depend on the hardware.
The final hardware usually matters far less than people expect. Most of the things people fear turn out to be unimportant for performance. People spend a lot of effort worrying that an algorithm won't work well on some specific system, but in practice it usually isn't an issue at all.
But sometimes the hardware is critically important. Sometimes there are things on the critical performance path and one system happens to have radically different performance than the others. You can ONLY determine that with careful measurements. It usually cannot be figured out in advance.
Some properties of algorithms do matter in general. First, not doing work is faster than doing work, and doing less work is generally faster than doing more work. Spending a small amount of work up front to prune out an enormous amount of work later is generally a good idea. Said differently: don't be stupid. If you have a choice between a library quicksort and rolling your own bubble sort, use the quicksort. Don't pick known pessimizations or repeat known mistakes.
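As a rough sketch of both ideas, assuming a made-up Entity type and placeholder numbers: a cheap squared-distance check prunes most of the work before anything expensive happens, and the sorting is left to the standard library instead of a hand-rolled bubble sort.

```cpp
// Sketch only: the Entity type, field names, and radius are made up for the example.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Entity {
    float x = 0.0f, y = 0.0f;
    float priority = 0.0f;
};

// Cheap squared-distance test prunes most entities before any expensive work;
// no sqrt is needed for a pure comparison.
std::vector<Entity*> gatherNearby(std::vector<Entity>& all, float cx, float cy, float radius)
{
    std::vector<Entity*> result;
    const float r2 = radius * radius;
    for (Entity& e : all) {
        const float dx = e.x - cx;
        const float dy = e.y - cy;
        if (dx * dx + dy * dy <= r2)
            result.push_back(&e);
    }

    // Let the library do the sorting (introsort, O(n log n)) instead of a
    // hand-rolled bubble sort.
    std::sort(result.begin(), result.end(),
              [](const Entity* a, const Entity* b) { return a->priority > b->priority; });
    return result;
}

int main()
{
    std::vector<Entity> world(1000);   // placeholder data
    auto nearby = gatherNearby(world, 0.0f, 0.0f, 10.0f);
    std::printf("%zu entities survived the cheap prune\n", nearby.size());
    return 0;
}
```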
On 5/5/2019 at 11:10 AM, ahmadierfan99 said:
How did you budget your "subsystem's updates"?
You already know how long you've got in total. If your target is 60 frames per second from the old television standards (which are long outdated), you get roughly 16.7 milliseconds per frame, which most people budget down to about 15 milliseconds to leave some headroom. If your target is 72 or 75 or 120 or 144 or even the 288 Hz that some high-end gamer displays are pushing, you can calculate how many microseconds you've got for your target.
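For concreteness, the arithmetic is just 1000 milliseconds divided by the refresh rate; here is a tiny sketch that prints the budget for the rates mentioned above:

```cpp
// Sketch: per-frame time budget for a few refresh rates.
#include <cstdio>

int main()
{
    const double rates[] = { 60.0, 72.0, 75.0, 120.0, 144.0, 288.0 };
    for (double hz : rates) {
        const double budgetMs = 1000.0 / hz;        // total frame time in milliseconds
        const double budgetUs = 1'000'000.0 / hz;   // same thing in microseconds
        std::printf("%6.1f Hz -> %7.3f ms (%9.1f us) per frame\n", hz, budgetMs, budgetUs);
    }
    return 0;
}
```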
From there, start with a measurement of where it is now. If you run all your updates in a single thread, you know that simulation + rendering + networking + audio + everything else must fit inside the 15-millisecond target. If you measure today and see that simulation is taking 4 milliseconds, rendering is peaking at 6 milliseconds, audio is peaking at 1 millisecond, and networking is peaking at 3 milliseconds, that's 4 + 6 + 1 + 3 = 14 milliseconds, so you know you're at risk of blowing a 15-millisecond target if everything peaks on the same frame. And if the combined times come in at 6 or 7 or 10 milliseconds instead, make those measured numbers your limits.
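A minimal measurement sketch along those lines, assuming hypothetical subsystem update functions (the function names and the 15 ms figure are placeholders, not anyone's real engine code):

```cpp
// Sketch only: times each (hypothetical) subsystem update and flags frames
// that exceed the budget. Real engines use their own profilers, but the idea
// is the same: measure first, then set limits from the measurements.
#include <chrono>
#include <cstdio>

using Clock = std::chrono::steady_clock;

// Stand-ins for the real subsystem updates.
void updateSimulation() {}
void updateRendering()  {}
void updateAudio()      {}
void updateNetworking() {}

double timeMs(void (*update)())
{
    const auto start = Clock::now();
    update();
    const auto end = Clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

int main()
{
    const double frameBudgetMs = 15.0;   // chosen target, not a law of nature

    const double sim   = timeMs(updateSimulation);
    const double rend  = timeMs(updateRendering);
    const double audio = timeMs(updateAudio);
    const double net   = timeMs(updateNetworking);
    const double total = sim + rend + audio + net;

    std::printf("sim %.2f  render %.2f  audio %.2f  net %.2f  total %.2f ms\n",
                sim, rend, audio, net, total);
    if (total > frameBudgetMs)
        std::printf("over budget by %.2f ms\n", total - frameBudgetMs);
    return 0;
}
```

In practice you would accumulate these every frame and track the peaks, since it's the worst frame, not the average, that causes a visible hitch.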
Over time as you gain experience, you learn that one system or another will become slow. Pathfinding has the potential to be slow. Many simulation steps like physics updates have the potential to be slow. Measure and monitor, then make decisions based on the details of that specific game.
On 5/5/2019 at 11:10 AM, ahmadierfan99 said:
Was there a time when you had to refer to your code's compiled assembly, or to CPU instruction cycles on your specific hardware, to see the actual or estimated time needed to do a specific job?
Fixed CPU instruction timings haven't been a meaningful thing on modern hardware for about two decades. There are too many interdependencies with the surrounding code. Multiprocessing context switches, hardware parallelism, hyperthreading, out-of-order cores, and long CPU pipelines all combine to make instruction counting nearly useless.
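One way to see this for yourself: the memory system alone can dominate. The two loops below execute essentially the same instructions, but the shuffled traversal defeats the caches and prefetcher, so on typical modern hardware it runs several times slower. This is a rough demonstration, not a rigorous benchmark.

```cpp
// Rough demonstration, not a rigorous benchmark: both loops sum the same
// ~16M integers with essentially the same instruction count, but the shuffled
// order defeats the caches and prefetcher and usually runs several times slower.
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

int main()
{
    const std::size_t n = std::size_t{1} << 24;   // ~16M elements, ~64 MB of data
    std::vector<int> data(n, 1);

    std::vector<std::uint32_t> sequential(n);
    std::iota(sequential.begin(), sequential.end(), std::uint32_t{0});
    std::vector<std::uint32_t> shuffled = sequential;
    std::shuffle(shuffled.begin(), shuffled.end(), std::mt19937{42});

    auto run = [&](const std::vector<std::uint32_t>& order) {
        const auto start = std::chrono::steady_clock::now();
        long long sum = 0;
        for (std::uint32_t i : order)
            sum += data[i];
        const auto end = std::chrono::steady_clock::now();
        return std::make_pair(sum, std::chrono::duration<double, std::milli>(end - start).count());
    };

    const auto [sumA, sequentialMs] = run(sequential);
    const auto [sumB, shuffledMs]   = run(shuffled);
    std::printf("sequential: %.1f ms   shuffled: %.1f ms   (checksums %lld / %lld)\n",
                sequentialMs, shuffledMs, sumA, sumB);
    return 0;
}
```

The exact ratio depends on the machine, which is exactly the point: you find out by measuring, not by counting instructions.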
Yes, there are times when people need to look at the raw assembly code. It is fairly rare, getting rarer every day, but it still happens. It usually isn't necessary until you hit very large professional games trying to fine-tune a specific segment of code. It is best to swap in better algorithms first, but sometimes you hit the point where there are no better algorithms and you need to hand-tune the details even further.
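If you do reach that point, the usual starting places are the compiler's own assembly output, a disassembler, or Compiler Explorer. A tiny example, with the commands in comments (the file name is made up for the example):

```cpp
// Sketch: a small function worth inspecting, plus (in comments) the usual ways
// to look at what the compiler actually generated for it.
//
//   g++ -O2 -S hot_path.cpp -o hot_path.s   # emit assembly from GCC/Clang
//   objdump -d -C hot_path.o                # disassemble a compiled object file
//
// Compiler Explorer (godbolt.org) does the same thing interactively.
float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```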
Normally there are better options: different algorithm choices and different broad design decisions that are more effective at finding performance gains.