I'm writing a software-rendered raycaster. I did this once before, a very long time ago, on a 386. Back then, to get a good frame rate, I had to use several programming tricks, and I'm wondering whether they are still relevant. Even if they are, do modern compilers simply apply them for me? And what about inline assembly: does that get reordered too?
- Alternating 32-bit and 16-bit instructions. The 386 introduced 32-bit registers but maintained backwards compatibility by having their lower halves act as the 16-bit registers. For multi-clock-cycle instructions, the CPU could overlap them, so you'd write code along the lines of ADD EAX, EBX; SUB CX, DX…
- Alternating int and float math. Most machines at the time did floating-point math in a coprocessor, which ran in parallel with the main CPU. By alternating 2 int, 1 float, 2 int, 1 float, you'd essentially get parallel processing.
- Jumping on the less likely branch. The compiler I used would turn if(…)then(x)else(y) into cmp,jne,x… So you'd always put the code most likely to run in the then part, where it falls through without a jump, and leave the less likely code for the else.
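
To make the second and third tricks concrete, here is roughly what I mean, written in C. This is only a sketch: the function names (clamp_texel, step_ray) and the LIKELY macro are made up for illustration and are not from my old code. __builtin_expect is a GCC/Clang extension, and I'm assuming it is the closest modern way to say "this branch is the common one"; whether the compiler actually changes the code layout because of it is part of what I'm asking.

```c
#include <stddef.h>

/* Branch hint: GCC/Clang extension, plain no-op on other compilers. */
#if defined(__GNUC__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

/* Trick 3 (hypothetical helper): the in-range case is assumed to be
 * the common one, so it goes in the "then" part and is hinted as likely. */
static int clamp_texel(int y, int height)
{
    if (LIKELY(y >= 0 && y < height)) {
        return y;                        /* common case */
    } else {
        return y < 0 ? 0 : height - 1;   /* rare case */
    }
}

/* Trick 2 (hypothetical helper): integer and float work that do not
 * depend on each other, written back to back the way I used to
 * interleave them by hand. */
static float step_ray(int *map_x, int step_x, float dist, float delta)
{
    *map_x += step_x;   /* integer work */
    dist += delta;      /* float work, independent of the line above */
    return dist;
}
```

For example, clamp_texel(70, 64) takes the rare path and returns 63, while clamp_texel(10, 64) takes the hinted path and returns 10.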
With the improvements of branch prediction and pipelining in the last several generations of chips, are any of these still relevant?