Hi, I've implemented Morton encoding via the 'magic bits' method as described in https://www.forceflow.be/2013/10/07/morton-encodingdecoding-through-bit-interleaving-implementations/ and in the Real-Time Collision Detection book, but I've yet to see the performance gains it promises over plain array[x + y * width] indexing, even when increasing the grid to 16k * 16k. Do I have to implement the LUT method instead? The compiler is MSVC, and the particular problem I've integrated it into is marching squares opcode determination. The Morton encoding functions are inlined; I've checked.
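For reference, my encode is essentially the standard magic-bits bit spread from that article (a rough sketch below, not my exact code), and lookups become array[morton2d(x, y)] instead of array[x + y * width]:

#include <cstdint>

// Spread the low 16 bits of v apart so there's a zero bit between each
// original bit ("magic bits" method from the linked article).
static inline uint32_t part1by1(uint32_t v)
{
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// Interleave x and y bits to get the 2D Morton (Z-order) index.
static inline uint32_t morton2d(uint32_t x, uint32_t y)
{
    return part1by1(x) | (part1by1(y) << 1);
}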
It seems to me Morton ordering might not be the best fit for known/fixed-size spaces. Has anyone reached similar conclusions?