Advertisement

New: tiny_bvh.h single header library

Started by November 11, 2024 08:08 AM
5 comments, last by JoeJ 1 week, 3 days ago

Last week I released on github a new single-header file library for constructing and traversing BVHs on the CPU, with (short-term) plans to take traversal (not construction) to the GPU as well. Link:

https://github.com/jbikker/tinybvh

Features:

  • No dependencies at all. Just add the header to your project and it works.
  • Simple interface. Build using a flat array of triangle vertices.
  • Builds state-of-the-art SAH binned BVH using AVX - 34ms for 260k triangles.
  • Or, ~250ms for the same data without AVX, for cross-platform purposes.
  • Builds wide BVHs: 4-wide and 8-wide, also in GPU-friendly format.
  • BVH refitting.
  • BVH optimizing: post-process your static scene BVH for ~15% more speed.
  • Comparison against Embree is included (not winning yet but closing in).

Coming up:

  • CWBVH for billions of rays/second on the GPU.
  • OpenCL examples.
  • TLAS/BLAS traversal.

The code is an implementation and continuation of my articles on BVH construction:

https://jacco.ompf2.com/2022/04/13/how-to-build-a-bvh-part-1-basics

Support / questions:

Greets

Jacco.

BVH libraries are like OCTree libraries. Easy to make and always tailored to a specific problem. That is why nobody really cares about standardizing them - you just quickly whip/code one out when you need it.

For example, my geometry uses faces that are constructed of one positive polygon and maybe a few negative ones (holes). I don't work with triangles. I need BVH, but I can't really use your library.

Having said all that, If someone want just triangle BVH class, It can just use your library if it is OK with having extra 3D math and stuff.

Also, I've noticed that you can compress your data quite a bit. This is your structure (64 bytes):

bvhvec3 lmin; unsigned int left;
bvhvec3 lmax; unsigned int right;
bvhvec3 rmin; unsigned int triCount;
bvhvec3 rmax; unsigned int firstTri; // total: 64 bytes

Alternatively, node could be this (30 byte):

bvhvec3 obbMin, obbMax;
unsigned short triCount; // if 0 then we know that we are not leaf node and firstChildID is left node
union{
int firstChildID; // Used when triCount == 0. This represent left node, and right node = firstChildID + 1
int firstTriIdx; // Used when triCount ≠ 0.
};

Advertisement

I'm sorry, but good BVH code is not something you ‘quickly whip up’. There is a massive number of papers on the topic, which just recently somewhat came to rest with the advance of RTX/DXR (which turns BVH construction into a black box). You can keep things simple, but there is a large gap between a naive BVH builder and a good one.

About your BVH layout suggestion: The layout you're optimizing is a specific layout tailored for fast GPU BVH traversal, based on a paper by Aila & Laine ("Understanding the efficiency of ray traversal on GPUs"). It stores not one AABB, but two: One for each child node. This is done to maximize compute density while lowering memory divergence and exploits an important conclusion of the paper: Ray tracing is compute-bound, not memory-bound.

A more memory-friendly layout is available in tiny_bvh: The default one. It closely follows your suggestion, with one important difference: obbMin and obbMax both start at a 16-byte boundary to allow SIMD processing. This makes the structure 32 bytes (same as yours).

More information about BVHs can be found in my blog posts on the topic.

iko iko said:
For example, my geometry uses faces that are constructed of one positive polygon and maybe a few negative ones (holes). I don't work with triangles. I need BVH, but I can't really use your library.

I have it this way: The BVH uses bounding boxes, not triangles, polygon or anything specific.
So to make BVH from a mesh, i first have to create a bounding box for each triangle, then build the BVH from that.
Box index matches triangle index, so there is no extra level of indirection.

Maybe that's pretty general and a good idea.
But i can't tell. I have used it only once and still code new BVH builders specifically for what's needed… ; )

phantomus said:
There is a massive number of papers on the topic, which just recently somewhat came to rest with the advance of RTX/DXR (which turns BVH construction into a black box)

I.e. one of the things I regularly complain about - although I still hope that one day this fixed function will get removed (much like fixed function got removed from raster pipelines on graphics hardware in the past).

phantomus said:
CWBVH for billions of rays/second on the GPU.

I'm VERY curious about this - as searching for direct comparison to my cwbvh implementation (for both - building and traversal) is hard (ofc there is embree - but any other impl. is welcome) … at least for native geometry (I don't really know if there is any comparison of using cwbvh for dynlod/virtual geometry … I do recall Intel having tried 6-wide bvh for this).

JoeJ said:
The BVH uses bounding boxes, not triangles, polygon or anything specific

If you want to go crazy - it needs to use bounding volume, not necessarily boxes (spheres are another sane object … but what about bounding torus? 11 )

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Vilem Otte said:
but what about bounding torus?

The most exotic shape i have tried is a ‘clipped bounding sphere’, which is a sphere around a cluster of geometry, plus two planes to clip it at min and max distance along the average cluster normal.
This should do almost as well as oriented boxes but needs less memory.
For my stuff it ended up still too complicated and i ended up at just bounding spheres, but i think it has its applications.

Advertisement