I made a little progress.
In my system horizontal merging is a lot more important than vertical because the visibility system is essentially 2D. Every object has an upper and a lower bound so they can be included/excluded in the visibility calculations as the observer moves up and down, but the construction of the line-of-sight geometry is all done in the X-Z-plane (Y is up in my engine).
So I came up with this algorithm (dunno if this is an established way of doing things, but it is working fine for me):
- Get a list of objects that we consider for merging, called the "main list" below
- Pick the first object in the list, called "reference box", or "ref box" below
- Check if the shape of the object is a box, if not we cannot merge it so just submit it as-is
- If the shape is a box, we will try to merge it with other boxes
- Get a new list, called the "merge list" containing all objects except the ref box
- Iterate over all objects in the list and process those that are boxes, sorted by distance to the ref box (closest first)
- For each box, check if the upper and lower bounds are the same as the ref box, if not then we cannot merge
- If the bounds match the ref box, calculate 2D convex hulls for both shapes in the X-Z-plane, then calculate the areas of the convex hulls
- Calculate the convex hull of the combined shape, then calculate the area of that convex hull
- Check if the sum of the original areas of the convex hulls roughly equal the new combined area
- If the areas are equal, we can merge the boxes together
- Continue with the rest of the objects in the merge list, attempting to merge each with with the ever-growing ref box
- Remove all processed objects from the main list
Using this technique I managed to nicely merge most of the geometry in my test level. Check out the picture below:

The colored boxes represent box objects. The colors are selected from a list of a few colors and sometimes the same color appears on adjacent objects. From the image you can see that the algoritm works with both world space axis aligned objects as well as rotated ones. You can also see (on the wall of boxes in the lower left corner) that the algorithm does not merge objects vertically.
In the test level above the number of objects decreased from 266 to 45.
Fun!