In addition to frustum culling, you can use the light's cutoff distance to cull anything too far away to be lit.
If the light is static, you can precompute the depth of the static scene once and render only the dynamic objects each frame.
You can update shadow maps only every Nth frame, at the risk of some self-shadowing on moving objects.
You can render shadows for dynamic objects only near the camera, and use only the static shadows farther away.
So, nothing magic: it is about what you expected.
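
To make the list above concrete, here is a rough sketch of the CPU-side decisions: a range test against the light's cutoff, a check for whether a light's shadow map is due for a refresh, and a filter that keeps only dynamic casters near the camera. All names here (`Sphere`, `Light`, `within_light_range`, `due_for_update`, `dynamic_casters`) are made up for illustration, bounding spheres are assumed for every object, and offsetting the refresh by the light index is my own addition so that not all lights update on the same frame.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass
class Sphere:
    center: Vec3
    radius: float


@dataclass
class Light:
    position: Vec3
    cutoff: float  # distance beyond which the light contributes nothing


def within_light_range(light: Light, bounds: Sphere) -> bool:
    """True if the bounding sphere overlaps the light's cutoff sphere."""
    return math.dist(light.position, bounds.center) <= light.cutoff + bounds.radius


def due_for_update(light_index: int, frame: int, interval: int) -> bool:
    """Refresh each light's shadow map every `interval` frames.

    The light index offsets the schedule so the cost is spread over frames
    (an assumption of this sketch, not something the technique requires).
    """
    return (frame + light_index) % interval == 0


def dynamic_casters(light: Light, camera_pos: Vec3,
                    dynamic_bounds: list[Sphere],
                    max_camera_dist: float) -> list[Sphere]:
    """Dynamic objects worth rendering into this light's shadow map:
    inside the light's range and close enough to the camera."""
    return [
        b for b in dynamic_bounds
        if within_light_range(light, b)
        and math.dist(camera_pos, b.center) - b.radius <= max_camera_dist
    ]
```

Anything that fails these tests falls back to the precomputed static depth only; the frustum test against the light's view would still run on whatever survives.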
Edit:
Actually, a lot of attention has gone into the idea of a voxelized scene representation converted to a signed distance field. You need to voxelize only once to support any number of lights, but the visibility test becomes ray tracing instead of a quick texture fetch.
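
To show what "ray tracing instead of a texture fetch" means here, below is a minimal sphere-tracing sketch of the visibility test. It assumes the distance field is available as a function `sdf(p)`; in a real renderer that would be a lookup into the voxel grid, and here an analytic sphere stands in for it. The names `sphere_sdf` and `is_lit`, the bias, the epsilon, and the step limit are all illustrative choices, not taken from any particular engine.

```python
import math


def sphere_sdf(center, radius):
    """Analytic stand-in for a voxelized distance field: a single sphere."""
    def sdf(p):
        return math.dist(p, center) - radius
    return sdf


def is_lit(sdf, point, light_pos, eps=1e-3, max_steps=128):
    """Sphere-trace from `point` toward `light_pos`.

    Returns False as soon as the ray gets within `eps` of a surface,
    True once it has travelled the full distance to the light.
    """
    dist_to_light = math.dist(point, light_pos)
    if dist_to_light == 0.0:
        return True
    direction = [(l - p) / dist_to_light for p, l in zip(point, light_pos)]

    t = 10.0 * eps  # small bias so the ray does not hit its own surface
    for _ in range(max_steps):
        if t >= dist_to_light:
            return True
        p = [point[i] + direction[i] * t for i in range(3)]
        d = sdf(p)
        if d < eps:
            return False  # something blocks the path to the light
        t += d  # safe step: nothing is closer than d
    return True  # ran out of steps; treated as lit in this sketch
```

The loop is the price you pay per shaded point: several field lookups along the ray instead of one shadow map fetch, but the same field serves every light.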