You can definitely cull parts of your scene for directional shadow mapping, and as TomKQT mentions, the problem is to find the correct view frustum (which in this case will just be a box due to the orthographic projection) to cull against.
One option is indeed to render the entire scene, in which case the view volume will be a light-space oriented bounding box of the scene. While far from optimal, I've found this to be a good starting point for checking that everything works as expected.
Another is to compute a tight bounding box around only the objects required for shadow rendering. Basically, the view volume should enclose just the part of the scene that is currently visible (i.e. whatever is inside the main camera's view frustum), plus whatever objects can cast a shadow into it. With this approach you compute a light-space bounding box around the main camera's view frustum, then push its near plane (the side facing the light) all the way back to the scene bound, so that casters sitting between the light and the visible region are included. Repeat the process every frame to account for changes in the main view.
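To make the second option concrete, here is a minimal sketch in plain Python. All function names (`light_basis`, `shadow_volume`, `box_corners`) are made up for illustration, and it assumes `light_dir` is the direction the light travels, so smaller light-space z means closer to the light. In a real renderer the frustum corners would come from un-projecting the NDC cube with the inverse view-projection matrix; a simple box stands in for them here.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_basis(light_dir):
    """Orthonormal (right, up, forward) basis with forward along the light."""
    forward = normalize(light_dir)
    # Pick a helper axis that is not (nearly) parallel to the light direction.
    helper = (0.0, 1.0, 0.0) if abs(forward[1]) < 0.99 else (1.0, 0.0, 0.0)
    right = normalize(cross(helper, forward))
    up = cross(forward, right)
    return right, up, forward

def to_light_space(p, basis):
    right, up, forward = basis
    return (dot(p, right), dot(p, up), dot(p, forward))

def bounds(points):
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs

def box_corners(lo, hi):
    """The 8 corners of an axis-aligned box (stand-in for real corner data)."""
    return [(x, y, z) for x in (lo[0], hi[0])
                      for y in (lo[1], hi[1])
                      for z in (lo[2], hi[2])]

def shadow_volume(frustum_corners, scene_corners, light_dir):
    """Light-space box: x/y and far plane from the camera frustum,
    near plane pulled back toward the light to the scene bound."""
    basis = light_basis(light_dir)
    fmin, fmax = bounds([to_light_space(p, basis) for p in frustum_corners])
    smin, _ = bounds([to_light_space(p, basis) for p in scene_corners])
    near = min(fmin[2], smin[2])
    return (fmin[0], fmin[1], near), fmax

# Light pointing straight down, small visible region inside a larger scene.
view = box_corners((-1, 0, -1), (1, 2, 1))
scene = box_corners((-10, 0, -10), (10, 20, 10))
mins, maxs = shadow_volume(view, scene, (0, -1, 0))
```

The resulting `mins`/`maxs` pair is what you would feed into the orthographic projection, and it is also the box to cull object bounds against (after transforming them into the same light space).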
A few more pointers:
- cascaded shadow mapping mentioned above uses that approach, but you don't actually need to implement the cascades - or you can view it as single-cascade CSM. Anyway, it should be easy enough to extend to cascades from that point, if the need arises.
- the volume produced by this technique is conservative - there may be optimizations that further reduce the rendered object count, but I don't know of any.
- the fact that the light view volume now shifts whenever the main view moves or rotates, and that there's no 1:1 correlation between the screen pixels and shadowmap texels, causes artifacts with shadow stability (shimmering shadow edges). You overcome these by making sure that your shadow map is translation and rotation invariant. It's a fairly big topic, again covered by many CSM descriptions.
- the same artifacts are produced whenever the shadow-casting light moves (think dynamic day-night cycle). I believe this one is more complicated (if it is possible at all) to solve. I've seen several games move the sun/moon in short bursts to limit the time during which the effect is visible, but I don't recall any other ways to improve this, apart from increasing the shadow resolution.
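Regarding the translation and rotation invariance mentioned in the stability point: one approach commonly described in CSM write-ups is to fit the light volume to a bounding sphere of the view frustum (so its size does not change when the camera rotates) and to snap the volume's position to whole shadow-map texels (so it does not slide by sub-texel amounts when the camera moves). A rough sketch of that idea in plain Python follows; the function names are hypothetical, and `center_ls` is assumed to be the sphere center already transformed into light space.

```python
import math

def bounding_sphere(points):
    """Centroid-centered sphere around the points. Not the minimal sphere,
    but its radius depends only on the frustum's shape, not its orientation."""
    n = len(points)
    center = tuple(sum(p[i] for p in points) / n for i in range(3))
    radius = max(math.dist(center, p) for p in points)
    return center, radius

def stable_ortho_bounds(center_ls, radius, shadow_map_size):
    """Square light-space x/y bounds of fixed size, snapped to the texel grid."""
    texel = (2.0 * radius) / shadow_map_size  # world units covered by one texel
    cx = math.floor(center_ls[0] / texel) * texel
    cy = math.floor(center_ls[1] / texel) * texel
    return (cx - radius, cy - radius), (cx + radius, cy + radius)

# Two camera positions less than one texel apart give identical bounds.
a = stable_ortho_bounds((10.3, -4.7), 8.0, 1024)
b = stable_ortho_bounds((10.305, -4.7), 8.0, 1024)
```

In practice the radius itself should also be kept constant from frame to frame (for example by rounding it up), since floating-point noise in the frustum corners would otherwise change the texel size slightly. The cost of this scheme is a looser fit than the tight box, so some shadow-map resolution is wasted. Note that it does nothing for the moving-light case from the last point.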