The first thing to point out, which you seem to be misunderstanding, is that a server doesn't typically deal in 'frames'. A server may not be rendering anything at all; it usually deals in ticks, often at a fixed tick rate. A client has frames, but these are typically interpolations between ticks. This is the scenario I believe Glenn's article describes, and he covers it in several of his other articles.
A server maintains the authoritative physics simulation of the world, including all the actors and objects. The server then regularly (e.g. every server tick or two, depending on tick rate etc.) sends out snapshots to clients. The server doesn't need to send every client a snapshot containing all objects in the world, only the ones that are potentially visible to that client. This is where visibility and spatial partitioning schemes such as a PVS (potentially visible set) can come in useful.
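As a rough sketch of per-client snapshots, something like the following could work. Note that it uses a plain distance check as a stand-in for a real PVS, and that `Entity`, `build_snapshot` and `view_radius` are names I made up for illustration; they are not from Glenn's article.

```python
from dataclasses import dataclass
import math


@dataclass
class Entity:
    entity_id: int
    x: float
    y: float


def build_snapshot(tick, entities, viewer, view_radius):
    """Build the snapshot for one client.

    Assumption: visibility is a simple radius test around the viewer.
    A real server would more likely query a PVS or a spatial grid here.
    """
    visible = [
        (e.entity_id, e.x, e.y)
        for e in entities
        if math.hypot(e.x - viewer.x, e.y - viewer.y) <= view_radius
    ]
    # The tick number lets the client order snapshots and interpolate between them.
    return {"tick": tick, "entities": visible}


world = [Entity(1, 0.0, 0.0), Entity(2, 5.0, 0.0), Entity(3, 100.0, 0.0)]
snapshot = build_snapshot(tick=42, entities=world, viewer=world[0], view_radius=10.0)
visible_ids = [entity_id for entity_id, _, _ in snapshot["entities"]]
```

Here entity 3 is too far away from the viewer, so it is left out of that client's snapshot, which saves bandwidth.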
The client receives snapshots, and interpolates (or extrapolates if that is your thing) to give an approximation of what is going on in the authoritative server game. The client also typically runs its own matching physics simulation for the player, and compares the result with the authoritative player position from the server in the snapshot. If they match, all is good. If the server places the player somewhere different, the client must change its simulation to match the authoritative server simulation. This is called 'client side prediction'. Note that you can also do physics prediction for all game objects on the client in a similar way, which is a little more complex and more CPU hungry on clients. However, this is not described in the article, and in most cases the server-only approach he describes can work fine.
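Here is a minimal sketch of client side prediction with correction, in one dimension. It assumes the server echoes back the sequence number of the last input it processed, so the client can rewind to the server position and replay whatever the server hasn't seen yet. That replay step is a common way to do the correction rather than something spelled out above, and `PredictedPlayer`, `step` and the 64 Hz tick rate are all hypothetical.

```python
TICK_DT = 1.0 / 64.0  # assumed fixed tick length in seconds


def step(position, move_input):
    """Movement rule shared by client and server; both must run identical code."""
    return position + move_input * TICK_DT


class PredictedPlayer:
    def __init__(self):
        self.position = 0.0
        self.pending = []  # (sequence, move_input) pairs not yet acknowledged

    def apply_input(self, sequence, move_input):
        # Predict immediately instead of waiting for the server to respond.
        self.position = step(self.position, move_input)
        self.pending.append((sequence, move_input))

    def on_snapshot(self, server_position, last_acked_sequence):
        # Drop inputs the server has already simulated.
        self.pending = [(s, m) for s, m in self.pending if s > last_acked_sequence]
        # Rewind to the authoritative position, then replay the remaining inputs.
        position = server_position
        for _, move_input in self.pending:
            position = step(position, move_input)
        self.position = position


player = PredictedPlayer()
for sequence in (1, 2, 3):
    player.apply_input(sequence, 64.0)  # moves 1.0 unit per tick
predicted_before = player.position

# The server disagrees: after input 2 it has the player at 1.5, not 2.0.
player.on_snapshot(server_position=1.5, last_acked_sequence=2)
corrected = player.position
```

If the server had agreed (position 2.0 after input 2), the replay would land on exactly the position the client had already predicted, so nothing would visibly change.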
The crucial thing is to understand the tick-based scheme of running the game, and to move away from thinking in terms of frames. Frames are only used for giving a smooth view to inferior humans between ticks; the real game is tick based.
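To make the tick versus frame split concrete, here is a small sketch of a fixed tick loop with interpolation for display. The function names, the 64 Hz tick rate and the one-dimensional 'physics' are assumptions for illustration only. The same idea applies whether the two states being blended come from a local simulation or from two server snapshots.

```python
TICK_DT = 1.0 / 64.0  # assumed fixed tick length in seconds


def simulate_tick(position, velocity):
    """One fixed step of the 'real' game."""
    return position + velocity * TICK_DT


def run_frames(frame_times, velocity):
    """Advance the game in whole ticks; return one interpolated position per frame."""
    previous = current = 0.0
    accumulator = 0.0
    rendered = []
    for frame_dt in frame_times:
        accumulator += frame_dt
        # Run zero, one or several ticks, depending on how long the frame took.
        while accumulator >= TICK_DT:
            previous, current = current, simulate_tick(current, velocity)
            accumulator -= TICK_DT
        # Display only: blend between the last two ticks by the leftover fraction.
        alpha = accumulator / TICK_DT
        rendered.append(previous + (current - previous) * alpha)
    return rendered


# Three frames of uneven length: 1.5, 0.25 and 2.25 ticks long.
frames = [TICK_DT * 1.5, TICK_DT * 0.25, TICK_DT * 2.25]
positions = run_frames(frames, velocity=64.0)  # 1.0 unit per tick
```

The first frame runs one tick, the second runs none, and the third runs three, yet what gets displayed still moves smoothly. The price is that the view lags up to one tick behind the latest simulated state.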
This post I wrote recently should help explain the tick system:
In my experience a lot of people have trouble initially understanding the tick system, and especially the non-linear progression of time. As animals we are used to seeing time as linear, so we have trouble understanding that a simulation can calculate time steps in a non-linear fashion, then use interpolation to smooth this back to a 'human view'. There can be a 'eureka' moment when people get it.