So I realized I had over-complicated things waaaay more than needed. I took a peek at the Quake 3 source code to see how they dealt with bursts of packets, de-synced time, etc., and the basic algorithm is this:
1. On every incoming packet read out the remote tick (or time)
2. Compare the remote tick with our own local tick
3. If they are close, do nothing.
4. If they are a tiny bit too far apart, slightly nudge our local tick in the right direction.
5. If they are very far apart, reset our local tick to the remote tick.
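
The steps above can be sketched roughly like this in Python. This is only a minimal illustration, not the Quake 3 code: the `TickSync` class, the two threshold values, and the choice to nudge by exactly one tick per packet are all assumptions made for the example, and you would tune them to your own tick rate.

```python
from dataclasses import dataclass

# Hypothetical thresholds, measured in ticks (not taken from Quake 3).
NUDGE_THRESHOLD = 2    # at or below this difference we leave the local tick alone
RESET_THRESHOLD = 30   # above this difference we snap to the remote tick


@dataclass
class TickSync:
    local_tick: int = 0

    def on_packet(self, remote_tick: int) -> None:
        """Adjust local_tick using the tick read from an incoming packet."""
        delta = remote_tick - self.local_tick
        if abs(delta) > RESET_THRESHOLD:
            # Very far apart: reset our local tick to the remote tick.
            self.local_tick = remote_tick
        elif abs(delta) > NUDGE_THRESHOLD:
            # A bit too far apart: nudge one tick toward the remote tick.
            self.local_tick += 1 if delta > 0 else -1
        # Otherwise the ticks are close enough, so do nothing.
```

Call `on_packet` once per received packet, in addition to whatever normally advances `local_tick` each frame. Nudging by a single tick keeps corrections smooth during a burst of packets, while the reset branch recovers quickly from a large de-sync.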
I implemented this myself in under 5 minutes, and it seems to be working wonders and handles all cases very well.