I implemented an iterative constraint solver using impulses as described in Bender & Schmitt 2006:
Solving for the normal (penetration) constraint or the tangent constraint in isolation converges as expected: it takes about 4-5 iterations before the penetration or tangent position error is within the 'target' range (< 0.0001). But solving both constraints simultaneously takes 150+ iterations before both are within target. That sounds like too many for a single constraint, but maybe it is expected for a tight constraint?
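To make the question concrete, here is a minimal toy sketch of why two constraints solved in alternation can need far more iterations than either one alone. It is not the Bender & Schmitt algorithm: it assumes a linear 2x2 system in which a hypothetical `coupling` value describes how strongly a correction along the normal disturbs the tangent error (and vice versa), and it sweeps over the two rows Gauss-Seidel style until both residuals are below the tolerance.

```python
def gauss_seidel_iterations(coupling, tol=1e-4, max_iter=10_000):
    """Count sweeps until both residuals of a toy 2x2 system drop below tol.

    Toy system (an assumption for illustration, not taken from the paper):
        [[1, c], [c, 1]] @ [jn, jt] = [1, 1]
    where c = coupling, jn is the normal impulse and jt the tangent impulse.
    """
    jn = 0.0
    jt = 0.0
    for iteration in range(1, max_iter + 1):
        # Solve the normal row with the tangent impulse held fixed.
        jn = 1.0 - coupling * jt
        # Solve the tangent row using the freshly updated normal impulse.
        jt = 1.0 - coupling * jn
        # Residual of each row after the sweep.
        err_n = abs(jn + coupling * jt - 1.0)
        err_t = abs(coupling * jn + jt - 1.0)
        if err_n < tol and err_t < tol:
            return iteration
    return max_iter


if __name__ == "__main__":
    for c in (0.0, 0.5, 0.9, 0.97):
        print(f"coupling={c}: {gauss_seidel_iterations(c)} sweeps")
```

In this toy model the error shrinks by a factor of roughly `coupling ** 2` per sweep, so the iteration count grows quickly as the coupling approaches 1. If something similar is happening in the real solver, the slowdown would come from the normal and tangent corrections undoing each other rather than from the tolerance itself, but that is only a guess from the symptoms described.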