Hello! You will get better feedback by moving this question to one of the technical forums, most likely Graphics Programming and Theory or DirectX.
Traditionally, what I've seen engines do to support rendering many skinned characters is the following:
1. Bone LOD and keyframe/animation LOD - An up-close biped with 50 bones and 30 keyframes per animation might have a discrete or continuous LOD setup so that it uses gradually fewer bones and/or keyframes as it gets further from the camera. If your rendering engine uses matrix palette skinning on the GPU, fewer bones means less work for the vertex shader.
2. Batching and Instancing - In order to batch skinned meshes you need to create some similarities. For instance, all characters could share the same skeleton and keyframes, with only the mesh skin partitions and textures being different. Since a constant-register-based implementation of matrix palette skinning will run out of registers for multiple actors on vertex shader models below 4.0, you might consider a vertex texture lookup (available from shader model 3.0) and encode the animation data in texture(s). Also pick the vertex shader model with the best support for instancing (drawing many variations of an object in a single call).
3. Weapons and attachments - you can lose a lot of time here depending on how you render meshes attached to an actor. Some rendering engines will promote attached meshes to skinned meshes with one bone, which can eat up lots of time.
4. If textures and mesh skin partition geometry vary between characters, this can break batching/instancing. Stay aware of this and decide whether you will create a texture atlas or use some other technique here.
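To put rough numbers on points 1 and 2, here is a minimal sketch in Python (just arithmetic, not engine code). The function names, distance thresholds, bone counts and the 16 reserved registers are all made up for illustration. The sketch also assumes each bone is stored as a 4x3 matrix, so it costs 3 float4 constant registers. The 256 figure is the minimum number of float constant registers guaranteed by vs_2_0/vs_3_0, and 4096 is the float4 capacity of a single shader model 4 constant buffer.

```python
import bisect


def select_bone_lod(distance, thresholds, bone_counts):
    """Pick a discrete bone count for an actor at the given camera distance.

    thresholds  -- ascending distances at which the LOD drops (hypothetical values)
    bone_counts -- bones used in each band; needs len(thresholds) + 1 entries
    """
    if len(bone_counts) != len(thresholds) + 1:
        raise ValueError("bone_counts needs exactly one more entry than thresholds")
    # bisect_right returns how many thresholds are <= distance,
    # which is the index of the LOD band the actor falls into.
    return bone_counts[bisect.bisect_right(thresholds, distance)]


def max_actors_per_draw(constant_registers, reserved_registers,
                        bones_per_actor, registers_per_bone=3):
    """How many full bone palettes fit in the constant registers of one draw call.

    registers_per_bone=3 assumes a 4x3 matrix per bone (an assumption of this
    sketch); use 4 if the shader stores full 4x4 matrices.
    """
    available = constant_registers - reserved_registers
    if available <= 0 or bones_per_actor <= 0:
        return 0
    return available // (bones_per_actor * registers_per_bone)


if __name__ == "__main__":
    thresholds = [10.0, 30.0, 80.0]   # hypothetical LOD distances
    bone_counts = [50, 30, 15, 8]     # hypothetical bones per LOD band

    for distance in (5.0, 20.0, 100.0):
        bones = select_bone_lod(distance, thresholds, bone_counts)
        sm3 = max_actors_per_draw(256, 16, bones)    # vs_2_0 / vs_3_0 minimum
        sm4 = max_actors_per_draw(4096, 16, bones)   # one SM4 constant buffer
        print(f"distance {distance}: {bones} bones, "
              f"{sm3} actor(s) per draw on SM3, {sm4} on SM4")
```

With these assumed numbers, a full 50-bone actor takes 150 registers, so only one palette fits in 256 registers, while distant 8-bone actors pack several to a draw. That is the reason for moving the palette into a texture or a larger constant buffer once you want to instance many actors.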