I prefer "Stronghold Crusader" (2D) visually over "Stronghold 2" (3D) because the 2D graphics are round and smooth rather than flat and edgy, and the perspective is barely visible.
The problem with getting highly detailed 2D graphics is that you probably have to design everything in 3D and then render it to 2D.
I have made a few 3D RTS games, and the nice thing is being able to move the camera when something is hidden behind a wall. This allows buildings to be taller without affecting gameplay. A common problem with 3D is that buildings are designed for flat ground while 3D terrain tends to have smooth hills, so either the ground or the building has to adapt to the other, and the result is always ugly. Tiles will have to be a lot bigger, since there is no pre-drawn background that can hold a practically unlimited number of tiles. Using one model across multiple tiles allows better graphics and fewer draw calls.
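To illustrate the multi-tile idea, here is a rough Python sketch, not code from any of my games. The names `place_building` and `count_draw_calls` are made up for the example, and counting one draw call per model is a simplification that ignores batching and instancing.

```python
def place_building(grid, building_id, x, y, w, h):
    """Mark a w-by-h block of tiles as covered by one building model.

    Returns False and leaves the grid unchanged if the footprint is
    out of bounds or overlaps a tile that is already taken.
    """
    rows, cols = len(grid), len(grid[0])
    if x < 0 or y < 0 or x + w > cols or y + h > rows:
        return False
    cells = [(cx, cy) for cy in range(y, y + h) for cx in range(x, x + w)]
    if any(grid[cy][cx] is not None for cx, cy in cells):
        return False
    for cx, cy in cells:
        grid[cy][cx] = building_id
    return True


def count_draw_calls(grid):
    """Assumption: one draw per distinct model, not one per occupied tile."""
    return len({cell for row in grid for cell in row if cell is not None})


# Hypothetical 4x4 map: a 2x2 keep and a 1x1 tower.
grid = [[None] * 4 for _ in range(4)]
place_building(grid, "keep", 0, 0, 2, 2)
place_building(grid, "tower", 3, 3, 1, 1)
```

Five tiles are occupied here, but only two models need to be drawn, and the keep can be modelled as one piece instead of four matching quarters.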
I am currently experimenting with a 2D RTS where things are drawn in 2D but with depth maps, which lets units walk around inside a large tile without the weird overlaps that old games have. Drawing depth maps by hand turned out to be a real pain, so I will probably have to make highly detailed voxel models or something similar to generate depth, ambient occlusion and normal maps. With depth maps, one can have fully animated orthographic 3D models intersecting with a 2D background as if everything was 3D.
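The depth map idea boils down to a per-pixel depth test when drawing a sprite over the background. Below is a minimal Python sketch of that test, assuming smaller depth values are closer to the camera. The function `composite` and its parameters are hypothetical names for this example, and a real renderer would do this on the GPU rather than in nested loops.

```python
def composite(bg_color, bg_depth, spr_color, spr_depth, spr_mask,
              ox, oy, base_depth=0.0):
    """Draw a sprite over a background using a per-pixel depth test.

    All images are lists of rows. A sprite pixel is written only where
    its mask is set and its depth is smaller (closer) than what is
    already stored. base_depth shifts the whole sprite into the scene.
    """
    out_color = [row[:] for row in bg_color]
    out_depth = [row[:] for row in bg_depth]
    height, width = len(bg_color), len(bg_color[0])
    for sy, mask_row in enumerate(spr_mask):
        for sx, opaque in enumerate(mask_row):
            if not opaque:
                continue
            x, y = ox + sx, oy + sy
            if not (0 <= x < width and 0 <= y < height):
                continue  # sprite pixel falls outside the background
            d = spr_depth[sy][sx] + base_depth
            if d < out_depth[y][x]:
                out_color[y][x] = spr_color[sy][sx]
                out_depth[y][x] = d
    return out_color, out_depth


# Hypothetical one-row scene: ground (depth 5) with a wall (depth 1) in the middle.
bg_color = [["ground", "wall", "ground"]]
bg_depth = [[5.0, 1.0, 5.0]]
unit_color = [["unit", "unit", "unit"]]
unit_depth = [[0.0, 0.0, 0.0]]
unit_mask = [[True, True, True]]
color, depth = composite(bg_color, bg_depth, unit_color, unit_depth,
                         unit_mask, 0, 0, base_depth=3.0)
```

The unit sits at depth 3, so it covers the ground but the wall pixel stays in front of it. That is how a unit can stand partly behind part of a pre-drawn tile without the whole sprite being sorted in front of or behind the tile.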