It had a GPU in the sense of dedicated hardware for drawing graphics, but it was quite different from what you would expect of today's 3D-centric graphics cards, which handle dense point clouds and gigabytes of textures.
Mostly the system displayed 2D graphics: images that were scaled, flipped, and rotated. It could also display a limited number of 3D models and meshes.
The hardware did not support floating point, so everything needed to be done with fixed point.
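To make the fixed-point idea concrete, here is a minimal sketch in Python. The 12-bit fractional format and all of the names (`FRAC_BITS`, `to_fixed`, `fx_mul`, `fx_div`) are assumptions chosen for illustration, not details of the actual hardware or its SDK. Values are stored as plain integers scaled by a power of two, so addition and subtraction are ordinary integer operations, while multiplication and division need a shift to keep the scale consistent.

```python
# Assumed format for illustration: 12 fractional bits (not stated in the text).
FRAC_BITS = 12
ONE = 1 << FRAC_BITS  # the fixed-point representation of 1.0


def to_fixed(x):
    """Convert an ordinary number to fixed point (setup and testing only)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert a fixed-point value back to a float for display."""
    return f / ONE


def fx_mul(a, b):
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fractional bits, so shift it
    back down by FRAC_BITS to return to the working format.
    """
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two fixed-point values.

    The numerator is pre-scaled so the quotient keeps FRAC_BITS fractional
    bits. Python's // is only a stand-in for whatever performs the integer
    division; note that it rounds toward negative infinity.
    """
    return (a << FRAC_BITS) // b


def fx_scale_point(x, y, scale):
    """Scale a 2D point by a fixed-point factor, as when scaling an image."""
    return fx_mul(x, scale), fx_mul(y, scale)
```

For example, `fx_mul(to_fixed(1.5), to_fixed(2.0))` yields the same integer as `to_fixed(3.0)`. The trade-off is a fixed range and precision: with 12 fractional bits the smallest representable step is 1/4096.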
The hardware also had no built-in division instruction. Instead, it had a co-processor to which you could send division and square root operations, which it then computed asynchronously.
Quite frequently, games would build a 2D world from images with simple manipulations applied, then draw a few 3D elements on top. Careful manipulation of the depth buffer (through depth images and similar techniques) allowed the 3D characters to walk behind, underneath, and otherwise interact with 2D objects in a believable way. As an example, you might have a 2D drawing of a tiki hut with depth information saying the roof is near the camera but the floor is farther away, so when the 3D character walks under the hut it is properly occluded. Layered depth images could allow two or three highly detailed 2D drawings to interact seamlessly with the low-detail 3D character.
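The tiki hut example can be sketched as a per-pixel depth test. This is a software illustration only, not how the hardware exposed the feature. The names (`load_background`, `draw_fragments`, `HUT_COLOR`, `HUT_DEPTH`, `CHARACTER`) and the convention that a smaller depth value means nearer to the camera are assumptions made for the sketch.

```python
def load_background(color_image, depth_image):
    """Copy the 2D art and its depth image into fresh frame and depth buffers."""
    color_buf = [row[:] for row in color_image]
    depth_buf = [row[:] for row in depth_image]
    return color_buf, depth_buf


def draw_fragments(color_buf, depth_buf, fragments):
    """Draw 3D pixels as (x, y, depth, color), keeping only those that are
    nearer than what the depth buffer already holds at that position."""
    for x, y, depth, color in fragments:
        if depth < depth_buf[y][x]:  # assumed convention: smaller = nearer
            color_buf[y][x] = color
            depth_buf[y][x] = depth


# A tiny 4x3 "tiki hut": the top row is roof (near), the rest is floor (far).
HUT_COLOR = [list("RRRR"), list("FFFF"), list("FFFF")]
HUT_DEPTH = [[2] * 4, [10] * 4, [10] * 4]

# A one-pixel-wide "character" at depth 5, covering column 1 on every row.
CHARACTER = [(1, y, 5, "C") for y in range(3)]
```

Calling `load_background(HUT_COLOR, HUT_DEPTH)` and then `draw_fragments` with `CHARACTER` leaves the roof pixel intact on the top row, because the roof (depth 2) is nearer than the character (depth 5), while the character replaces the floor pixels (depth 10) on the rows below. Layering several depth images would amount to loading each 2D drawing through the same depth test before drawing the 3D elements.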