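Here's a summary of the theory, from Clemson University. For 2D graphics, you use a 3x3 matrix to represent rotation, translation, and scaling. For 3D graphics, you use a 4x4 matrix. The extra row and column come from homogeneous coordinates: each point gets a trailing 1, which is what lets translation be written as a matrix multiply like the other two.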
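To make the 2D case concrete, here is a minimal sketch in plain Python. The helper names (translation, rotation, scaling, matmul, apply) are made up for illustration and are not taken from the Clemson or Microsoft material. It assumes column vectors, so the rightmost matrix in a product is applied first.

```python
import math

def translation(tx, ty):
    # Translation lives in the third column; it only works because
    # every point carries a trailing 1.
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

def rotation(theta):
    # Counter-clockwise rotation about the origin, theta in radians.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    return [[sx, 0.0, 0.0],
            [0.0, sy, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    # Plain 3x3 matrix product a * b.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, point):
    # Promote (x, y) to (x, y, 1), multiply, drop the last component.
    v = (point[0], point[1], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]
    return (out[0], out[1])

# Scale by 2, then rotate 90 degrees, then move 5 units along x.
combined = matmul(translation(5.0, 0.0),
                  matmul(rotation(math.pi / 2), scaling(2.0, 2.0)))
result = apply(combined, (1.0, 0.0))
print(result)  # approximately (5, 2), up to floating-point rounding
```

The point of the exercise is that three separate operations collapse into one matrix, so each point costs a single multiply no matter how many steps went into it.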
Here's the Microsoft discussion of how to do it in C#.
There are many writeups on this, and even a Coursera course. Search for “affine transformations 2D graphics”.
Essentially all 3D graphics is done this way. GPUs are built to do 4x4 matrix multiplies very fast. This is a basic concept in game programming.
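The 3D version looks the same with one more dimension. The sketch below is plain Python, not GPU code, and the names (identity, translate, rotate_z, transform, model) are hypothetical. It builds one 4x4 matrix and applies it to every vertex of a triangle, which is roughly the job a vertex shader does per vertex.

```python
import math

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def translate(tx, ty, tz):
    # Translation goes in the fourth column.
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

def rotate_z(theta):
    # Rotation about the z axis, theta in radians.
    c, s = math.cos(theta), math.sin(theta)
    m = identity()
    m[0][0], m[0][1] = c, -s
    m[1][0], m[1][1] = s, c
    return m

def matmul(a, b):
    # Plain 4x4 matrix product a * b.
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, p):
    # Promote (x, y, z) to (x, y, z, 1), multiply, drop the last component.
    v = (p[0], p[1], p[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (out[0], out[1], out[2])

# Rotate half a turn about z, then push 5 units down the -z axis.
model = matmul(translate(0.0, 0.0, -5.0), rotate_z(math.pi))

triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
moved = [transform(model, v) for v in triangle]
```

Order matters here: swapping the two arguments to matmul would translate first and then rotate about the origin, which generally gives a different result.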