Yeah I would assume that wasted RAM is the only downside (and load times).
In a recent test on a game that's already shipped using D3D11 (an internal D3D12 port), it took about 5 seconds to create "base" PSOs for every permutation (where "base" means default raster/blend/depth-stencil state, but all other data permuted to cover every possible use as declared by the game's metadata). That's a long time to block the main thread at startup - you'd probably want to create your main-menu shaders' PSOs first to get into the game quickly, and then do the rest afterwards.
We actually support creating PSOs on demand (a thread-safe hashmap is searched for a PSO when creating a drawable, and a miss results in the requested PSO being compiled and added to the map), so all we do is spin up a background thread that starts doing this many-seconds worth of work concurrently. During that time we might get cache misses where a PSO lookup in the hash table fails and one of the main threads has to create it on demand, but after the background thread completes, the cache should be fully warmed up.
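A minimal sketch of that pattern (names and the `CompilePso` stand-in are hypothetical - in the real engine the key would hash the full PSO description and the compile would call `ID3D12Device::CreateGraphicsPipelineState`):

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real PSO key/object types.
using PsoKey = std::string;
using Pso = int;

// Stand-in for the expensive driver compile.
Pso CompilePso(const PsoKey& key) {
    return static_cast<Pso>(key.size());
}

class PsoCache {
public:
    // Called from any render thread; compiles on a cache miss.
    Pso GetOrCreate(const PsoKey& key) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = map_.find(key);
            if (it != map_.end()) return it->second;
        }
        // Compile outside the lock so other threads aren't blocked
        // for the duration of the (slow) compile.
        Pso pso = CompilePso(key);
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.emplace(key, pso).first->second;
    }

    // Background warm-up: pre-compile every declared permutation
    // while the main threads carry on rendering.
    void WarmAsync(std::vector<PsoKey> keys) {
        warmer_ = std::thread([this, keys = std::move(keys)] {
            for (const auto& k : keys) GetOrCreate(k);
        });
    }

    ~PsoCache() {
        if (warmer_.joinable()) warmer_.join();
    }

private:
    std::mutex mutex_;
    std::unordered_map<PsoKey, Pso> map_;
    std::thread warmer_;
};
```

If two threads miss on the same key simultaneously, both compile but `emplace` keeps the first result, which is harmless (just a bit of wasted work during warm-up).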
The other cost is simply managerial - how do you know all your PSOs at load time? We force shader authors to declare which render target formats, MSAA modes, primitive topologies, index buffer formats (for the IB strip cut value) and vertex buffer layouts their shaders are compatible with. This is a pain for shader authors, but it does give you a lot of knowledge about PSO requirements at build time. We also use this data in the other rendering back-ends in development builds (e.g. D3D11) and assert if a shader is used in a way that wasn't declared by its author.
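The declared-metadata validation could look something like this (the enum values, struct layout, and `ValidateDrawState` helper are all illustrative assumptions, not our actual pipeline):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subset of the state a shader author would declare.
enum class RtFormat { RGBA8, RGBA16F, BGRA8 };
enum class Topology { TriangleList, LineList };

struct ShaderMetadata {
    std::vector<RtFormat> rtFormats;   // render target formats declared compatible
    std::vector<Topology> topologies;  // primitive topologies declared compatible
    std::vector<uint32_t> msaaModes;   // sample counts declared compatible

    bool Allows(RtFormat fmt, Topology topo, uint32_t samples) const {
        auto has = [](const auto& v, auto x) {
            return std::find(v.begin(), v.end(), x) != v.end();
        };
        return has(rtFormats, fmt) && has(topologies, topo) &&
               has(msaaModes, samples);
    }
};

// In development builds on the other back-ends (e.g. D3D11), assert
// that the current draw state matches what the author declared, so
// undeclared permutations are caught long before the D3D12 port.
void ValidateDrawState(const ShaderMetadata& meta, RtFormat fmt,
                       Topology topo, uint32_t samples) {
    assert(meta.Allows(fmt, topo, samples) &&
           "shader used with a state its author did not declare");
}
```

The full PSO permutation list at build time is then just the cross product of each shader's declared sets.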