You can do authentication and lobby over plain HTTP, HTTPS, or HTTP/2; you only need the WebSocket for the low-latency push bits. (But once you have it, you can of course do more with it.)
Once authenticated, issue some kind of token that the WebSocket connection can use to authorize itself on connection setup, similar to how you'd use a session identifier cookie in HTTP.
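As a minimal sketch of that handshake (Go with the github.com/gorilla/websocket package; the in-memory token map, the /login and /ws paths, and the "player" form field are illustrative assumptions, not a prescription):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"net/http"
	"sync"

	"github.com/gorilla/websocket"
)

var (
	mu     sync.Mutex
	tokens = map[string]string{} // token -> player; a real service would use signed tokens (e.g. JWT) or a shared store
)

var upgrader = websocket.Upgrader{}

// login authenticates over ordinary HTTP/S and hands back a token.
func login(w http.ResponseWriter, r *http.Request) {
	// ... credential check against r omitted ...
	buf := make([]byte, 16)
	rand.Read(buf)
	token := hex.EncodeToString(buf)
	mu.Lock()
	tokens[token] = r.FormValue("player") // "player" is a hypothetical form field
	mu.Unlock()
	w.Write([]byte(token))
}

// ws upgrades to a WebSocket only if the token from /login checks out,
// much like checking a session cookie on a plain HTTP request.
func ws(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	player, ok := tokens[r.URL.Query().Get("token")]
	mu.Unlock()
	if !ok {
		http.Error(w, "unauthorized", http.StatusUnauthorized)
		return
	}
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		return
	}
	defer conn.Close()
	// From here on, push the low-latency updates over conn.
	conn.WriteMessage(websocket.TextMessage, []byte("hello "+player))
}

func main() {
	http.HandleFunc("/login", login)
	http.HandleFunc("/ws", ws)
	http.ListenAndServe(":8080", nil)
}
```

Passing the token as a query parameter keeps the sketch short; a Sec-WebSocket-Protocol entry or a first-message auth frame works just as well and keeps tokens out of access logs.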
There are three common ways of setting up services:
1) Each "endpoint" or "service" lives in its own process. Use cheap ways of spawning lots of processes -- Kubernetes, Nomad, Swarm, ... Use some kind of smart HTTP router on the way in to send the right requests to the right processes (NGINX with path- and host-matching rules, for example; there's a routing sketch after this list). Most solutions like these call themselves "microservices." This makes it very easy to develop things in parallel across large teams, but it comes with a higher operational cost, as well as the communication cost of making sure that all the different services adhere to the same data structures, rules, and protocols.
2) There's one binary/program that can "do anything." You might spin up multiple copies of it (if it's stateless, or uses shared-memory state or sharding), but it's fundamentally "one thing." This is usually known as "monolith architecture," and it's super easy to get started with and to test in a developer sandbox. When your system gets really big, it starts causing development/deployment friction on large teams. (WordPress and other PHP sites fall into this category, even though they may have different entry-point scripts for different pages/services.)
3) There's a binary/program per major functional component. Maybe sign-in and lobby is one, game serving is another, and the microtransaction store is a third. Each component may or may not be built on a different technology stack, but the routing is typically set up entirely by host -- lobby.game.com, store.game.com, play.game.com, etc. This is somewhat of a middle ground between microservices and monoliths. It lets you choose "the right tool for the job" for specialized systems, but it still has some cost and overhead in that your system is diverse.
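For the front-door routing in options 1 and 3, a plain reverse proxy is often all you need. Here's a minimal sketch using Go's standard library (the backend addresses, hostnames, and path rules are made-up assumptions; NGINX server/location blocks express the same idea declaratively):

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// proxyTo builds a reverse proxy for one backend; the addresses are
// placeholders for wherever your lobby/store/play processes run.
func proxyTo(raw string) *httputil.ReverseProxy {
	target, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return httputil.NewSingleHostReverseProxy(target)
}

func main() {
	lobby := proxyTo("http://10.0.0.10:8081")
	store := proxyTo("http://10.0.0.11:8082")
	play := proxyTo("http://10.0.0.12:8083")

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		switch {
		// Host-based routing, as in option 3:
		case strings.HasPrefix(r.Host, "store."):
			store.ServeHTTP(w, r)
		case strings.HasPrefix(r.Host, "play."):
			play.ServeHTTP(w, r)
		// Path-based routing, like an NGINX location rule in option 1:
		case strings.HasPrefix(r.URL.Path, "/lobby/"):
			lobby.ServeHTTP(w, r)
		default:
			http.NotFound(w, r)
		}
	})
	http.ListenAndServe(":80", nil)
}
```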
Separate from those three ways of slicing things, there's the question of horizontal scalability. I.e., if you want to be able to "start more processes of the same kind" to serve more load, then whatever service those processes run needs to be okay with load balancing, where different players end up using different instances of the same service code. Typically, for web/database systems, this is not so bad, as the state lives in the database. But for persistent game processes, you end up doing "sharding," where players can only directly interact with other players on the same physical server. If you have a system that can do 10,000 players per server, and you want 100,000 people all in the same city market square in your world, all able to talk to and interact with each other, you can't just spawn 10 servers that handle 10,000 players each: for any given player, there's a 9/10 chance that a player they want to interact with lives on another one of those servers, so all servers end up needing to know about all 100,000 players, which ends up not saving any capacity. (There are in turn various ways of trying to solve this, all with significant gameplay and technology trade-offs. My recommendation is not to go there unless you are 100% sure that you absolutely need to!)
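To make the sharding idea concrete, here's a toy sketch (the shard count and the choice of keying on an area ID are assumptions; real systems shard on whatever keeps interacting players together):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const numShards = 10 // e.g. 10 servers at 10,000 players each

// shardFor deterministically maps an area of the game world to one
// server. Everyone in the same area lands on the same shard and can
// interact; players whose areas hash elsewhere never see each other.
func shardFor(areaID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(areaID))
	return h.Sum32() % numShards
}

func main() {
	// The market square maps to exactly one shard, so it is still
	// capped at a single server's capacity -- sharding only helps
	// when the load is spread across many areas.
	fmt.Println(shardFor("market-square"))
	fmt.Println(shardFor("northern-forest"))
}
```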