WebSocket Integration in an ECS Architecture

Started by August 06, 2023 09:09 PM
7 comments, last by marem83 1 year, 5 months ago

Hello everyone!

First, I'd like to be transparent and say that this is my first time posting on these forums (although I have been an avid reader!), so please let me know if I missed any posting guidelines.

I'm a software engineer who has been pursuing game development as a hobby for the past 3 to 4 years. I'm currently exploring game networking for a fast-paced 2D shooter game. While I've delved into various aspects of game dev, networking is a new challenge for me. I'm using an ECS architecture with TypeScript/HTML canvas for the client and TypeScript/Node.js for the server, employing WebSockets for communication.

I'm facing an issue integrating WebSocket initialization and events management with ECS. My server initiates the WebSocket server and, on a new connection, a function handles initializing the network event callbacks. A significant amount of functionality is within this single function (e.g. calls to world.createEntity and world.destroyEntity), which seems inefficient. My challenge is determining where to process WebSocket messages within the ECS structure.

As an example, let's assume that the following JSON message is sent by a game client to the server:

{
	"type": "command",
	"frame": 54,
	"payload": {
		"moveRight": true, 
		"fire": true
	}
}

Each of my player entities has a CommandStreamComponent that contains the past X frames of input, which the server can use as it progresses through its simulation. This component looks a little like this:

class CommandStreamComponent extends Component {
	commands: CircularBuffer<CommandMessage>
}

class CommandMessage{
	frameNumber: number
	command: Command
}

class Command{
	moveRight: boolean = false
	moveLeft: boolean = false
	fire: boolean = false
}

What I am missing here is the connection between the WebSocket event callbacks (e.g. on('message', …)) and the ECS components/systems. In other words, I aim for seamless integration where network event messages are smoothly channeled into the ECS workflow for various types of messages.

How can this be accomplished without compromising clarity? How should WebSockets (or any other full duplex communication solution) be integrated into an ECS architecture? I'm not religiously complying with pure ECS principles but aim for a clean networking approach within it. Insights from experienced developers in ECS and/or netcode would be invaluable.

TL;DR: I need a clean method for integrating WebSockets and passing network messages within an ECS flow. Some resources I've used while studying game networking include:

My personal feeling is that this kind of low-level thing has no place in an ECS. ECS is really only useful for composing arbitrary data-driven entities from separate components (entity is sum of its parts). This kind of architecture is not relevant for most low-level problems, such as networking or doing physics simulation. Generally at a low level you have precise knowledge of the data types and so there is no need for the composition abilities that ECS provides. Use ECS for only gameplay code where it can actually be a benefit rather than a hindrance.

Instead, you should have a NetworkSystem or something like that in your engine which manages the interface between low-level networking and the rest of the engine (the ECS). It should abstract away the networking so that the rest of the engine doesn't care where the information is coming from. The internals of NetworkSystem would depend on what kind of game you are making. It might do stuff like receive object positions from the socket and then apply them to the relevant objects in the engine, or it might send events to other systems to notify them of some condition or data.

You might still need a NetworkComponent if there is a need for each entity to have some local data related to networking, such as your command buffers. Alternatively, you could store all the networking state in the NetworkSystem, and use a hash map to determine the NetworkComponent for a given entity.

I see!

For sure, there's an argument to be made for keeping the game logic in the ECS and the rest outside. This is actually what I started doing originally for networking (although it's more of a collection of functions at the moment than an actual network manager), and also what I am doing with UI (systems publish events, UI elements subscribe to them). It works relatively well.

However, my concerns about a centralized NetworkManager are the following:

  • A centralized NetworkManager that handles all network-related operations would introduce tight coupling between it and several Systems/Components, which is what I am attempting to avoid with ECS by having small components and small Systems that mostly do one thing. It makes testing a bit more cumbersome as well, as I'd have to mock it for various systems.
  • I am also concerned about an event-based manager that performs updates to the components in the game, as I would not have much control over when these updates happen in the server tick lifecycle. This seems even worse for the creation or deletion of entities.

I actually thought today of simply having a lightweight NetworkEventListener which would push the network messages in a message queue, which would itself be consumed by a NetworkMessageSystem in ECS, exactly when I need them to be dequeued. I would still need to mock the Network messages queue to unit test the NetworkMessageSystem, but that would not really be an issue as it'd be done in one place only.
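
To make the idea concrete, here is a minimal sketch of that listener-plus-queue approach. Every name in it (NetworkMessageQueue, NetworkMessageSystem, InboundMessage, the Map standing in for per-player command buffers) is a placeholder invented for illustration, and a plain array replaces the CircularBuffer. The socket callback only pushes; the system drains at one known point in the tick.

```typescript
// Hypothetical types; they mirror the Command/CommandMessage classes from the first post.
interface Command { moveRight: boolean; moveLeft: boolean; fire: boolean }
interface CommandMessage { frameNumber: number; command: Command }

// What the socket callback enqueues: the sender plus the already-parsed message.
interface InboundMessage { connectionId: number; message: CommandMessage }

class NetworkMessageQueue {
  private pending: InboundMessage[] = [];

  // Called from the WebSocket 'message' callback. No game logic happens here.
  push(item: InboundMessage): void {
    this.pending.push(item);
  }

  // Hands over everything received since the previous drain and resets the queue.
  drain(): InboundMessage[] {
    const drained = this.pending;
    this.pending = [];
    return drained;
  }
}

class NetworkMessageSystem {
  private queue: NetworkMessageQueue;
  // Stand-in for the per-player command buffers, keyed by connection id (an assumption).
  private streams: Map<number, CommandMessage[]>;

  constructor(queue: NetworkMessageQueue, streams: Map<number, CommandMessage[]>) {
    this.queue = queue;
    this.streams = streams;
  }

  // Runs once per server tick, so components are only mutated at this point.
  // Returns how many messages were dropped because the sender is unknown.
  update(): number {
    let dropped = 0;
    for (const { connectionId, message } of this.queue.drain()) {
      const stream = this.streams.get(connectionId);
      if (stream === undefined) {
        dropped++;
        continue;
      }
      stream.push(message);
    }
    return dropped;
  }
}
```

For unit tests, the queue can be filled by hand and update() called directly, so no socket needs to be mocked.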

Thanks for the insight!

marem83 said:
My challenge is determining where to process WebSocket messages within the ECS structure. … I aim for seamless integration where network event messages are smoothly channeled into the ECS workflow for various types of messages.

Can you explain more what you mean here?

While you linked to a bunch of articles, I'm not sure you've deeply understood what they say.

marem83 said:
A centralized NetworkManager which would handle all network related operations would introduce tight coupling between it and several Systems/Components, which is what I am attempting to avoid with ECS

Yes, done badly it can have tight coupling. The network messaging system typically is unrelated to ECS in the games I've worked on.

You do need to have a system that handles how messages are serialized and deserialized, how connections are managed, how security and encryption take place, and so on. How that system binds with the rest of the game is highly dependent on the game's design and implementation.

marem83 said:
I am also concerned about an event-based manager which would perform updates to the components in the game, as I would not have much control over when these updates happen in the server tick lifecycle. This seems even worse for the creation or deletion of entities.

Most (but not all) network systems work best with state-based events. It is far less data to send a message “player 1 is now walking” than it is to send a message every simulation tick saying that the position has been updated. The first, done well, generates a single message that can be less than a byte; the second, done badly, can generate a kilobyte of updates every second.

Creation and deletion of entities should happen relatively infrequently, but there are events for those as well. In many designs there are two IDs, the network ID and the local ID, which can be different. The authority indicates that an object needs to be created and gives it a network ID, along with its serialized contents. When the authority destroys an object, it can just pass along the ID of the object to be destroyed. All manipulation and updates to the entity use the network ID; the game client can keep a mapping from network IDs to local object pointers.
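
A small sketch of that mapping, assuming numeric IDs on both sides. NetworkIdRegistry and its method names are made up for this example; a real engine would likely tie this to its own entity handles.

```typescript
type NetworkId = number;     // assigned by the authority, shared across machines
type LocalEntityId = number; // only meaningful inside one process

// Hypothetical lookup table from network IDs to local entity IDs.
class NetworkIdRegistry {
  private localByNetwork = new Map<NetworkId, LocalEntityId>();

  // The authority announced a new object: remember which local entity represents it.
  register(networkId: NetworkId, localId: LocalEntityId): void {
    if (this.localByNetwork.has(networkId)) {
      throw new Error("network id " + networkId + " is already registered");
    }
    this.localByNetwork.set(networkId, localId);
  }

  // Used by every update message; undefined means "unknown object, drop the message".
  resolve(networkId: NetworkId): LocalEntityId | undefined {
    return this.localByNetwork.get(networkId);
  }

  // The authority destroyed the object: forget it and return the local entity to delete.
  unregister(networkId: NetworkId): LocalEntityId | undefined {
    const localId = this.localByNetwork.get(networkId);
    this.localByNetwork.delete(networkId);
    return localId;
  }
}
```

Returning undefined for unknown IDs keeps late or duplicate messages from crashing the receiver; the caller can log and drop them.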

marem83 said:
I would still need to mock the Network messages queue to unit test the NetworkMessageSystem, but that would not really be an issue as it'd be done in one place only.

Automated tests for network messages are usually quite straightforward replay tests, done as an integration test. Given a stream of data does a test object get created, and similarly, given a test object does an expected stream of data get generated. The thing being created is often an event that gets broadcast to any interested listening systems. Since the task creates objects it's typically far too slow to be a good unit test.

There are also tests I've seen in that layer that run tests in all non-final builds in addition to test runners. Probably the most useful is that whenever you serialize an object for the wire, your non-final release feeds the bytes into the deserialize function and ensures they compare equal. I've seen a surprisingly large number of bugs caught early that way, most commonly NaN failing the tests as they never compare equal, and also bugs for people failing to maintain their serialization scripts.
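
As a rough illustration of that round-trip check, here is a sketch that uses JSON as the wire format (an assumption that matches the prototype in this thread). CHECK_ROUND_TRIP, serializeForWire and deepEqual are invented names. Note that JSON.stringify writes NaN as null, so a NaN in the data is exactly the kind of thing this catches.

```typescript
// Assumption: a build flag that is true in every non-final build.
const CHECK_ROUND_TRIP: boolean = true;

// Structural comparison using ===, so NaN never compares equal to anything.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const aRecord = a as Record<string, unknown>;
  const bRecord = b as Record<string, unknown>;
  const aKeys = Object.keys(aRecord);
  if (aKeys.length !== Object.keys(bRecord).length) return false;
  return aKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(bRecord, key) &&
      deepEqual(aRecord[key], bRecord[key])
  );
}

// Serializes a value and, in checked builds, verifies it survives deserialization.
function serializeForWire(value: unknown): string {
  const wire = JSON.stringify(value);
  if (CHECK_ROUND_TRIP && !deepEqual(JSON.parse(wire), value)) {
    throw new Error("round-trip mismatch for " + wire);
  }
  return wire;
}
```

With a binary format the same pattern applies; only the serialize and deserialize calls change.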

marem83 said:
How can this be accomplished without compromising clarity?

Treat network communications as an asynchronous serialized event bus. Create as many channels within that bus as you want by giving each a channel ID. Generally there are three you need. Unreliable unordered is for continuous status updates where you don't care if some are lost; they'll almost always get through eventually. Reliable unordered means your network system keeps retransmitting until acknowledged; messages get there eventually, but you're not picky about order. Reliable ordered is a data stream; any communication issue will block it until gaps are retransmitted, but you know it will arrive in order.

Also, remember when writing the code that it can fail at any time for any reason, including reasons entirely outside your flow.

With those in mind, any system can use the event bus to send and receive messages as a once-off or as a continuous stream.

frob said:
While you linked to a bunch of articles, I'm not sure you've deeply understood what they say.

Indeed. I do understand a fair bit of these articles (I believe so, at least); however, this is still mostly theoretical for me as I am implementing these concepts for the first time. It is likely that I misunderstood various aspects, but I'm at a point where iterating on an implementation is necessary to grasp what I may have missed.

Before I continue answering, I would like to add a clarification: my issue here is about binding netcode to an ECS architecture, rather than about client-server communication itself. The two are related, obviously, but my question is about the integration.

frob said:
Can you explain more what you mean here?

For sure! Let's simplify a bit and assume a PlayerMovementSystem in the server. This PlayerMovementSystem would query all entities with various components such as colliders, transform, etc. but also a PlayerCommandsComponent, which contains a buffer of player commands for the past x client ticks (future simulation ticks on the server). On the other hand (still on the server), I start the WebSocket server more or less as follows (note that I removed several things including error handling to avoid cluttering the page, and using JSON for prototyping):

import WebSocket, { WebSocketServer } from 'ws'

const wss = new WebSocketServer({ port: 8080 })

wss.on('connection', (ws) => {
    handleNewConnection(ws)
})

function handleNewConnection(connection: WebSocket) {
    // react to the new connection

    connection.on('message', (message) => {
        const data = JSON.parse(message.toString())
        // do something with the message
    })

    connection.on('close', () => {
        // react to closed connection
    })
}

Taken separately, these parts are relatively straightforward. What I am struggling with is deciding how and where to channel these network messages to populate the PlayerCommandsComponent buffer. Commands would not be the only type of message I'm interested in, obviously, but this is a starting point.

I hope this clarifies my question! I will answer your other points below.

frob said:
Most (but not all) network systems work best with state-based events. It is far less data to send a message “player 1 is now walking” than it is to send a message every simulation tick saying that the position has been updated. The first, done well, generates a single message that can be less than a byte; the second, done badly, can generate a kilobyte of updates every second.

I definitely agree with that, although it seems more related to optimization than architecture, as the ECS Systems use the same data (it's just the way client and server communicate that changes). I might be wrong here, so feel free to correct me.

frob said:
Creation and deletion of entities should happen relatively infrequently, but there are events for those as well. In many designs there are two IDs, the network ID and the local ID, which can be different. The authority indicates that an object needs to be created and gives it a network ID, along with its serialized contents. When the authority destroys an object, it can just pass along the ID of the object to be destroyed. All manipulation and updates to the entity use the network ID; the game client can keep a mapping from network IDs to local object pointers.

This is clear, and I agree of course. This is what I am already doing. Each entity on the server has an entity id that is local to the server (same thing on the client), but each network entity gets assigned a network id (connectionId, although the name is subject to change) by the server. However, unless I am misunderstanding, this is more related to client-server communication than internal server code architecture.

frob said:
Treat network communications as an asynchronous serialized event bus. Create as many channels within that bus as you want by giving each a channel ID. Generally there are three you need. Unreliable unordered is for continuous status updates where you don't care if some are lost; they'll almost always get through eventually. Reliable unordered means your network system keeps retransmitting until acknowledged; messages get there eventually, but you're not picky about order. Reliable ordered is a data stream; any communication issue will block it until gaps are retransmitted, but you know it will arrive in order.

Interesting! This is an area I need to dig into as I am quite inexperienced with UDP implementations.

Thank you for your valuable feedback!

Regarding binding network subsystem to entities:

First, there is only one network, in the same way that there is only one “fetch” implementation and only one DOM – it's a physically limited, singleton concept.

I assume that you also have only one scene graph, and only one simulation world (for collision and such.) The network is a singleton just like those.

Second, the way to route messages is typically to use some kind of addressing. If every entity has a unique ID, and every component within the entity has a unique-to-the-entity ID, then a message can be routed to the right entity and component simply with the path “entity-id, component-id.” Thus, the framing of each message should contain that tuple, and then the innards of each message only need to be known by the component receiving it.
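
Here is one way that addressing could look in code. This is only a sketch: Envelope, MessageRouter and ComponentHandler are invented names, and component IDs are assumed to be strings. The router looks only at the address tuple and never inspects the body.

```typescript
// Framing: the address tuple plus an opaque body that only the recipient understands.
interface Envelope { entityId: number; componentId: string; body: unknown }

type ComponentHandler = (body: unknown) => void;

// Hypothetical router keyed by the "entity-id, component-id" path.
class MessageRouter {
  private handlers = new Map<string, ComponentHandler>();

  private static key(entityId: number, componentId: string): string {
    return entityId + ":" + componentId;
  }

  register(entityId: number, componentId: string, handler: ComponentHandler): void {
    this.handlers.set(MessageRouter.key(entityId, componentId), handler);
  }

  unregister(entityId: number, componentId: string): void {
    this.handlers.delete(MessageRouter.key(entityId, componentId));
  }

  // Returns false when nobody is registered at that address, so the caller can log and drop.
  deliver(envelope: Envelope): boolean {
    const handler = this.handlers.get(MessageRouter.key(envelope.entityId, envelope.componentId));
    if (handler === undefined) return false;
    handler(envelope.body);
    return true;
  }
}
```

A component would register its handler when it is attached to an entity and unregister when the entity is destroyed.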

Third, the connection between an entity and the network could be an entity responsibility, or could be a special component you attach to an entity that knows how to find the entity and its components for purposes of networking. However, given that networking affects pretty much everything in a game and in gameplay, every component and entity type that wants to have good behavior on a network needs to “know” about networking in the abstract (do I interpolate? do I correct? Which properties do I have which replicate? Do I need serial consistency, or is eventual consistency sufficient? and so on.) Thus, it might be reasonable to say that every component will just talk to networking directly, although they still have the address “entity-id, component-id.”

Finally, for efficiency, you'll typically sort updates by entity first, and by component ID second, so if the outgoing queue has messages for:

  • Entity 3, component A
  • Entity 2, component A
  • Entity 2, component C
  • Entity 3, component A
  • Entity 3, component B
  • Entity 2, component C

The actual queue in the packet / on the wire will likely look something like:

  • 2 entities
    • 2 components for entity 2
      • 1 message for component A
        • the message
      • 2 messages for component C
        • the message
        • the message
    • 2 components for entity 3
      • 2 messages for component A
        • the message
        • the message
      • 1 message for component B
        • the message
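
The grouping step could be sketched like this. Outgoing, GroupedEntity and groupForWire are names made up for the example, and message bodies are plain strings for simplicity. Messages for the same entity and component keep their original relative order.

```typescript
// One queued message with its address (hypothetical shape).
interface Outgoing { entityId: number; componentId: string; message: string }

interface GroupedComponent { componentId: string; messages: string[] }
interface GroupedEntity { entityId: number; components: GroupedComponent[] }

// Groups the outgoing queue by entity first and component second, both sorted by ID.
function groupForWire(queue: Outgoing[]): GroupedEntity[] {
  const byEntity = new Map<number, Map<string, string[]>>();
  for (const item of queue) {
    let byComponent = byEntity.get(item.entityId);
    if (byComponent === undefined) {
      byComponent = new Map<string, string[]>();
      byEntity.set(item.entityId, byComponent);
    }
    let messages = byComponent.get(item.componentId);
    if (messages === undefined) {
      messages = [];
      byComponent.set(item.componentId, messages);
    }
    messages.push(item.message);
  }

  const entityIds = Array.from(byEntity.keys()).sort((a, b) => a - b);
  return entityIds.map((entityId) => {
    const byComponent = byEntity.get(entityId)!;
    const componentIds = Array.from(byComponent.keys()).sort();
    return {
      entityId,
      components: componentIds.map((componentId) => ({
        componentId,
        messages: byComponent.get(componentId)!,
      })),
    };
  });
}
```

Fed the six queued messages from the example, this yields two entities: entity 2 with one message for A and two for C, and entity 3 with two messages for A and one for B.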

I agree with the advice that sending input events and detected mismatches as seldom-sent events generally performs better than sending snapshots every frame. You should, however, try to send snapshots in the background on some schedule, as a fallback to fix up whatever state may end up becoming de-synced. The exception is if you go all the way and implement fully deterministic replay with command queueing (like RTS games), but that's very hard on top of JavaScript, because math is way under-specified, and not consistent across architectures, or even across browsers.

enum Bool { True, False, FileNotFound };

H+ has a good reply, all those points are great.

Addressing this point specifically:

marem83 said:
What I am struggling with, is deciding how and where to channel these network messages to populate the PlayerCommandsComponent buffer.

Generally gameplay systems work better as events here. Think of them less as function calls and more, depending on their type, as packages to deliver through the post, notices posted to a bulletin board, or announcements made in a newspaper. One side posts information for the other side to process, with no expectation that it will be rapid, no expectation that they'll respond (unless that's part of your protocol), and for some message types with no expectation that they'll even see it at all. Messages addressed to specific objects can work, and broadcast messages sent to any registered listener can also work.

However, the networking system is a different beast than gameplay systems. Much like FedEx delivering boxes, the networking system doesn't care what's in any message. Its job is to take a block of data from one side and deliver it somewhere else. A message is just an arbitrary block of data being directed to an object, and the object happens to be located somewhere else. Data channels help with organizing what may need to be retransmitted or delayed, and help in maintaining quality of service, but ultimately the messages going through are just blocks of arbitrary data getting transmitted elsewhere. The networking subsystem typically doesn't care about the message content, only delivery.

You can have a system that transmits or re-transmits world state, or game state, or synchronization information, or pictures of players' avatars, or text chat, or voice chat streams, or whatever. Those other systems don't really matter to the networking system. Any system can send packages through the network addressed to an object on the other side. What the other side does with it is specific to the game. In major engines like Unreal there are a bunch of built-in communications to keep world state in sync, physics state in sync, and also the ability to send notifications across the wire, but you're free to create any additional messages you want, and to process any incoming messages however you wish.

Your PlayerCommandsComponent would be expecting a message, and have a message handler. When the message handler is called with a message, it should validate whatever message was sent to it, ensuring it is an expected size and values are within expected ranges, and then do whatever the component wants with it. You seem to have a payload with key/value pairs like moveRight=true and fire=true, so if that's what the component is expecting it should use them.

marem83 said:
connection.on('message', (message) => {
const data = JSON.parse(message.toString())
// do something with the message
})

In this case, “do something with the message” is simple delivery. Hopefully you've got validation to sanity check the message, such as ensuring it's actually a string, can actually be parsed as JSON, and is actually the size you're expecting, but for an example it's tolerable. Find the recipient. If you find the recipient, deliver the message, and let the other system do with it whatever it needs. If there is an error with the message size, an error parsing the data, or the game can't find the intended recipient, or any other error, flag it as an error then drop the message. Just like FedEx, drop the package at the door and move on to the next.
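
For the JSON command message shown at the top of the thread, that sanity check might look roughly like the sketch below. The 256-character limit, the rule that absent flags default to false, and the function names are all assumptions for illustration. Anything that fails a check yields null so the caller can flag it and drop it.

```typescript
interface Command { moveRight: boolean; moveLeft: boolean; fire: boolean }
interface CommandMessage { frameNumber: number; command: Command }

// Assumed upper bound; a real limit depends on the protocol.
const MAX_MESSAGE_LENGTH = 256;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Absent flags default to false; present flags must be real booleans.
function readFlag(payload: Record<string, unknown>, key: string): boolean | null {
  const value = payload[key];
  if (value === undefined) return false;
  return typeof value === "boolean" ? value : null;
}

// Returns a validated CommandMessage, or null if anything about the input is off.
function parseCommandMessage(raw: string): CommandMessage | null {
  if (raw.length > MAX_MESSAGE_LENGTH) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (!isRecord(data) || data["type"] !== "command") return null;

  const frame = data["frame"];
  if (typeof frame !== "number" || !Number.isInteger(frame) || frame < 0) return null;

  const payload = data["payload"];
  if (!isRecord(payload)) return null;

  const moveRight = readFlag(payload, "moveRight");
  const moveLeft = readFlag(payload, "moveLeft");
  const fire = readFlag(payload, "fire");
  if (moveRight === null || moveLeft === null || fire === null) return null;

  return { frameNumber: frame, command: { moveRight, moveLeft, fire } };
}
```

Inside the 'message' callback this would replace the bare JSON.parse call: parse, and if the result is null, log and return without touching any game state.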

The networking system shouldn't care about contents, leaving it up to the receiver to interpret them however it wants. If that message block happens to have data about walking, or about firing, or about changing the color of whatever else the message is about, the object that received the message would need to interpret the results and do something sane with the message block. That's beyond the responsibility of the networking system.

@hplus0603 @frob Hey there!

I just went through both of your messages. There's obviously valuable insight in both, but I believe I will have to iterate on the implementation to consolidate my understanding. I imagine I could question or challenge some aspects, but that would require hindsight I don't have yet. In any case, this exchange was definitely helpful in unblocking me.

Thanks a lot for taking the time to answer, I really appreciate it! It's gonna take a while and several projects, but hopefully some day I can give back to the community with my own experience!

Kind regards,
marem
