So, the MQTT exercises continue. Now that I've established the basic topology, let's throw together a simple protocol to make it all work!
We'll call it MQTT-rp:
Okay, what's the purpose here? We need a protocol that lets a combined MQTT server & client become a Message Router without changing the MQTT protocol it speaks.
Well, unlike almost any other routing protocol on the planet, this one will live completely in the application layer, but that gives us some options. We've basically got 2 different data spaces we can play with: the topic field and the message/payload field. Either one could carry our special routing codes.
Since MQTT is already built to "route" the message based on the content of the topic field, we should use that to our advantage and keep all direct routing codes in the topic field.
First things first, our MQTT-Routers will need a way to contact/find each other, so the protocol will have to hinge on at least 1 primary network node which will host a channel that distributes connect info to all other Routers. All other authenticated routers will periodically (at intervals of less than 30 sec) post their info to this channel, timestamped at the point of origin so the subscribed routers can calculate the latency on that route.
Router Hello Message Format:
Topic: "Router-Hello"
Hello Message: 4-byte Router ID (IP address), 4-byte origin timestamp, + 8 bytes for each hop (IP & timestamp)
Time Message: 4 bytes '0000', 4-byte timestamp
Also on the Router-Hello channel the Primary Node will publish a regular time code so each Router can synchronize its clock.
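Here's a rough Python sketch of how the hello and time payloads above could be packed and parsed. Several things in it are my assumptions rather than part of the format: network byte order, timestamps as milliseconds truncated to 32 bits, the '0000' marker being four zero bytes, and all of the function names.

```python
import socket
import struct

# One 8-byte entry: 4-byte router ID (IPv4 address) + 4-byte timestamp.
# Network byte order and millisecond timestamps are assumptions.
ENTRY = struct.Struct("!4sI")
TIME_ID = b"\x00\x00\x00\x00"  # assumed encoding of the '0000' marker


def pack_hello(router_ip, origin_ts, hops=()):
    """Build a hello payload: origin entry followed by one entry per hop."""
    entries = [(router_ip, origin_ts), *hops]
    return b"".join(
        ENTRY.pack(socket.inet_aton(ip), ts & 0xFFFFFFFF) for ip, ts in entries
    )


def unpack_hello(payload):
    """Return ((origin_ip, origin_ts), [(hop_ip, hop_ts), ...])."""
    if len(payload) < ENTRY.size or len(payload) % ENTRY.size:
        raise ValueError("hello payload must be a non-empty multiple of 8 bytes")
    entries = []
    for offset in range(0, len(payload), ENTRY.size):
        ip, ts = ENTRY.unpack_from(payload, offset)
        entries.append((socket.inet_ntoa(ip), ts))
    return entries[0], entries[1:]


def pack_time(now_ts):
    """Build the primary node's time message: marker + timestamp."""
    return TIME_ID + struct.pack("!I", now_ts & 0xFFFFFFFF)


def is_time_message(payload):
    """True when the payload is exactly a marker plus one timestamp."""
    return len(payload) == 8 and payload[:4] == TIME_ID


def latency_ms(origin_ts, received_ts):
    """Elapsed time, tolerating wrap-around of the 32-bit counter."""
    return (received_ts - origin_ts) & 0xFFFFFFFF
```

A subscriber would check is_time_message() first, then fall back to unpack_hello() and feed the origin timestamp plus its own synchronized clock into latency_ms().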
Okay, we've established that all routers will subscribe to the "Router-Hello" channel of their "Primary Node" and this router/channel will distribute connection and time information around the network just by being an MQTT channel. Very nice. Now that we know where everybody is and approximately how much time it takes THEM to get to the primary node, let's do some proactive routing table building and start pinging each node for latency information. In order to do this, we'll need to establish a return path for data to follow, preferably one that doesn't route everything off to the primary node or something silly first.
Each Router will establish a channel for each other known router on the network, with a Topic that matches the other Router's ID. And each Router will subscribe to ALL of its named channels. These channels will be used to route messages and to return pings for latency testing. When a Router needs to test its latency to another router, all it has to do is send a message with its own ID as the topic.
Okay, so that lets us get the latency numbers for all of the direct hops on the network. Now what about actual routing path latency and actual message routing?
Routed Message Format:
Topic: 4-byte destination router ID
Message: For Delivery To (4-byte ID) | topic length (1 byte) | message length (4 bytes) | Delivery Topic | Delivery Message | Tracking Information (+ 8 bytes for each hop (IP & timestamp))
Now we know what our routed message looks like. How does that help us figure out an actual routing table and latency? The "For Delivery To" field will give us that. This field allows us to send a message addressed to a specific next-hop router, with instructions on where to send it immediately afterwards. If we also set the "Delivery Topic" to the source router ID, this gives us the ability to route a message across a second hop before it returns to origin. And now we can start doing some more useful routing table construction.
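To make the routed message layout and the two-hop probe a bit more concrete, here's a minimal Python sketch. Again, this is just one possible reading: network byte order, a UTF-8 Delivery Topic, router IDs written as dotted-quad strings, and the names pack_routed, unpack_routed and add_hop are all my assumptions.

```python
import socket
import struct

# Fixed header: For Delivery To (4 bytes) | topic length (1) | message length (4).
HEADER = struct.Struct("!4sBI")
# Tracking entry appended by each hop: router IP (4 bytes) + timestamp (4).
HOP = struct.Struct("!4sI")


def pack_routed(deliver_to, topic, message, hops=()):
    """Serialize a routed message; `message` is bytes, `topic` is text."""
    topic_bytes = topic.encode("utf-8")
    if len(topic_bytes) > 255:
        raise ValueError("delivery topic must fit in a 1-byte length field")
    parts = [
        HEADER.pack(socket.inet_aton(deliver_to), len(topic_bytes), len(message)),
        topic_bytes,
        message,
    ]
    parts += [HOP.pack(socket.inet_aton(ip), ts & 0xFFFFFFFF) for ip, ts in hops]
    return b"".join(parts)


def unpack_routed(payload):
    """Return (deliver_to, topic, message, hops) from a routed payload."""
    if len(payload) < HEADER.size:
        raise ValueError("payload shorter than the 9-byte header")
    deliver_to, topic_len, msg_len = HEADER.unpack_from(payload, 0)
    start = HEADER.size
    end = start + topic_len + msg_len
    if end > len(payload) or (len(payload) - end) % HOP.size:
        raise ValueError("length fields do not match the payload size")
    topic = payload[start:start + topic_len].decode("utf-8")
    message = payload[start + topic_len:end]
    hops = []
    for offset in range(end, len(payload), HOP.size):
        ip, ts = HOP.unpack_from(payload, offset)
        hops.append((socket.inet_ntoa(ip), ts))
    return socket.inet_ntoa(deliver_to), topic, message, hops


def add_hop(payload, router_ip, ts):
    """Append this router's tracking entry before forwarding."""
    return payload + HOP.pack(socket.inet_aton(router_ip), ts & 0xFFFFFFFF)


# Two-hop probe from 10.0.0.1: published on the channel for the first hop,
# marked for delivery to 10.0.0.3, with the source ID as the Delivery Topic
# so the last hop sends it back to origin.
probe = pack_routed("10.0.0.3", "10.0.0.1", b"ping")
```

Each router that handles the probe would call add_hop() before forwarding, so the origin gets back a per-hop timestamp trail it can turn into path latency.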
So, each router will maintain a sorted list of some type that contains the latency to each other router, and the assisted path latency to other routers via every available next hop Router. Okay, now our Routers can make some educated guesses as to the best next hop for messages addressed to other routers.
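One way the latency table could look, as a small Python sketch. The RoutingTable class, its method names, and the choice of a dict keyed by destination (sorted on demand rather than kept sorted) are all hypothetical; a direct route is simply recorded with the destination as its own next hop.

```python
class RoutingTable:
    """Latency per destination, per candidate next hop (hypothetical design)."""

    def __init__(self):
        # destination router ID -> {next-hop router ID: latency in ms}
        self._routes = {}

    def update(self, destination, next_hop, latency_ms):
        """Record the latest measured latency to `destination` via `next_hop`.

        For a direct route, pass the destination itself as the next hop.
        """
        self._routes.setdefault(destination, {})[next_hop] = latency_ms

    def sorted_routes(self, destination):
        """All known (next_hop, latency) pairs, fastest first."""
        routes = self._routes.get(destination, {})
        return sorted(routes.items(), key=lambda item: item[1])

    def best_next_hop(self, destination):
        """Fastest known next hop, or None if the destination is unknown."""
        routes = self.sorted_routes(destination)
        return routes[0][0] if routes else None


table = RoutingTable()
table.update("10.0.0.3", "10.0.0.3", 120)  # direct ping result
table.update("10.0.0.3", "10.0.0.2", 70)   # assisted path via 10.0.0.2
```

With those two sample measurements, the assisted path is faster than the direct one, so best_next_hop("10.0.0.3") picks 10.0.0.2.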
... with this information I think I can begin testing.
Thanks for following along.. I'll update as soon as I have some results/code to share.