As a developer, I have used many of the most common API technologies, such as SOAP with XML, REST, GraphQL and tRPC. All of them have well-known pros and cons, and the rest of the tech stack often determines which one fits best. What they have in common is that they work best in a standard client-server architecture, where a known server answers requests from connected clients on a one-to-one basis. Under other circumstances and with other challenges, completely different combinations of technologies can work even better.
First, a brief explanation of the concepts. MQTT is a network protocol used primarily for IoT. It is based on a publish-subscribe model in which you set up a central broker that clients connect to. Clients send messages only to the broker, which in turn distributes copies of each message to every client that has subscribed to that kind of message. The broker knows what to send to whom through so-called topics. Every message must have a topic, and a subscription applies to either a single topic or a pattern of topics. With wildcards, you can make the net you cast over incoming messages as wide or as fine-meshed as you like.
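To make the wildcard idea concrete, here is a minimal sketch of how a topic filter can be matched against a topic, following the MQTT rules as I understand them: "+" stands for exactly one level and a trailing "#" for the parent level and everything beneath it. The function name topic_matches and the topic names are my own, chosen for illustration; a real broker does this for you.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style topic filter matches a concrete topic."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with "$" (such as $SYS/...) are not matched by a
    # filter whose first level is a wildcard.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level. It matches
            # the parent level and any number of levels below it.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False  # a literal level that differs
    # Without "#", filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("home/+/temperature", "home/livingroom/temperature"))
print(topic_matches("home/#", "home/outdoor/temperature"))
```

A filter like "home/+/temperature" is the fine-meshed net, while "home/#" is the wide one that catches everything published under "home".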
In this way, almost any number of senders and receivers can exchange data without having to know about each other. Because messages are published and subscribed to by topic, only the recipients that have asked for a message will receive it. The clients can also be very different: some only send data, some both send and receive, and some only receive. Some want to receive every type of data, while others are only interested in a very narrow subset.
A hypothetical example is a smart home with the following setup:
- Simple temperature sensors inside and outside, which only transmit data
- Thermostats on heaters, which receive control signals and send data about their temperature and state
- A display showing indoor and outdoor temperatures, which only receives the data transmitted by the temperature sensors and thermostats
- A tablet set up to control the entire smart home, which receives temperatures and controls the thermostats as well as every other connected device
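The setup above can be sketched with a tiny in-memory stand-in for the broker. Everything here is an assumption made for illustration: the InMemoryBroker class, the client names and the topic layout home/<location>/<device>/<kind> are hypothetical, and a real system would use an actual MQTT broker and client library instead.

```python
from collections import defaultdict


def matches(topic_filter, topic):
    """Compact matcher: '+' is one level, a trailing '#' is everything below."""
    f, t = topic_filter.split("/"), topic.split("/")
    if f[-1] == "#":
        f = f[:-1]
        t = t[: len(f)]  # ignore the levels covered by '#'
    return len(f) == len(t) and all(a in ("+", b) for a, b in zip(f, t))


class InMemoryBroker:
    """Toy broker: remembers subscriptions and copies messages to inboxes."""

    def __init__(self):
        self.subscriptions = []           # (topic_filter, client_name) pairs
        self.inboxes = defaultdict(list)  # client_name -> received messages

    def subscribe(self, client, topic_filter):
        self.subscriptions.append((topic_filter, client))

    def publish(self, topic, payload):
        # Each interested client gets one copy; the sender never needs to
        # know who (if anyone) is listening.
        receivers = {c for f, c in self.subscriptions if matches(f, topic)}
        for client in receivers:
            self.inboxes[client].append((topic, payload))


broker = InMemoryBroker()

# The sensors never subscribe to anything; they only publish.
broker.subscribe("thermostat-livingroom", "home/livingroom/thermostat/set")
broker.subscribe("display", "home/+/+/temperature")  # temperatures only
broker.subscribe("tablet", "home/#")                 # everything

broker.publish("home/outdoor/sensor/temperature", b"-3")
broker.publish("home/livingroom/thermostat/temperature", b"21")
broker.publish("home/livingroom/thermostat/set", b"22")  # sent by the tablet

for client, messages in sorted(broker.inboxes.items()):
    print(client, [topic for topic, _ in messages])
```

The display sees the two temperature readings but not the control signal, the thermostat sees only the control signal, and the tablet sees all three messages, including the one it published itself.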
When you work with this many different types of clients, you run the risk of getting bogged down in data models. Some clients may be written in JavaScript, some in Python, some in C++ or C. Maybe you have an app built with Swift, Android or Flutter. Changing a data model in one place has ripple effects on everyone else using those messages, and you risk having to implement the same change in a handful of different languages and retest all API calls. This is where Protobuf comes in.
Google created Protocol Buffers (Protobuf) as its solution to this problem. It is a separate, domain-specific language for data models, which resembles a lowest common denominator of many of the most popular languages. It is designed to serialize data for transmission over a network and to work across different programming languages. You generate model code for each language from a common definition in a .proto file. As of today, there is official support for C++, Java, Kotlin, Go, Ruby, C#, PHP and Python, among others, as well as working third-party libraries for C, Haskell, OCaml, Swift, Rust, Zig and TypeScript, among others. In this way, you can be confident that a model serialized in one language will work when deserialized in another. At the same time, the serialized data is designed to be as compact as possible, which is quite useful on small, resource-constrained devices. Let's continue the example from earlier:
Let's say that today we have temperature sensors that send simple updates containing only the number of degrees measured. The .proto file then needs little more than a message with a single field, declared as: int32 degrees = 1;
Here "int32" is the data type, "degrees" is the name of the field, and 1 is the field's identifier, which is what identifies the field in the encoded binary format.
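To show what the identifier does on the wire, here is a minimal sketch that writes the definition out in full and hand-encodes one message. The message name TemperatureUpdate and the choice of proto3 syntax are my assumptions; the schema is kept as a string constant only so that the example stays in one language. In practice you would run the Protobuf compiler and use the generated code rather than encode anything by hand.

```python
import json

# Hypothetical schema: the message name and proto3 syntax are assumptions.
TEMPERATURE_PROTO = """
syntax = "proto3";

message TemperatureUpdate {
  int32 degrees = 1;
}
"""

VARINT = 0  # wire type used for int32, int64, bool and enum fields


def encode_varint(value: int) -> bytes:
    """Encode an integer as a base-128 varint, least significant group first."""
    value &= 0xFFFFFFFFFFFFFFFF  # negative int32 values are sign-extended to 64 bits
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # high bit set: more groups follow
        else:
            out.append(group)
            return bytes(out)


def encode_temperature(degrees: int) -> bytes:
    """Hand-encode a TemperatureUpdate message with its single field."""
    if degrees == 0:
        return b""  # proto3 leaves default values off the wire
    tag = (1 << 3) | VARINT  # field identifier 1 combined with the wire type
    return encode_varint(tag) + encode_varint(degrees)


payload = encode_temperature(21)
print(payload.hex(), len(payload), len(json.dumps({"degrees": 21})))
```

A reading of 21 degrees becomes two bytes: one for the tag that carries the identifier and one for the value. The field name never appears in the payload, which is why the identifier must stay stable once devices depend on it. One thing the sketch also reveals is that a negative int32 takes ten bytes for the value; for outdoor sensors, the sint32 type is designed to encode negative numbers more compactly.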
After a while, you may want to add a couple of new sensors from a supplier you have not used before. Once they are installed, it turns out that they measure temperature in Fahrenheit rather than Celsius. To save as much battery as possible, you don't want the device itself to do the conversion. Instead, you can extend the model with information about whether the transmitted temperature is in Celsius or Fahrenheit, for example by adding a Boolean value that is true if the temperature is in Fahrenheit. The model then gets a second field, for example: bool isFahrenheit = 2;
With this model, the new sensors can send their temperature in Fahrenheit with the isFahrenheit flag set to true, and the older ones can send with the flag set to false. The newly generated models provide compile-time type safety in everything from embedded sensor firmware written in C to the control app written in Swift, all based on the same type definition. This is incredibly useful because MQTT itself does not restrict or validate the messages sent, but as long as everyone who sends or receives messages uses the same version of the Protobuf models, you can be confident that the data is interpreted correctly on the other side.
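As a rough sketch of the receiving side, the code below decodes the extended message by hand and does the conversion that the sensor skips. The message name TemperatureUpdate, the identifier 2 for isFahrenheit and the proto3 syntax are assumptions on my part, and the decoder only understands the two fields used here. Generated code would replace all of the parsing.

```python
from dataclasses import dataclass

# Hypothetical schema: message name, identifier 2 and proto3 are assumptions.
TEMPERATURE_PROTO_V2 = """
syntax = "proto3";

message TemperatureUpdate {
  int32 degrees = 1;
  bool isFahrenheit = 2;
}
"""


@dataclass
class TemperatureUpdate:
    degrees: int = 0
    is_fahrenheit: bool = False  # proto3 default when field 2 is absent


def read_varint(data: bytes, pos: int):
    """Read one base-128 varint at pos; return (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last group
            return value, pos
        shift += 7


def decode_temperature(data: bytes) -> TemperatureUpdate:
    update = TemperatureUpdate()
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only understands varint fields")
        value, pos = read_varint(data, pos)
        if field_number == 1:
            # int32 is sign-extended to 64 bits on the wire
            update.degrees = value - (1 << 64) if value >= 1 << 63 else value
        elif field_number == 2:
            update.is_fahrenheit = bool(value)
        # any other varint field is unknown to this reader and is ignored
    return update


def to_celsius(update: TemperatureUpdate) -> float:
    """Do the conversion on the receiver so the sensor does not have to."""
    if update.is_fahrenheit:
        return (update.degrees - 32) * 5 / 9
    return float(update.degrees)


old_sensor = decode_temperature(b"\x08\x15")          # degrees = 21
new_sensor = decode_temperature(b"\x08\x46\x10\x01")  # degrees = 70, flag set
print(to_celsius(old_sensor), round(to_celsius(new_sensor), 1))
```

Under these assumptions, a payload from an older sensor that never sends the second field decodes with the flag at its default of false, so both generations of sensors can be read by the same receiver.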
💡 In the name of pragmatism, one could also consider making the isFahrenheit field optional, since that means you are not forced to update devices where the flag is not relevant. I am not particularly fond of optional Boolean values, though, as in practice they give three possibilities instead of two.
If this system had been set up with a more traditional REST API and JSON-encoded data, there would have been two challenges. The first is that you have to go through every system component that can send or receive temperature updates and decide whether to implement the change in that project's model objects, or to bet that two nearly identical models can live side by side. You may end up having to update your API. The second is to verify that all integrations still work, and to update code and documentation for the new sensors. Since our system has no explicit endpoints and we can rely on generated models, MQTT and Protobuf make this job significantly smaller.
For someone who has worked almost exclusively with client-server architectures, and who has lately been excited about the end-to-end type safety you get with tRPC, it has been exciting to see a completely different solution with many of the same benefits. MQTT in particular is a protocol that shines in environments where the distinction between client and server is partly blurred and new nodes can come and go at any time. If guaranteed delivery matters in such an environment, you can set a QoS level on each message. Beyond level 0, which delivers at most once with no acknowledgement, the broker will ensure that each message is delivered either at least once (level 1) or exactly once (level 2). Level 1 requires the recipient to acknowledge each message, and level 2 requires a four-step handshake, so where the lowest possible use of bandwidth or battery is important, you must consider whether these guarantees are worth the cost.
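The cost of each QoS level can be estimated with a few lines of arithmetic. The sketch below lists the control packets exchanged between one sender and one receiver for a single message, and multiplies them up. The function name packets_needed is hypothetical, and the estimate assumes that no packet is lost and retransmitted.

```python
# Control packets exchanged between one sender and one receiver per message.
QOS_FLOWS = {
    0: ["PUBLISH"],                                 # at most once
    1: ["PUBLISH", "PUBACK"],                       # at least once
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],  # exactly once
}


def packets_needed(qos: int, message_count: int) -> int:
    """Packets on the wire for message_count messages, ignoring retransmissions."""
    return len(QOS_FLOWS[qos]) * message_count


# A sensor reporting once a minute for a day:
readings_per_day = 24 * 60
for qos in QOS_FLOWS:
    print(qos, packets_needed(qos, readings_per_day))
```

For a battery-powered sensor where a lost reading is simply replaced by the next one, level 0 is often a reasonable choice, while a control signal to a thermostat may be worth the extra round trips.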
The total package these two technologies offer is certainly compelling for systems with many different components and nodes that do not necessarily need to know about each other, but where you want type safety, delivery guarantees and support for the vast majority of major programming languages.