What is MQTT?

The MQTT protocol provides a scalable and cost-effective way to connect devices over the Internet.
The use of MQTT focuses on delivering data over the Internet in near real time with predefined delivery guarantees.
Its purpose is to connect millions of IoT devices to a company’s infrastructure, send instant updates and move data efficiently.



MQTT basics

MQTT is a client-server, publish / subscribe messaging transport protocol.

It is lightweight, open, simple, and designed to be easy to implement. These features make it suitable for many situations, including constrained environments such as machine-to-machine (M2M) and Internet of Things (IoT) communication, where a small code footprint is required and/or network bandwidth is at a premium.

First of all, we will explore the basic concepts (publish / subscribe, client / broker) and the basic functionality (Connect, Publish, Subscribe) of the MQTT protocol.
Then, we will see the data transport features such as Quality of Service (QoS), Retained Messages, Persistent Session, Last Will and Testament, Keep Alive among others.

1 Publish & Subscribe

The publish / subscribe pattern (also known as pub / sub) provides an alternative to the traditional client-server architecture.
In the client-server model, a client communicates directly with an endpoint. The pub / sub model decouples the client that sends a message (the publisher) from the client or clients that receive the messages (the subscribers).
Publishers and subscribers never directly interact with each other. In fact, they are not even aware that the other exists. The connection between them is handled by a third component (the Broker).
The Broker’s job is to filter all incoming messages and distribute them correctly to the subscribers.
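The decoupling described above can be illustrated with a minimal in-process sketch. The TinyBroker class and its method names are hypothetical, it matches topics by exact string only, and it involves no networking, so it is not how a real MQTT broker is implemented. It only shows that the publisher and the subscriber each talk to the broker and never to each other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, bytes], None]


class TinyBroker:
    """Hypothetical in-memory broker that routes by exact topic match only."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        # The subscriber registers interest with the broker, not with a publisher.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to every subscriber of topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = TinyBroker()
received = []

# Subscriber side: knows only the broker and the topic.
broker.subscribe("myhome/livingroom/temperature",
                 lambda topic, payload: received.append((topic, payload)))

# Publisher side: knows only the broker and the topic.
delivered = broker.publish("myhome/livingroom/temperature", b"21.5")
```

Publishing to a topic with no subscribers simply returns 0 here, which mirrors the fact that a publisher does not need any subscriber to exist.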

 

1.1 Publish / Subscribe Architecture

The most important aspect of pub / sub is the decoupling of the publisher of a message from its recipient (the subscriber). This decoupling has several dimensions:
Space decoupling: The publisher and subscriber do not need to know each other (for example, no IP address or port is exchanged).
Time decoupling: The publisher and subscriber do not need to run at the same time.
Synchronization decoupling: Operations on either component do not need to be interrupted while publishing or receiving.

 

1.2 Summary

  • MQTT decouples the publisher and the subscriber in space. To publish or receive messages, publishers and subscribers only need to know the hostname / IP address and the port of the broker.
  • MQTT decouples by time. Although most MQTT use cases deliver messages in near real time, the broker can store messages for clients that are not online if desired. (Two conditions must be met to store messages: the client connected with a persistent session and subscribed to a topic with a Quality of Service (QoS) greater than 0.)
  • MQTT works asynchronously. Because most client libraries operate asynchronously and are based on callbacks or a similar model, tasks are not blocked while waiting for a message or publishing a message. In certain use cases, synchronization is desirable and possible. To wait for a certain message, some libraries offer synchronous APIs. But the flow is usually asynchronous.
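The two storage conditions in the second point can be sketched as follows. StoringBroker and all of its attributes are hypothetical names for illustration; the sketch ignores acknowledgements, session expiry, and wildcard topics, and assumes exact topic matching.

```python
from collections import defaultdict, deque


class StoringBroker:
    """Hypothetical broker that queues messages for offline clients."""

    def __init__(self):
        self.online = set()
        self.persistent = set()                 # clients with a persistent session
        self.subscriptions = defaultdict(dict)  # topic -> {client_id: qos}
        self.queues = defaultdict(deque)        # client_id -> stored payloads
        self.inbox = defaultdict(list)          # client_id -> delivered payloads

    def connect(self, client_id, persistent=False):
        self.online.add(client_id)
        if persistent:
            self.persistent.add(client_id)
        # Deliver anything that was stored while the client was away.
        while self.queues[client_id]:
            self.inbox[client_id].append(self.queues[client_id].popleft())

    def disconnect(self, client_id):
        self.online.discard(client_id)

    def subscribe(self, client_id, topic, qos):
        self.subscriptions[topic][client_id] = qos

    def publish(self, topic, payload):
        for client_id, qos in self.subscriptions[topic].items():
            if client_id in self.online:
                self.inbox[client_id].append(payload)
            elif client_id in self.persistent and qos > 0:
                # Both conditions are met: persistent session and QoS > 0.
                self.queues[client_id].append(payload)
            # Otherwise the message is not stored for this client.


broker = StoringBroker()
broker.connect("logger", persistent=True)
broker.subscribe("logger", "myhome/temperature", qos=1)
broker.disconnect("logger")
broker.publish("myhome/temperature", b"21.5")  # stored for the offline client
broker.connect("logger", persistent=True)      # stored message is delivered now
```

A client that subscribed with QoS 0, or that connected without a persistent session, would miss the message published while it was offline.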

It is also worth mentioning that MQTT is especially easy to use on the client side. Most pub / sub systems place their logic on the broker side, and with a client library MQTT reduces the client's work to the essence of pub / sub, which makes it a lightweight protocol for small and constrained devices.
MQTT uses topic-based message filtering. Each message contains a topic (subject) that the broker can use to determine whether or not a subscribing client receives the message.

To handle the challenges of a pub / sub system, MQTT has three Quality of Service (QoS) levels, which are addressed in Section 4 of the MQTT Publish, Subscribe & Unsubscribe module.

2 Client, broker & connection


 

2.1 Previous considerations

Because MQTT decouples the publisher from the subscriber, client connections are always handled by a broker. The following sections describe the client, the broker, and the connection between them in more detail.

 

2.2 Client

Both publishers and subscribers are MQTT clients. The publisher and subscriber labels refer to whether the client is currently publishing messages or subscribed to receive them (publish and subscribe functionality can also be implemented in the same MQTT client).
Therefore an MQTT client is any device (from a microcontroller to a full server) that runs an MQTT library and connects to an MQTT broker over a network.

 

2.3 Broker

The counterpart of the MQTT client is the MQTT broker. The broker is at the heart of any publish / subscribe protocol. Depending on the implementation, a broker can handle up to thousands of connected MQTT clients simultaneously.
The broker is responsible for receiving all messages, filtering them, determining which clients are subscribed to each message, and sending the message to those subscribed clients. The broker also holds the session data of all clients that have persistent sessions, including their subscriptions and missed messages. Another responsibility of the broker is the authentication and authorization of clients. The broker is generally extensible, which makes it easy to customize authentication, authorization, and integration into back-end systems.
Integration is particularly important because the broker is often the component that is exposed directly on the Internet, handles many clients, and needs to pass messages on to downstream analysis and processing systems. As the good practices in section 4.4 explain, subscribing to all messages is not really an option. In short, the broker is the central hub through which every message must pass.
Therefore, it is important that your broker is highly scalable, integrable into back-end systems, easy to monitor and, of course, fault tolerant.

 

2.4 MQTT Connection

The MQTT protocol runs on top of TCP/IP. Both the client and the broker must have a TCP/IP stack.

The MQTT connection is always between a client and the broker. Clients never connect to each other directly. To initiate a connection, the client sends a CONNECT message to the broker. The broker responds with a CONNACK message and a status code. Once the connection is established, the broker keeps it open until the client sends a DISCONNECT message or the connection is interrupted.
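To make the CONNECT / CONNACK exchange more concrete, here is a minimal sketch of how the two packets look on the wire. It assumes the MQTT 3.1.1 packet layout and a connection without username, password, or Last Will; the function names are hypothetical, and real clients normally leave this work to an MQTT library.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode a length with MQTT's variable-length scheme (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)


def encode_string(text: str) -> bytes:
    """MQTT strings are UTF-8 with a two-byte big-endian length prefix."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def build_connect(client_id: str, keep_alive: int = 60,
                  clean_session: bool = True) -> bytes:
    flags = 0x02 if clean_session else 0x00
    variable_header = (encode_string("MQTT")    # protocol name
                       + bytes([4, flags])      # protocol level 4 = MQTT 3.1.1
                       + struct.pack("!H", keep_alive))
    body = variable_header + encode_string(client_id)
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def parse_connack(packet: bytes):
    """Return (session_present, return_code) from a 4-byte CONNACK packet."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a CONNACK packet")
    return bool(packet[2] & 0x01), packet[3]


connect_packet = build_connect("client1")
session_present, return_code = parse_connack(bytes([0x20, 0x02, 0x00, 0x00]))
```

In MQTT 3.1.1 a return code of 0 means the connection was accepted; non-zero codes indicate a refusal, for example because of a bad client identifier or missing authorization.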

3 Publish, subscribe & unsubscribe

 

3.1 Publish

An MQTT client can publish messages as soon as it connects to a broker. MQTT uses topic-based filtering of messages on the broker. Each message must contain a topic that the broker can use to forward the message to interested clients. Typically, each message has a payload that contains the data to be transmitted as bytes. MQTT is data-agnostic: the publishing client determines how the payload is structured and decides whether to send binary data, text data, or even full XML or JSON. A PUBLISH message in MQTT has several attributes that are worth examining in detail:

Topic Name – The topic name is a simple string that is hierarchically structured with forward slashes as delimiters. For example, “myhome/livingroom/temperature” or “Germany/Munich/Octoberfest/people”.
QoS – This number indicates the Quality of Service (QoS) level of the message. There are three levels: 0, 1 and 2. The service level determines what kind of guarantee a message has of reaching the intended recipient (client or broker).
Retain Flag – This flag defines whether the broker saves the message as the last known good value for a specific topic. When a new client subscribes to a topic, it receives the last retained message on that topic.
Payload – This is the actual content of the message. MQTT is data-agnostic. You can send images, text in any encoding, encrypted data, and virtually any data in binary form.
Packet Identifier – The packet identifier uniquely identifies a message as it flows between the client and the broker. The packet identifier is only relevant for QoS levels greater than zero. The client library and / or the broker are responsible for setting this internal MQTT identifier.
DUP flag – The flag indicates that the message is a duplicate and was resent because the intended recipient (client or broker) did not acknowledge the original message. This is only relevant for QoS greater than 0. Generally, the resend / duplicate mechanism is handled by the MQTT client library or the broker as an implementation detail.
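The attributes listed above map directly onto the bytes of a PUBLISH packet. The following sketch assumes the MQTT 3.1.1 layout; build_publish and its parameter names are hypothetical and only illustrate where each attribute ends up.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode a length with MQTT's variable-length scheme (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_publish(topic: str, payload: bytes, qos: int = 0,
                  retain: bool = False, dup: bool = False,
                  packet_id=None) -> bytes:
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1 or 2")
    if qos > 0 and packet_id is None:
        raise ValueError("a packet identifier is required for QoS 1 and 2")

    # Fixed header: packet type 3 in the high nibble, then DUP, QoS, RETAIN.
    first_byte = 0x30 | (int(dup) << 3) | (qos << 1) | int(retain)

    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        body += struct.pack("!H", packet_id)  # only present for QoS > 0
    body += payload                           # opaque bytes, any format

    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


packet = build_publish("myhome/livingroom/temperature", b"21.5",
                       qos=1, retain=True, packet_id=10)
```

Note that the payload is appended as-is: the packet carries no information about its encoding, which is what data-agnostic means in practice.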

When a client sends a message to an MQTT broker for publication, the broker reads the message, acknowledges it (according to the QoS level), and processes it. Processing by the broker includes determining which clients have subscribed to the topic and sending them the message.

The client that initially publishes the message is only concerned with delivering the PUBLISH message to the broker. Once the broker receives the PUBLISH message, it is the broker’s responsibility to deliver the message to all subscribers. The publishing client does not receive any feedback about whether anyone is interested in the published message or how many clients received the message from the broker.

 

3.2 Subscribe

Publishing a message doesn’t make sense if nobody receives it, in other words, if no clients are subscribed to the topics of the messages. To receive messages on topics of interest, the client sends a SUBSCRIBE message to the MQTT broker. This message is very simple: it contains a unique packet identifier and a list of subscriptions.


MQTT Subscribe attributes
Packet Identifier – The packet identifier uniquely identifies a message as it flows between the client and the broker. The client library and / or the broker are responsible for setting this internal MQTT identifier.
List of Subscriptions – A SUBSCRIBE message can contain multiple subscriptions for a client. Each subscription is made up of a topic and a QoS level. The topic in the subscription can contain wildcards, which make it possible to subscribe to a topic pattern instead of a specific topic. If a client has overlapping subscriptions, the broker delivers the message with the highest QoS level for that topic.
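As a rough illustration of these two attributes, the sketch below assembles a SUBSCRIBE packet from a packet identifier and a list of (topic, QoS) pairs. It assumes the MQTT 3.1.1 layout, and build_subscribe is a hypothetical helper rather than part of any library.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode a length with MQTT's variable-length scheme (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_subscribe(packet_id: int, subscriptions) -> bytes:
    """subscriptions is a list of (topic_filter, requested_qos) pairs."""
    if not 1 <= packet_id <= 0xFFFF:
        raise ValueError("packet identifier must be in 1..65535")
    if not subscriptions:
        raise ValueError("at least one subscription is required")

    body = struct.pack("!H", packet_id)
    for topic_filter, qos in subscriptions:
        if qos not in (0, 1, 2):
            raise ValueError("QoS must be 0, 1 or 2")
        filter_bytes = topic_filter.encode("utf-8")
        # Each entry: length-prefixed topic filter followed by one QoS byte.
        body += struct.pack("!H", len(filter_bytes)) + filter_bytes + bytes([qos])

    # 0x82 = packet type 8 (SUBSCRIBE) with the reserved flag bits 0010.
    return bytes([0x82]) + encode_remaining_length(len(body)) + body


packet = build_subscribe(1, [("myhome/groundfloor/+/temperature", 1),
                             ("myhome/firstfloor/#", 0)])
```

Several subscriptions travel in a single packet, which is why the acknowledgement has to carry one return code per entry.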

3.3 Suback

To confirm each subscription, the broker sends a SUBACK acknowledgement message to the client. This message contains the packet identifier of the original SUBSCRIBE message (to clearly identify the message) and a list of return codes.

Packet Identifier – The packet identifier is a unique identifier used to identify a message. It is the same as in the SUBSCRIBE message.

Return Code – The broker sends one return code for each topic / QoS pair that it receives in the SUBSCRIBE message. For example, if the SUBSCRIBE message has five subscriptions, the SUBACK message contains five return codes. The return code acknowledges each topic and shows the QoS level that the broker grants. If the broker rejects a subscription, the SUBACK message contains a failure return code for that specific topic. This happens, for example, if the client does not have sufficient permissions to subscribe to the topic or the topic is malformed.

Return Code    Return Code Response
0              Success – Maximum QoS 0
1              Success – Maximum QoS 1
2              Success – Maximum QoS 2
128            Failure

After a client successfully sends the SUBSCRIBE message and receives the SUBACK message, it receives every published message that matches a topic in the subscriptions contained in the SUBSCRIBE message.
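The return codes can be paired with the requested topics in the order in which they were sent. The sketch below assumes the MQTT 3.1.1 SUBACK layout; parse_suback is a hypothetical helper that maps each requested topic to the granted QoS, or to None when the broker answered with the failure code 128.

```python
import struct


def decode_remaining_length(data: bytes, offset: int = 1):
    """Return (length, index of the first byte after the length field)."""
    multiplier = 1
    value = 0
    while True:
        byte = data[offset]
        offset += 1
        value += (byte & 0x7F) * multiplier
        if not byte & 0x80:
            return value, offset
        multiplier *= 128
        if multiplier > 128 ** 3:
            raise ValueError("malformed remaining length")


def parse_suback(packet: bytes, requested_topics):
    """Return (packet_id, {topic: granted_qos or None})."""
    if packet[0] != 0x90:
        raise ValueError("not a SUBACK packet")
    length, start = decode_remaining_length(packet)
    body = packet[start:start + length]

    packet_id = struct.unpack("!H", body[:2])[0]
    codes = body[2:]
    if len(codes) != len(requested_topics):
        raise ValueError("expected one return code per subscription")

    granted = {}
    for topic, code in zip(requested_topics, codes):
        granted[topic] = None if code == 128 else code
    return packet_id, granted


# SUBACK for packet identifier 1: first topic granted QoS 1, second rejected.
packet_id, granted = parse_suback(bytes([0x90, 0x04, 0x00, 0x01, 0x01, 0x80]),
                                  ["myhome/temperature", "restricted/topic"])
```

A client would typically compare the returned packet identifier with the one it sent in the SUBSCRIBE message before trusting the result.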

 

3.4 Unsubscribe

The counterpart of the SUBSCRIBE message is the UNSUBSCRIBE message, which removes existing subscriptions of a client from the broker. Like the SUBSCRIBE message, it contains a packet identifier and a list of topics.

Packet Identifier – The packet id uniquely identifies a message as it flows between the client and the broker. The client library and / or the broker are responsible for setting this internal MQTT identifier.

List of Topics – The list of topics can contain multiple topics from which the client wishes to unsubscribe. It is only necessary to send the topic (without QoS). The broker removes the subscription to the topic, regardless of the QoS level with which the client originally subscribed.

 

3.5 Unsuback

To confirm the unsubscription, the broker sends an UNSUBACK acknowledgement message to the client. This message contains only the packet identifier of the original UNSUBSCRIBE message (to clearly identify the message).

Packet Identifier – The packet identifier uniquely identifies the message. As already mentioned, this is the same packet identifier that is in the UNSUBSCRIBE message.

After receiving the UNSUBACK from the broker, the client can assume that the subscriptions in the UNSUBSCRIBE message have been removed.
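On the client side, this handshake amounts to remembering which topics were sent under which packet identifier and dropping them once the matching UNSUBACK arrives. The SubscriptionTable class below is a hypothetical sketch of that bookkeeping, not the behavior of any particular client library.

```python
class SubscriptionTable:
    """Hypothetical client-side record of active subscriptions."""

    def __init__(self, active=None):
        self.active = dict(active or {})  # topic -> QoS
        self.pending_unsubscribe = {}     # packet identifier -> topics
        self._next_id = 1

    def unsubscribe(self, topics):
        """Record an UNSUBSCRIBE request and return its packet identifier."""
        packet_id = self._next_id
        # Packet identifiers are non-zero 16-bit values, so wrap 65535 -> 1.
        self._next_id = self._next_id % 0xFFFF + 1
        self.pending_unsubscribe[packet_id] = list(topics)
        return packet_id

    def on_unsuback(self, packet_id):
        """Remove the topics once the broker confirms the request."""
        for topic in self.pending_unsubscribe.pop(packet_id, []):
            self.active.pop(topic, None)  # QoS is irrelevant when removing


table = SubscriptionTable({"myhome/temperature": 1, "myhome/humidity": 0})
request_id = table.unsubscribe(["myhome/temperature"])
still_subscribed_before_ack = "myhome/temperature" in table.active
table.on_unsuback(request_id)
```

Until the UNSUBACK arrives the subscription is still treated as active, so messages on that topic may continue to be delivered in the meantime.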

4 Topics and good practices

In MQTT, the word topic refers to a UTF-8 string that the broker uses to filter messages for each connected client. The topic consists of one or more topic levels. Each topic level is separated by a forward slash (topic level separator).

Compared to a message queue, MQTT topics are very lightweight. The client does not need to create the desired topic before publishing or subscribing to it. The broker accepts each valid topic without any initialization.
Here are some examples of topics:

myhome/groundfloor/livingroom/temperature

USA/California/San Francisco/Silicon Valley

5ff4a2ce-e485-40f4-826c-b1a5d81be9b6/status

Germany/Bavaria/car/2382340923453/latitude

Note that each topic must contain at least one character and that the topic string may contain spaces. Topics are case sensitive. For example, myhome/temperature and MyHome/Temperature are two different topics. Also, the forward slash alone is a valid topic.
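These rules can be checked with a few lines of code. The helper below is a hypothetical sketch that splits a topic into its levels and enforces only the minimum-length rule stated above.

```python
def topic_levels(topic: str) -> list:
    """Split a topic into its levels at each forward slash."""
    if len(topic) < 1:
        raise ValueError("a topic must contain at least one character")
    return topic.split("/")


levels = topic_levels("myhome/groundfloor/livingroom/temperature")
with_spaces = topic_levels("USA/California/San Francisco/Silicon Valley")
slash_only = topic_levels("/")  # valid: two empty levels

# Topics are compared byte for byte, so case matters.
same_topic = "myhome/temperature" == "MyHome/Temperature"
```

The single forward slash splits into two empty levels, which is also why a leading slash adds an extra, empty level to a topic.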

 

4.1 Wildcards

When a client subscribes to a topic, it can subscribe to the exact topic of a published message, or it can use wildcards to subscribe to multiple topics simultaneously. A wildcard can only be used to subscribe to topics, not to publish a message. There are two different types of wildcards: single-level and multi-level.

 

4.2 Single level: +

As the name implies, a single-level wildcard replaces a topic level. The plus symbol represents a single-level wildcard in a topic.

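A topic matches a subscription with a single-level wildcard if it contains an arbitrary string in place of the wildcard. For example, a subscription to myhome/groundfloor/+/temperature matches myhome/groundfloor/livingroom/temperature and myhome/groundfloor/kitchen/temperature, but it does not match myhome/groundfloor/kitchen/brightness, myhome/firstfloor/kitchen/temperature, or myhome/groundfloor/kitchen/fridge/temperature.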

 

4.3 Multi level: #

The multi-level wildcard covers many topic levels. The hash symbol represents the multi-level wildcard in the topic. In order for the broker to determine which topics match, the multi-level wildcard must be the last character in the topic and, unless it is the only character, must be preceded by a forward slash.

 

When a client subscribes to a topic with a multi-level wildcard, it receives all messages on topics that begin with the pattern before the wildcard character, no matter how long or deep the topic is. If you specify only the multi-level wildcard as the topic (#), you receive all the messages that are sent to the MQTT broker. If you expect high throughput, subscribing with only a multi-level wildcard is an anti-pattern.

Topics starting with $

In general, you can name your MQTT topics as you like. However, there is one exception: topics that start with a $ symbol have a different purpose. These topics are not part of the subscription when you subscribe with the multi-level wildcard as the topic (#). $-topics are reserved for internal statistics of the MQTT broker. Clients cannot publish messages to these topics. At the moment, there is no official standardization for such topics. Commonly, $SYS/ is used for all of the following information, but broker implementations vary. A suggestion for $SYS topics is on the MQTT GitHub wiki. Here are a few examples:

$SYS/broker/clients/connected

$SYS/broker/clients/disconnected

$SYS/broker/clients/total

$SYS/broker/messages/sent

$SYS/broker/uptime
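The wildcard rules from sections 4.2 and 4.3, together with the exception for topics that start with $, can be sketched as a single matching function. The function name topic_matches is hypothetical, and the sketch follows the matching rules of the MQTT 3.1.1 specification as far as this text describes them; it does not validate that the filter itself is well formed beyond the position of #.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with $ are not matched by a filter that begins
    # with a wildcard, so "#" alone does not include $SYS topics.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, and it
            # matches everything from here on (including nothing at all).
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the topic is shorter than the filter
        if level != "+" and level != topic_levels[index]:
            return False  # "+" accepts any single level, otherwise exact match
    return len(filter_levels) == len(topic_levels)


single = topic_matches("myhome/groundfloor/+/temperature",
                       "myhome/groundfloor/kitchen/temperature")
too_deep = topic_matches("myhome/groundfloor/+/temperature",
                         "myhome/groundfloor/kitchen/fridge/temperature")
multi = topic_matches("myhome/groundfloor/#",
                      "myhome/groundfloor/livingroom/temperature")
sys_hidden = topic_matches("#", "$SYS/broker/uptime")
sys_explicit = topic_matches("$SYS/#", "$SYS/broker/uptime")
```

In this sketch a filter such as myhome/groundfloor/# also matches the parent topic myhome/groundfloor itself, which is the behavior the MQTT 3.1.1 specification describes for the multi-level wildcard.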

 

4.4 Good practices

1. Never use a leading forward slash
A leading forward slash is allowed in MQTT, for example /myhome/groundfloor/livingroom. However, the leading forward slash introduces an unnecessary topic level with a zero-length string at the front. The empty level provides no benefit and often creates confusion.
2. Never use spaces in a Topic
As with leading forward slashes, just because something is allowed doesn’t mean it should be used. UTF-8 has many different types of white space, and such uncommon characters should be avoided.
3. Keep the Topic brief and concise
Each topic is included in every message in which it is used. Make your topics as brief and concise as possible. When it comes to small devices, every byte counts and the length of the topic has a big impact.
4. Use only ASCII characters, avoid non-printable characters
Because non-ASCII UTF-8 characters are often displayed incorrectly, typos and character-set issues become very difficult to find. Unless absolutely necessary, we recommend avoiding non-ASCII characters in a topic.
5. Embed a unique identifier or client ID in the topic
It can be very useful to include the unique identifier of the publishing client in the topic. The unique identifier in the topic helps identify who sent the message. The embedded ID can also be used to enforce authorization: only a client that has the same client ID as the ID in the topic can publish to that topic. For example, a client with the ID client1 can publish to client1/status, but cannot publish to client2/status.
6. Don’t subscribe to #
Sometimes it is necessary to subscribe to all messages that are transferred through the broker, for example, to persist all messages in a database. Do not subscribe to all messages on a broker by using an MQTT client that subscribes to a multi-level wildcard. Frequently, the subscribing client cannot process the message load that results from this method (especially when the broker has massive throughput). Our recommendation is to implement an extension in the MQTT broker. For example, with the HiveMQ plugin system you can hook into HiveMQ's behavior and add an asynchronous routine that processes each incoming message and persists it in a database.
7. Don’t forget about extensibility
Topics are a flexible concept and there is no need to pre-assign them in any way. However, both the publisher and the subscriber must know the topic. It is important to think about how topics can be expanded to allow for new features or products. For example, if your smart home solution adds new sensors, it should be possible to add them to your topic tree without changing the entire topic hierarchy.
8. Use specific, not general, topics
When naming topics, don’t use them the same way as in a queue. Differentiate your topics as much as possible. For example, if you have three sensors in your living room, create the topics myhome/livingroom/temperature, myhome/livingroom/brightness and myhome/livingroom/humidity. Do not send all values on myhome/livingroom. The use of a single topic for all messages is an anti-pattern. Specific naming also allows you to use other MQTT features, such as retained messages. To learn more about retained messages, see HiveMQ MQTT Essentials, Part 8.
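Several of these practices are easy to check automatically. The sketch below shows one possible way to lint a topic against practices 1 to 4 and to apply the client-ID rule from practice 5. The function names and the 64-byte length limit are assumptions made for illustration; neither MQTT nor this text prescribes a specific limit.

```python
def lint_topic(topic: str, max_length: int = 64) -> list:
    """Return a list of good-practice violations for a topic (empty if none)."""
    problems = []
    if topic.startswith("/"):
        problems.append("leading forward slash")
    if any(ch.isspace() for ch in topic):
        problems.append("contains whitespace")
    if len(topic.encode("utf-8")) > max_length:  # assumed, arbitrary limit
        problems.append("too long")
    if not all(0x20 <= ord(ch) <= 0x7E for ch in topic):
        problems.append("non-ASCII or non-printable character")
    return problems


def may_publish(client_id: str, topic: str) -> bool:
    """Practice 5: a client may only publish below its own client ID."""
    return topic.split("/")[0] == client_id


clean = lint_topic("myhome/livingroom/humidity")
flagged = lint_topic("/my home/livingroom")
own_topic = may_publish("client1", "client1/status")
foreign_topic = may_publish("client1", "client2/status")
```

A check like may_publish would normally live in the broker's authorization layer rather than in the client, since the client itself cannot be trusted to enforce it.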

Sources: www.hivemq.com, www.mosquitto.org, www.steves-internet-guide.com