Application Level Protocols

Application protocol design is a complex subject that extends beyond the scope of this guide. Instead, we will concentrate on the constraints imposed by Aeron Cluster on communication and explore how application protocols can be best adapted for this purpose.

Partially Synchronous

Aeron Cluster is built upon Raft, a partially synchronous consensus algorithm. Cluster nodes and clients interact continuously to confirm that every process is alive and responsive within specific time windows. This may lead you to expect near-synchronous (REST-like) interactions from Aeron Cluster. In fact, application level messages sent from the cluster to clients are driven entirely by application logic: it is the application logic within Aeron Cluster that determines when, and if, to respond to any given message.

This approach enables far more natural interactions. For example, for an auction that closes at 4pm, the application could schedule a timer for 4pm when the auction is created at 9am, and then emit an 'auction-closed' message when the timer fires. Aeron Cluster clients do not need to have sent a message in order to receive one from the cluster.
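The auction example can be sketched without any Aeron dependency. The following is a minimal model under stated assumptions, not Aeron Cluster's API: AuctionService, createAuction, advanceTime and the 'auction-closed:<id>' message text are all hypothetical names. In a real service the timer would be scheduled with the cluster and delivered back through the service's timer callback; here both are stood in for by advanceTime.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, Aeron-free model of a timer-driven service. All names are illustrative. */
final class AuctionService {
    // auctionId -> close deadline in ms; stands in for timers scheduled with the cluster
    private final Map<Long, Long> closeTimers = new HashMap<>();
    // messages the application logic has decided to emit to clients
    private final List<String> outbound = new ArrayList<>();

    /** Inbound command: record the auction and its close deadline. No reply is emitted. */
    void createAuction(long auctionId, long closeTimeMs) {
        closeTimers.put(auctionId, closeTimeMs);
    }

    /** Stand-in for timer delivery: fires every timer whose deadline has been reached. */
    void advanceTime(long nowMs) {
        closeTimers.entrySet().removeIf(timer -> {
            if (timer.getValue() <= nowMs) {
                // Emitted because the timer expired, not because a client asked
                outbound.add("auction-closed:" + timer.getKey());
                return true;
            }
            return false;
        });
    }

    List<String> outbound() {
        return outbound;
    }
}
```

Nothing is emitted when the auction is created; the only outbound message is the one produced once the deadline is reached, regardless of whether any client has sent anything in the meantime.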

This design contrasts with other approaches like Kafka or REST. In Kafka, without an additional application protocol in place, applications tend to attempt (and possibly verify) delivery to the Kafka Broker, and then stop worrying about the message. There is no time coupling enforced by Kafka between producers and consumers. In REST APIs, applications can expect a response from the server as soon as the request has completed processing. Both of these approaches can lead to technology-driven protocol changes that result in unnatural additions to the application logic.

General Notes

Detecting loss

For safe operation, Aeron Cluster clients should retain some state. This matters most in failure-related edge cases, such as the loss of the cluster leader node at the moment a cluster client sends a message. It is recommended to attach a unique correlation identifier (a GUID, a Snowflake ID, or similar) to each message, and to add message tracking logic. This makes message loss easier to detect and lets the client make informed decisions based on which messages were lost. The same identifiers can also be used as part of a message tracing infrastructure.
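One possible shape for the client-side tracking logic is sketched below. It assumes the application protocol has the cluster echo the correlation identifier back in some acknowledgement or response, which is a protocol design choice rather than something Aeron Cluster provides. PendingMessageTracker and its methods are hypothetical names, and the timeout is an arbitrary application-chosen value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical client-side tracker for messages awaiting an application-level reply. */
final class PendingMessageTracker {
    // correlationId -> time the message was sent, kept in send order
    private final Map<UUID, Long> sentAtMs = new LinkedHashMap<>();
    private final long timeoutMs;

    PendingMessageTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Call when sending: returns the correlation id to place in the outgoing message. */
    UUID onSend(long nowMs) {
        UUID correlationId = UUID.randomUUID();
        sentAtMs.put(correlationId, nowMs);
        return correlationId;
    }

    /** Call when the cluster echoes the id back; returns false for unknown ids. */
    boolean onAcknowledged(UUID correlationId) {
        return sentAtMs.remove(correlationId) != null;
    }

    /** Ids that have waited at least timeoutMs and may have been lost. */
    List<UUID> findSuspectedLost(long nowMs) {
        List<UUID> suspected = new ArrayList<>();
        for (Map.Entry<UUID, Long> entry : sentAtMs.entrySet()) {
            if (nowMs - entry.getValue() >= timeoutMs) {
                suspected.add(entry.getKey());
            }
        }
        return suspected;
    }
}
```

What the client does with a suspected loss (resend, query the cluster for the outcome, or alert an operator) depends on the message; a resend is only safe if the cluster-side logic can recognise a duplicate correlation identifier.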

Cluster Client Sessions

The onSessionMessage callback delivers each message to the implemented ClusteredService together with a ClientSession object. Note that if a client process connects and then reconnects following a network loss, it will be given a new ClientSession. Any data held to track specific cluster clients must take this into account. As a result, each client application should either add an on-connect protocol to introduce itself to the cluster, or include client identification information on each message.
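The on-connect option could look roughly like the sketch below, which keeps a mapping from a stable, application-assigned client identifier to whichever session currently represents that client. It is a simplified model with no Aeron dependency: ClientRegistry, onHello and the string client identifier are hypothetical, and sessions are represented only by a long identifier.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry mapping stable client identities to their current session. */
final class ClientRegistry {
    private final Map<String, Long> sessionByClient = new HashMap<>();
    private final Map<Long, String> clientBySession = new HashMap<>();

    /** Handle the client's introduction message, sent first on every new session. */
    void onHello(long sessionId, String clientId) {
        Long previousSession = sessionByClient.put(clientId, sessionId);
        if (previousSession != null) {
            // The client reconnected: forget the session it used before
            clientBySession.remove(previousSession);
        }
        clientBySession.put(sessionId, clientId);
    }

    /** Handle a closed session; a stale close must not drop a newer mapping. */
    void onSessionClose(long sessionId) {
        String clientId = clientBySession.remove(sessionId);
        if (clientId != null) {
            sessionByClient.remove(clientId, sessionId);
        }
    }

    /** Returns the client id for a session, or null if the session never said hello. */
    String clientFor(long sessionId) {
        return clientBySession.get(sessionId);
    }

    /** Returns the current session id for a client, or null if it is not connected. */
    Long sessionFor(String clientId) {
        return sessionByClient.get(clientId);
    }
}
```

State that must survive a reconnect, such as open orders or subscriptions, would then be keyed by the client identifier rather than by the session. If such a registry lives inside the clustered service, it is part of the service state and would need to be included in snapshots.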

Protocol Encoding

Aeron Cluster imposes no restriction on the encoding technology used for the application level protocol. However, some encoding techniques are significantly more computationally expensive than others. If the system you are delivering does not have strict performance requirements, then almost any serialization technology, such as JSON or Protobuf, will be suitable. If, however, the system has very strict performance requirements, then a zero-copy encoding technology such as Simple Binary Encoding will be necessary.
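To give a feel for what a fixed-layout encoding involves, the sketch below writes and reads the fields of a message at fixed offsets directly in a buffer, without building an intermediate object or parsing text. It is hand-written for illustration and is not Simple Binary Encoding or code generated by it; BidCodec, its field names and its 20-byte layout are assumptions.

```java
import java.nio.ByteBuffer;

/** Hypothetical hand-rolled codec: every field lives at a fixed offset in the buffer. */
final class BidCodec {
    static final int AUCTION_ID_OFFSET = 0;  // long, 8 bytes
    static final int PRICE_OFFSET = 8;       // long, 8 bytes
    static final int QUANTITY_OFFSET = 16;   // int, 4 bytes
    static final int LENGTH = 20;

    private BidCodec() {
    }

    /** Writes the fields in place, starting at the given offset within the buffer. */
    static void encode(ByteBuffer buffer, int offset, long auctionId, long price, int quantity) {
        buffer.putLong(offset + AUCTION_ID_OFFSET, auctionId);
        buffer.putLong(offset + PRICE_OFFSET, price);
        buffer.putInt(offset + QUANTITY_OFFSET, quantity);
    }

    // Each accessor reads a single field straight from the buffer; nothing is copied out
    static long auctionId(ByteBuffer buffer, int offset) {
        return buffer.getLong(offset + AUCTION_ID_OFFSET);
    }

    static long price(ByteBuffer buffer, int offset) {
        return buffer.getLong(offset + PRICE_OFFSET);
    }

    static int quantity(ByteBuffer buffer, int offset) {
        return buffer.getInt(offset + QUANTITY_OFFSET);
    }
}
```

A receiver that only needs the price reads eight bytes at a known position and touches nothing else. A text format such as JSON would typically have to be parsed before any single field could be read.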