Around 2007, Martin Sustrik began working on a new messaging system. Sustrik had worked on OpenAMQ and in the AMQP working group and, along with a small team of specialists and with support from iMatix, decided to build something that would improve on the existing AMQP work. Drawing on his experience within the AMQP project and on the burgeoning brokerless messaging systems, the team decided to create a brokerless messaging system with AMQP compatibility that would be suited to high-speed market data distribution. As the product developed the focus changed, and the project became the LGPL-licensed open source messaging library ZeroMQ.
“Mainframes got much of their power from clever messaging, transaction processing systems like IBM CICS. But today even 1980’s-standard middleware - unlike databases, operating systems, compilers, editors, GUIs, and so on - is still not widely available to ordinary developers. The software industry is producing various business applications and pieces of applications, and the tools to make these, in ever greater quantities, and ever lower prices, but the messaging bit is still missing. The lack of a way to connect these applications has become not just an unconquered terrain, but also a serious bottleneck to growth, especially for new start-ups that could in theory compete aggressively with larger, older firms, if they were able to cheaply combine existing blocks of software.”
- Martin Sustrik1
ZeroMQ is a powerful and interesting library, both for the design decisions taken and because it has grown into one of the most widely used brokerless messaging systems. While other brokerless systems are almost certainly used in large networks, the high cost and relative scarcity of individual deployments mean it’s difficult to see these systems at work and extract lessons from them. The openness and availability of ZeroMQ have led to its deployment everywhere from financial services to cloud system management to web servers, and much more.
Brokerless messaging played well with developers and designers of smaller applications as it removed the need to maintain the broker. Because brokers do a lot of different things, they tend to come with equally complex administration requirements, and a need for support and configuration. ZeroMQ was distributed just as a library, without any kind of service or daemon to start. This made it an easy way for teams to experiment with and gain the benefits of messaging when they had a need to communicate between different systems, but not the resources for an extra component in their architecture. ZeroMQ was one of the major tools that accomplished, and popularised, that approach.
“What we need is something that does the job of messaging but does it in such a simple and cheap way that it can work in any application, with close to zero cost. It should be a library that you just link with, without any other dependencies. No additional moving pieces, so no additional risk. It should run on any OS and work with any programming language.”
- Pieter Hintjens2
One of the most interesting design decisions in ZeroMQ was to simplify the API to be similar to the existing BSD socket API. The BSD API, and the POSIX API derived from it, is the standard way of interacting with the network in most modern operating systems. Each side of a connection is a socket, which either connects out to a service, or binds to an interface (a network card, or a location on the machine itself) and listens for incoming connections. This simple programming model of bind(), connect(), send() and recv(), along with some socket creation options, was familiar to many existing network developers. More so than most messaging APIs, the ZeroMQ API was built with the express intention of being easy and familiar to use, driven by people like Sustrik, Martin Lucina and Pieter Hintjens.
Network programming directly against the BSD sockets API is also reasonably straightforward, but it pushes a lot of responsibility onto the programmer. Developers are expected to check how much data was actually sent, manage the separation of individual chunks of information, detect failed connections, and wait for connection establishment. ZeroMQ provided a similar API, but a more intuitive one. The library took care of connection establishment, delivery, reconnection and atomic transmission of messages: either the other side got the whole message, or it wasn’t notified at all. ZeroMQ also provided queues on both sides of a connection to buffer messages being delivered, with size controls to protect producers from slow consumers.
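To make that burden concrete, here is a minimal sketch of the bookkeeping a developer has to write on a raw stream socket just to get whole messages across. It uses only the Python standard library. The 4-byte length prefix and the helper names (`send_message`, `recv_exactly`, `recv_message`, `demo`) are assumptions made for illustration; this is not ZeroMQ’s wire format or API, only the kind of work the library takes off the programmer’s hands.

```python
import socket
import struct

# Hypothetical framing: a 4-byte big-endian length, then the payload.
# This is an illustrative choice, not ZeroMQ's wire protocol.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one length-prefixed message; sendall() retries partial writes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock: socket.socket, count: int) -> bytes:
    """Call recv() repeatedly until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer went away mid-message.
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Return one whole message, or raise if the connection fails first."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


def demo() -> list:
    """Send two messages over a connected socket pair and read them back."""
    left, right = socket.socketpair()
    try:
        send_message(left, b"first")
        send_message(left, b"second message")
        # Without framing, both payloads could arrive as one undivided
        # run of bytes; the length prefix restores message boundaries.
        return [recv_message(right), recv_message(right)]
    finally:
        left.close()
        right.close()
```

Even this sketch leaves out reconnection, waiting for the peer to become available, and buffering for slow consumers, all of which the paragraph above lists as handled by the library.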
The simplicity was echoed in the functionality of the library itself: messages were opaque blobs of data, with any structure being defined at the application level. This ethos was also reflected in the standard design advice to be Internet-like, with smart edges and a dumb core. ZeroMQ topologies are networks of small components that distribute messages as required to the eventual endpoints.
Another interesting idea in ZeroMQ is its flexibility in terms of what protocols it runs over. Rather than being tied to certain transports, the library allows sending ZeroMQ messages over OpenPGM multicast, TCP sockets across a network, IPC sockets within a single machine, or inproc (in-process) between two threads in a single process. At a higher level, the design of ZeroMQ was such that it was natural to layer other protocols above it, allowing users to create their own systems which provide precise application semantics while relying on the functionality provided by lower layers. Because ZeroMQ defines its messages as opaque blobs of data, formats such as JSON, MessagePack, Protocol Buffers and more have been used to create next-layer protocols. These range from application-specific tools to generic frameworks such as Pieter Hintjens’ Majordomo protocol for reliable messaging or Paul Colomiets’ Extensible Statistics Transmission Protocol for distributed monitoring transport.
Messaging Patterns
One of the key design decisions that lifted ZeroMQ above a simple peer-to-peer message passing library was how it treated messaging patterns. It blended approaches from JMS, where the API defined pub/sub and point-to-point but not how to implement them, and AMQP 0-9-1, where the specification offered the components of patterns but not how to build them. ZeroMQ defines some fundamental messaging patterns, and assumes more complex functionality will be layered on top.
These patterns fundamentally revolve around the concepts of fan-in, fan-out, and pub/sub. ZeroMQ exposed DEALER and PUSH sockets, which offered the ability to push messages to a load-balanced set of endpoints; PULL, which could fair-queue between a number of streams of incoming messages; ROUTER, which allowed sending messages to specific connected parties via a key; and PUB and SUB for the two sides of a pub/sub distribution, with filtering of messages based on prefix matches. The use of messaging patterns has subsequently been echoed in several projects, notably Martin Sustrik’s Nanomsg3, and projects like Axon4.
The PUSH/PULL pattern was used for one-directional scatter/gather pipelines, where a number of points fan out to more nodes, then fan in to fewer. DEALER and ROUTER were flexible, bidirectional, general-purpose message sockets, typically used for building more complex patterns, and PUB/SUB was used for distribution to groups, with filtering.
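The two halves of such a pipeline can be modelled in a few lines of plain Python. This is a simplified sketch of the distribution policies only, not of ZeroMQ itself: the function names `push_round_robin` and `pull_fair_queue` are hypothetical, and the model assumes every peer is always connected and ready, ignoring connection state and queue limits.

```python
from collections import deque
from itertools import cycle


def push_round_robin(messages, peer_names):
    """Fan out: hand each message to the next peer in turn."""
    delivered = {name: [] for name in peer_names}
    for message, name in zip(messages, cycle(peer_names)):
        delivered[name].append(message)
    return delivered


def pull_fair_queue(streams):
    """Fan in: take one message from each non-empty stream in turn."""
    # Copy each stream into its own queue and drop the empty ones.
    queues = deque(q for q in (deque(s) for s in streams) if q)
    merged = []
    while queues:
        current = queues.popleft()
        merged.append(current.popleft())
        if current:
            # This stream still has messages; send it to the back of the line.
            queues.append(current)
    return merged
```

For example, `push_round_robin(["a", "b", "c"], ["w1", "w2"])` gives `w1` the messages `a` and `c` and gives `w2` the message `b`, while `pull_fair_queue` interleaves its inputs so that one busy sender cannot starve the others.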
The pub/sub pattern implemented flexible prefix matching, where the first part of the message is matched against a list of stored prefix-based subscriptions. A subscriber was able to register any number of prefixes to subscribe to. This was done efficiently through the use of a trie data structure, a technique that allowed a significant improvement in latency even in the face of many subscriptions and many messages.
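A minimal sketch of prefix matching with a trie follows. It illustrates the general technique rather than ZeroMQ’s actual implementation; the class name `PrefixTrie` and its methods are hypothetical. The point to notice is that matching walks the message one byte at a time, so its cost depends on the length of the prefix examined and not on how many subscriptions are stored.

```python
class PrefixTrie:
    """Stores byte-string prefixes; each node represents one byte of a prefix."""

    def __init__(self):
        self.children = {}  # maps a byte value (int) to a child PrefixTrie
        self.count = 0      # number of subscriptions that end at this node

    def subscribe(self, prefix: bytes) -> None:
        """Register a prefix, creating nodes along its path as needed."""
        node = self
        for byte in prefix:
            node = node.children.setdefault(byte, PrefixTrie())
        node.count += 1

    def matches(self, message: bytes) -> bool:
        """Return True if any registered prefix is a prefix of `message`."""
        node = self
        if node.count:
            # An empty prefix was registered, which matches every message.
            return True
        for byte in message:
            node = node.children.get(byte)
            if node is None:
                return False  # no stored prefix continues along this path
            if node.count:
                return True   # a complete stored prefix has been consumed
        return False


subscriptions = PrefixTrie()
subscriptions.subscribe(b"weather.")
subscriptions.subscribe(b"stocks.NYSE")
```

With these two subscriptions, `subscriptions.matches(b"weather.london 12C")` and `subscriptions.matches(b"stocks.NYSE.IBM")` are true, while `subscriptions.matches(b"stocks.LSE.BP")` is false.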
If more complex filtering or routing was required, additional hops could be introduced at low cost with what were called devices or proxies, a name chosen in large part to distinguish them from brokers or other heavyweight services. These small applications would pipe messages from one connection to another, and at that point had the ability to add extra layers of filtering, with custom application code doing whatever work was required to transform and emit messages.
This combination of a performant message passing library and effective messaging patterns made it possible to implement complex functionality from relatively few components, and has made ZeroMQ a popular choice for gluing together many types of distributed systems. Some of its biggest users came from a whole new group of developers who were approaching messaging again from ground zero.