Tuesday 18 November 2008

A brief introduction to the service integration bus

When Graham started this blog in September, the idea was that we would talk about what is new in v7. It launched at the same time v7 shipped, so to deny a link would be ludicrous, but it has recently occurred to me that it might be useful to cover some of the basics too. I am basing this on a large number of conversations over the last month or so in which I had to explain the basics to people who had not quite grasped one of the concepts. So here I go:

What is a bus?

A bus is a number of things, which makes coming up with a single definition hard, but when we talk about a bus we mean all of the following:
  1. A name space for destinations.
  2. A cloud in which destinations are defined and to which client applications connect.
  3. A set of interconnected application servers and/or clusters that co-operate to provide messaging function.
  4. A set of interconnected messaging engines that co-operate to provide messaging function.
While some of these might seem similar, they are in fact different, and the difference should become clear later on (the statements are not quite equivalent).

What is a destination?

A destination is a point of addressability within the bus. Messages are sent to and received from destinations. There are a number of different types of destination. These are:
  1. A queue. This provides point-to-point messaging capabilities. A message is delivered to exactly one connected client, and messages are broadly processed in first-in, first-out order (message priority can affect this order).
  2. A topic space. This provides publish/subscribe messaging capabilities. A message is delivered to all matching connected clients.
There are other types of destination, but they are less common, so I am skipping over them here.
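To make the difference between the two concrete, here is a minimal toy model in Python. It is not the service integration bus API; the class names, the exact-match topic rule, and the topic names are all invented for illustration. It only shows the delivery semantics described above: a queue hands each message to exactly one receiver, in priority then arrival order, while a topic space gives every matching subscriber a copy.

```python
import heapq
import itertools


class Queue:
    """Toy point-to-point destination: each message goes to exactly one consumer."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def send(self, body, priority=0):
        # Higher priority is delivered first; equal priorities stay first in, first out.
        heapq.heappush(self._heap, (-priority, next(self._seq), body))

    def receive(self):
        # Removing the message means no other consumer can receive it.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


class TopicSpace:
    """Toy publish/subscribe destination: every matching subscriber gets a copy."""

    def __init__(self):
        self._subscribers = []  # (topic, inbox) pairs

    def subscribe(self, topic):
        inbox = []
        self._subscribers.append((topic, inbox))
        return inbox

    def publish(self, topic, body):
        for subscribed_topic, inbox in self._subscribers:
            if subscribed_topic == topic:  # exact match only, an assumption of this sketch
                inbox.append(body)


# Hypothetical destinations and topic names, purely for illustration.
orders = Queue()
orders.send("first")
orders.send("second")
orders.send("urgent", priority=9)
received = [orders.receive() for _ in range(3)]

prices = TopicSpace()
alice = prices.subscribe("stock/IBM")
bob = prices.subscribe("stock/IBM")
carol = prices.subscribe("stock/XYZ")
prices.publish("stock/IBM", "quote")
```

Here the queue returns the high-priority message ahead of the two earlier ones, and the published quote lands in both matching inboxes but not in the third.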

What is a messaging engine?

A bus is a logical entity and as such provides no value on its own. The "runtime" of the bus is provided by a set of co-operating messaging engines. Messaging engines provide two important functions. The first is that clients connect to messaging engines, and the second is that messages are managed by the messaging engine.

What is a bus member?

While messaging engines provide the runtime, they are not directly configurable (with one key exception I will cover later). Instead, servers and/or clusters are added to the bus. When a server or cluster is added to the bus, a single messaging engine is created. A server bus member can host at most one messaging engine per bus. A cluster bus member can host several, and adding engines to a cluster is the only time you create messaging engines yourself.

Destinations are then "assigned" to a bus member, at which point the messaging engines running on that bus member get something called a message point, which is where the messages are stored.

Multiple servers and clusters can be added to a single bus. This is important, and some discussions I have had recently suggest it is a source of confusion. Two different servers or clusters can be added to the same bus. A bus can be as large as the cell in which it is defined. It can be larger than a single cluster.
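The relationship between a bus, its members, their messaging engines, and message points can be sketched as a small data model. This is a conceptual illustration only, not the product's configuration model or API: the class names, the engine naming scheme, and the member names are assumptions I have made up for the example.

```python
class MessagingEngine:
    def __init__(self, name):
        self.name = name
        self.message_points = {}  # destination name -> list of stored messages


class BusMember:
    def __init__(self, name, is_cluster=False):
        self.name = name
        self.is_cluster = is_cluster
        # Adding a member to the bus creates a single messaging engine.
        # The ".000" naming is invented for this sketch.
        self.engines = [MessagingEngine(f"{name}.000")]

    def add_engine(self):
        # Only a cluster bus member may host more than one engine.
        if not self.is_cluster:
            raise ValueError("a server bus member hosts at most one messaging engine per bus")
        engine = MessagingEngine(f"{self.name}.{len(self.engines):03d}")
        self.engines.append(engine)
        return engine


class Bus:
    def __init__(self, name):
        self.name = name
        self.members = {}       # member name -> BusMember
        self.destinations = {}  # destination name -> owning member name

    def add_member(self, name, is_cluster=False):
        member = BusMember(name, is_cluster)
        self.members[name] = member
        return member

    def assign_destination(self, destination, member_name):
        member = self.members[member_name]
        self.destinations[destination] = member_name
        # Each engine on the owning member gets a message point for the destination.
        for engine in member.engines:
            engine.message_points[destination] = []


# One bus with two members: a server and a cluster (hypothetical names).
bus = Bus("demoBus")
server = bus.add_member("server1")
cluster = bus.add_member("clusterA", is_cluster=True)
cluster.add_engine()
bus.assign_destination("ordersQueue", "clusterA")
```

Note that the bus holds both a server and a cluster as members, and that the destination lives on the member it was assigned to, not on the bus as a whole.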

How does High Availability work?

A certain level of availability is provided just by adding multiple application servers to a bus, because a client can connect to any running messaging engine. The problem is that if a messaging engine is not running, the message points it manages are not available. This does not provide an ideal HA story.

If you want HA, you use a cluster as a bus member instead. When you add a cluster as a bus member you get one messaging engine, which can run on any application server in that cluster. If the server hosting the messaging engine fails, the messaging engine is started on another server in the cluster. This can be configured using a policy.
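The failover idea can be sketched in a few lines. This is a toy model, not how the product decides where to run an engine: picking the first running server in list order is an assumption of the sketch, standing in for whatever the configured policy would choose, and the server names are hypothetical.

```python
class ClusterBusMember:
    """Toy cluster bus member with one messaging engine that can move between servers."""

    def __init__(self, servers):
        self.running = {server: True for server in servers}
        self.engine_host = servers[0]  # the engine starts on the first server

    def fail(self, server):
        self.running[server] = False
        # Only restart the engine elsewhere if its current host was the one that failed.
        if self.engine_host == server:
            self.engine_host = self._pick_host()

    def _pick_host(self):
        # Assumption: choose the first running server; a real policy may differ.
        for server, up in self.running.items():
            if up:
                return server
        return None  # no server left, so the engine (and its message points) is unavailable


cluster = ClusterBusMember(["s1", "s2", "s3"])
cluster.fail("s1")
host_after_first_failure = cluster.engine_host
cluster.fail("s3")
host_after_unrelated_failure = cluster.engine_host
```

The engine moves only when its own host fails; losing a different server in the cluster leaves it where it is.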

How does Scalability work?

Scalability also utilizes application server clusters. When you configure multiple messaging engines in a cluster, each messaging engine in the cluster has a message point for the destinations the cluster manages. We call this a partitioned destination, because each messaging engine only knows about a subset of the messages on the destination.

The upshot of all this is that the workload is shared by multiple servers.
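A partitioned destination can be pictured like this. The sketch below is illustrative only: spreading messages round-robin across the message points is my assumption for the example, not a description of how the bus actually balances work, and the engine names are made up.

```python
import itertools


class PartitionedQueue:
    """Toy partitioned destination: one message point per messaging engine."""

    def __init__(self, engine_names):
        self.message_points = {name: [] for name in engine_names}
        # Assumption: simple round-robin placement across the engines.
        self._next_engine = itertools.cycle(engine_names)

    def send(self, body):
        engine = next(self._next_engine)
        self.message_points[engine].append(body)
        return engine


queue = PartitionedQueue(["engineA", "engineB"])
for n in range(4):
    queue.send(f"msg{n}")
```

After four sends, each message point holds two of the messages, so neither engine knows about the whole destination.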

And finally


So there we have it. The infocenter does cover a lot of this in more detail. I have linked the titles to the appropriate part of the infocenter for learning about the topics.

If you have any questions feel free to ask in the comments.
Alasdair