How many full repositories should you have in a WebSphere MQ cluster? The Queue Manager Clusters manual says “preferably two” and “Having only two full repositories is sufficient for all but very exceptional circumstances” (see doc).

The reason for having two rather than one is availability: a single full repository is a single point of failure. “Single point of failure” is probably somewhat of an overstatement, though, as the impact of all full repositories being unavailable is that definitional changes (e.g. defining or altering cluster queues or cluster receiver channels) cannot be shared with the cluster.

Application messages between available cluster queue managers can still flow when full repositories are not available, so the impact is on cluster object changes rather than directly on application traffic.
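
For example, a definitional change like the following is the kind of update that would not be propagated while no full repository is reachable. This is a minimal MQSC sketch; the cluster name DEMO, queue name APP.Q and channel name TO.QM1 are placeholders for illustration only:

  * Run in runmqsc on a queue manager in the cluster:
  DEFINE QLOCAL(APP.Q) CLUSTER(DEMO)
  ALTER CHANNEL(TO.QM1) CHLTYPE(CLUSRCVR) DESCR('cluster receiver for QM1')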

I recommend having exactly two full repositories per cluster unless you have a very good reason not to. This is by far the most common cluster configuration. “Good” reasons (as judged by users, rather than by me) for not having exactly two include…

  • It’s a development system and I’m just playing; a single full repository will suffice.
  • In our environment some full repositories are difficult to reach or are regularly deleted (it happens!), so having more than two full repositories may be useful.
  • Someone architected it like that years ago, it works, and I don’t want to change it.

So are there any reasons not to have more than two full repositories? Yes, including…

  1. The more full repositories you have, the more internal messages flow between them.
  2. Partial repositories only publish to two full repositories anyway.
  3. The more full repositories you have, the less flexible the cluster is to changing the set of full repositories. This is because every full repository should have manually defined cluster sender channels to every other full repository in the cluster, because “full repositories republish the publications they receive through the manually-defined CLUSSDR channels, which must point to other full repositories in the cluster” (see previous doc link). Full repositories will not republish the publications they receive to other full repositories over auto-defined-only channels. So when adding a full repository you should define a cluster sender channel to every other full repository, and also go round all the other full repositories and define a cluster sender channel back to the full repository being added (see the sketch after this list). The more full repositories you have, the more queue managers you will have to administer when changing the set of full repositories. This overhead is not associated with adding a partial repository, which simply requires that you define objects on the partial repository being added to the cluster.
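
To make the administrative difference concrete, here is a hedged MQSC sketch of adding a third full repository (QM5) to a cluster whose existing full repositories are QM1 and QM2. The cluster name DEMO, the TO.qmgr channel naming convention and the host/port connection names are assumptions for illustration only:

  * On QM5, the new full repository:
  ALTER QMGR REPOS(DEMO)
  DEFINE CHANNEL(TO.QM5) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('qm5host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm1host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM2) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm2host(1414)') CLUSTER(DEMO)

  * On QM1, and again on QM2, the existing full repositories:
  DEFINE CHANNEL(TO.QM5) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm5host(1414)') CLUSTER(DEMO)

  * By contrast, a new partial repository (QM6) only needs definitions on itself:
  DEFINE CHANNEL(TO.QM6) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('qm6host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm1host(1414)') CLUSTER(DEMO)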

It is possible for a cluster to work when the full repositories are not fully interconnected with one another, which avoids the less-flexible-cluster argument but creates another problem: definitional changes may not reach all of the full repositories, especially when channels are down, leaving some cluster queue managers with an out-of-date view of the cluster. For instance, imagine we have four full repositories, connected by manually defined cluster sender channels, as follows (sketched in MQSC after the list):

  • QM1 has a cluster sender to QM2
  • QM2 has a cluster sender to QM1 and QM3
  • QM3 has a cluster sender to QM2 and QM4
  • QM4 has a cluster sender to QM3
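
Expressed as manually defined channels, that chain might look something like the sketch below. Only the cluster sender definitions are shown, each queue manager would also have its own cluster receiver channel, and the names and connection details are again illustrative assumptions:

  * On QM1:
  DEFINE CHANNEL(TO.QM2) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm2host(1414)') CLUSTER(DEMO)
  * On QM2:
  DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm1host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM3) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm3host(1414)') CLUSTER(DEMO)
  * ...and similarly on QM3 and QM4, following the bullet list above.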

If you alter a cluster object on QM1 the change is sent via the manually defined cluster sender channels. QM1 sends the change to QM2, which sends it to QM3, which sends it to QM4. In this manner, all full repositories become aware of the change.

So imagine that QM2 is down and you alter a cluster object on QM1. The change cannot be sent to QM2, so QM3 and QM4 (which are both running) do not receive the change either. The change information will sit on the SYSTEM.CLUSTER.TRANSMIT.QUEUE on QM1 until QM2 is once again available, at which point the change will flow to QM2 and then on to QM3 and QM4. Using this model, full repositories are dependent on their own availability and on that of other queue managers to keep an up-to-date view of the cluster.
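
You can observe this situation on QM1 with a couple of standard runmqsc display commands. The channel name TO.QM2 follows the naming convention assumed in the sketches above; with QM2 down, the channel will typically show as retrying while the change waits on the transmission queue:

  DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH
  DISPLAY CHSTATUS(TO.QM2) STATUS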

And now imagine that all full repositories are fully interconnected with manually defined cluster sender channels. If QM2 is down and a change is made to a cluster object on QM1, both QM3 and QM4 receive the change (directly from QM1). The change information will sit on the SYSTEM.CLUSTER.TRANSMIT.QUEUE on QM1 until QM2 is once again available, at which point the change will flow to QM2. Using this model, full repositories are dependent only on their own availability to keep an up-to-date view of the cluster.

In summary: have exactly two full repositories per cluster unless you have a very good reason not to, and fully interconnect them with manually defined cluster sender channels.
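
As a closing illustration, here is a minimal MQSC sketch of that recommendation for two full repositories, QM1 and QM2. The cluster name DEMO, the TO.qmgr channel names and the connection names are assumptions rather than a definitive configuration:

  * On QM1:
  ALTER QMGR REPOS(DEMO)
  DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('qm1host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM2) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm2host(1414)') CLUSTER(DEMO)

  * On QM2:
  ALTER QMGR REPOS(DEMO)
  DEFINE CHANNEL(TO.QM2) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('qm2host(1414)') CLUSTER(DEMO)
  DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('qm1host(1414)') CLUSTER(DEMO)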
