The changes to Daylight Saving Time (DST) in the United States this year raised questions about the way time is handled by various software programs.
The implications of the DST change for WebSphere MQ are fairly straightforward. In summary, there are no issues with the WMQ runtime (on either the servers or clients) as WebSphere MQ uses UTC and is essentially oblivious to DST. There are some issues surrounding the Java runtime environments supplied with WebSphere MQ. This is all well documented in a TechNote on IBM.com.
Despite this, the DST changes this year were a good reminder of the implications of making changes to the system clock on servers. In this post I will discuss some implications with WebSphere MQ.
Duplicate MsgIds, CorrelIds, GroupIds, ConnIds
WebSphere MQ generates unique identifiers for a number of values, such as the unique message identifier, MsgId. Although they aren’t straightforward timestamps, part of the value is generated based upon a timestamp in UTC. As such, changes to the system clock do make it possible for the queue manager to generate values for these identifiers which have been used previously.
This should not cause any errors within the queue manager. WebSphere MQ allows applications to generate identifiers externally to the queue manager, which therefore may reuse identifiers (although this is something that we strongly discourage). As a result, duplicate identifiers have never been something that we could absolutely prevent, and the queue manager is therefore generally able to handle them.
It is possible that duplicate identifiers may cause problems in the logic of some WebSphere MQ applications which rely on these IDs. For example, they may cause applications to get an incorrect reply message, message groups to contain incorrect members, or message sequence numbers to be reused. How significant a problem this will be is entirely dependent on the design of the WebSphere MQ application, and how it handles these values.
WebSphere MQ Publish/Subscribe stores information on SYSTEM.BROKER queues, and uses MsgIds to retrieve it. As such, it is possible that duplicate message identifiers could cause errors in the WebSphere MQ Publish/Subscribe Broker.
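To illustrate how a timestamp-derived identifier can repeat after the clock is moved backwards, here is a minimal sketch. It is not WebSphere MQ's actual identifier algorithm, which this post does not describe. The names `FakeClock` and `make_id`, and the identifier layout, are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone


class FakeClock:
    """A settable stand-in for the system clock (hypothetical helper)."""

    def __init__(self, start):
        self.now = start

    def advance(self, seconds):
        # Normal passage of time.
        self.now += timedelta(seconds=seconds)

    def step(self, seconds):
        # An administrator changing the clock (negative = backwards).
        self.now += timedelta(seconds=seconds)


def make_id(qmgr_name, clock):
    # Assumption: an identifier built from a fixed name plus a UTC timestamp.
    # The real MsgId format is different; this only shows the failure mode.
    return f"{qmgr_name}:{clock.now.strftime('%Y%m%d%H%M%S%f')}"


clock = FakeClock(datetime(2007, 8, 22, 12, 0, 0, tzinfo=timezone.utc))
seen = set()
duplicates = []

# Generate one identifier per second for ten seconds.
for _ in range(10):
    seen.add(make_id("QM1", clock))
    clock.advance(1)

# Move the clock back five seconds, then carry on generating identifiers.
clock.step(-5)
for _ in range(10):
    new_id = make_id("QM1", clock)
    if new_id in seen:
        duplicates.append(new_id)
    seen.add(new_id)
    clock.advance(1)

print(len(duplicates))  # 5: one for each second that was "repeated"
```

In this toy model, every second that the clock replays yields an identifier that has already been issued. An application that matches replies on such a value could pick up the wrong message.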
Expiring messages
Consider a message which has a specified time-to-live after which it becomes eligible to be discarded (if it has not already been got from the destination queue). In these cases, the queue manager stores the time that the message arrives, and uses this to perform comparisons with the current time to identify if a message should expire.
If the system clock changes after such a message is put, then it is possible that messages may be expired too soon, or not expire when intended. This may cause a problem for applications which rely on message expiry.
With persistent messages, the stored arrival time will be written to disk and will be restored after the restart of a queue manager. Restarting a queue manager after changing the system time will therefore not resolve any such problems.
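The comparison described above can be sketched as follows. This is only a model of the idea (a stored arrival time compared with the current time), not the queue manager's implementation. `StoredMessage`, `has_expired` and the use of whole seconds are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredMessage:
    arrival_time: float   # wall-clock seconds when the message was put
    time_to_live: float   # seconds the message may remain on the queue


def has_expired(message, current_time):
    # Expiry is decided by comparing wall-clock readings, so the answer is
    # only as trustworthy as the clock that produced those readings.
    return current_time - message.arrival_time >= message.time_to_live


msg = StoredMessage(arrival_time=1_000.0, time_to_live=600.0)

# 60 seconds of real time pass, and the clock is left alone.
undisturbed = has_expired(msg, current_time=1_060.0)

# 60 seconds of real time pass, but the clock was moved forward an hour.
moved_forward = has_expired(msg, current_time=1_060.0 + 3_600.0)

# 700 seconds of real time pass, but the clock was moved back an hour.
moved_back = has_expired(msg, current_time=1_700.0 - 3_600.0)

print(undisturbed, moved_forward, moved_back)  # False True False
```

The second case expires a message that has only been on the queue for a minute, and the third keeps a message that should already have gone. Because the arrival time is part of the stored message, recomputing the comparison after a restart gives the same answers.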
MQGET with WAIT
Consider a WebSphere MQ application which performs an MQGET and specifies that it wants to wait for a message for a period of time before timing-out. If the system clock changes after the MQGET is issued, but before a message is got or the time-out occurs, then the time change may cause some applications to remain in the MQGET call for longer or shorter than was intended.
This is unlikely to cause a significant problem in most instances, although this is again dependent upon the application design. Restarting the queue manager would prevent any possible issues arising from this particular problem.
It is worth noting that waits will generally last for the requested interval regardless of time changes. WMQ typically requests notification from the operating system after a specified interval rather than at a specific end-time. As a result, WMQ’s behaviour in this regard will ultimately be dependent upon different operating systems’ handling of intervals in the event of time changes. However, this behaviour cannot be guaranteed and should not be relied upon.
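The difference between waiting for an interval and waiting until an end-time can be shown with a small model. The function names and the closed-form arithmetic below are my own illustration, not WMQ or operating system code.

```python
def real_wait_with_end_time(wait_seconds, step_at, step_by):
    """Real seconds spent waiting when the wait ends at a wall-clock time.

    step_at: real seconds into the wait at which the clock is changed.
    step_by: size of the change in seconds (negative = backwards).
    """
    if step_at >= wait_seconds:
        return wait_seconds  # the wait finished before the clock changed
    # After the change the wall clock reads (elapsed + step_by), so the
    # end-time is reached once elapsed + step_by == wait_seconds. The wait
    # cannot end before the moment the clock was changed.
    return max(step_at, wait_seconds - step_by)


def real_wait_with_interval(wait_seconds, step_at, step_by):
    # An interval timer counts elapsed time and ignores wall-clock changes.
    return wait_seconds


# A 30 second wait, with the clock changed by an hour 10 seconds in.
print(real_wait_with_end_time(30, step_at=10, step_by=-3600))  # 3630
print(real_wait_with_end_time(30, step_at=10, step_by=3600))   # 10
print(real_wait_with_interval(30, step_at=10, step_by=-3600))  # 30
```

Application code that implements its own time-outs can apply the same idea by measuring elapsed time with a clock that cannot be set, such as Python's `time.monotonic()`, rather than by comparing wall-clock readings.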
Trigger intervals
TriggerInterval is a queue manager attribute used to restrict the number of trigger messages. It is intended to allow for a queue server that ends before processing all the messages on the queue. The purpose of the trigger interval is to reduce the number of duplicate trigger messages that are generated.
Changes to the system clock during triggering may cause the trigger interval to elapse too early or too late. Whether this causes a problem depends on how significantly TriggerInterval is being relied upon in a given environment.
The trigger interval is reset when a queue manager is restarted, so restarting the queue manager after the system time is changed will avoid any possibility for problems in this area.
Batch heartbeat
Batch heartbeats allow sender channels to determine whether the remote channel instance is still active before going indoubt.
Changes to the system clock could cause an extra heartbeat to be generated or one to be missed. It is unlikely that this would cause a problem, but restarting affected channels should be sufficient to resolve any issues that arise.
Batch interval
Batch intervals are used to specify a time after which a batch should be committed even if it has not reached the specified size.
Changing the system clock during an in-flight batch could cause a batch to be submitted earlier (if the clock is moved forwards) or later (if the clock is moved backwards) than intended.
Whether this causes a significant problem will depend on how much the batch interval is relied upon to ensure that batches are committed. For example, in an environment where BatchSize is set so high that it is never reached, moving the system clock backwards a long way could result in a notable wait before messages are committed. Typically, however, most customers have BatchSize set to a value that is sufficient to avoid this.
This value will be reset when a channel is restarted, so restarting affected channels should be sufficient to resolve any issues that arise.
Queue service intervals
Queue service interval events indicate whether a queue was ‘serviced’ within a user-defined time interval. Changing the system time could affect this function - such as causing the queue manager to generate unnecessary queue service interval events if the clock is moved forwards (far enough to cause it to mistakenly think that the queue has not been serviced for too long), or to fail to generate an event if the clock is moved backwards.
Whether this causes a problem will depend on how these events are being handled and used. However, it is likely that unexpected queue service interval events will be relatively easy to match up with a server time change.
Other monitoring
WebSphere MQ v6 introduced the collection of a number of new statistics fields, such as QTIME - used to indicate the length of time that messages are staying on a queue. The queue manager reports these values based upon differences between timestamps. As such, in the event that the system time is changed, it is possible that these monitoring values may no longer be reliable.
These values are provided to aid user administration and problem diagnosis, and are not relied upon by the queue manager. As such, it is unlikely that this would cause any significant problems.
User-visible timestamps
There are places within WebSphere MQ where timestamps are collected for displaying to the user, such as the creation and last-alteration date of queue managers and queue manager objects. These values are not used by the queue manager for processing, so should not cause any problems other than potentially misleading or confusing a system administrator with values which may appear to be incorrect.
Avoiding problems in the first place
The best practice is to avoid any changes to the system clock in the first place. Once a system has become confused because of changes to the time, there may be no clear way to resolve the situation (e.g. to identify how messages with duplicate message identifiers should have been handled).
Where possible, changes to timezones are preferable to changes to the underlying system clock - as these are not subject to the problems outlined in my post. This is because the WMQ runtime uses UTC rather than local time, as mentioned in my introduction.
If a change to the system clock is unavoidable, a good precaution against the risk of duplicate identifiers mentioned above is to end the queue manager during the period when time will be “repeated” on the server. For example, if you need to move the clock back an hour, then end the queue manager for an hour. In this way, the queue manager should not experience any duplicate timestamps.
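The arithmetic behind that precaution is simple, but it is easy to get the direction wrong under pressure. The sketch below assumes the clock is moved back immediately after the queue manager is ended. `plan_outage` is a hypothetical helper, not an MQ command.

```python
from datetime import datetime, timedelta, timezone


def plan_outage(stopped_at, moved_back_by):
    """Work out when a queue manager can restart without repeating time.

    stopped_at: system clock reading (before the change) when the queue
        manager was ended.
    moved_back_by: how far the clock is then moved backwards.
    """
    clock_after_change = stopped_at - moved_back_by
    # Every reading up to stopped_at may already have been used, so wait
    # until the adjusted clock has caught up with that reading again.
    restart_not_before = stopped_at
    downtime = restart_not_before - clock_after_change
    return clock_after_change, restart_not_before, downtime


stopped = datetime(2007, 8, 22, 12, 0, tzinfo=timezone.utc)
after_change, restart, downtime = plan_outage(stopped, timedelta(hours=1))

print(after_change.isoformat())  # 2007-08-22T11:00:00+00:00
print(restart.isoformat())       # 2007-08-22T12:00:00+00:00
print(downtime)                  # 1:00:00
```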
Using NTP
It should be possible to run WebSphere MQ on servers using NTP (Network Time Protocol) to keep time synchronized across multiple machines.
The potential for generating duplicate identifiers discussed above can be lessened through the configuration of NTP to favour slewing rather than stepping the system clock. In this way, if the system clock is ahead of the correct time, it will be slowed down to allow the correct time to “catch up” rather than stepping the clock back to the correct time immediately. In this way, queue managers on that server may be less likely to encounter duplicate timestamps.
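The difference between stepping and slewing can be seen in a small simulation. This models the idea only and is not NTP code. The function names are hypothetical, and the slew rate of 0.0005 seconds per second is an illustrative value chosen for this sketch.

```python
def readings_with_step(error, seconds):
    """Clock starts `error` seconds fast and is stepped back after 1 second."""
    readings = []
    for t in range(seconds):
        offset = error if t < 1 else 0.0
        readings.append(t + offset)
    return readings


def readings_with_slew(error, seconds, slew_rate):
    """Clock starts `error` seconds fast and runs slow until it is correct."""
    readings = []
    offset = error
    for t in range(seconds):
        readings.append(t + offset)
        # Each real second, remove at most `slew_rate` seconds of the error.
        offset = max(0.0, offset - slew_rate)
    return readings


def repeats_or_goes_backwards(readings):
    # True if any reading is not strictly later than the one before it.
    return any(b <= a for a, b in zip(readings, readings[1:]))


stepped = readings_with_step(error=2.0, seconds=5)
slewed = readings_with_slew(error=2.0, seconds=5, slew_rate=0.0005)

print(stepped)                             # [2.0, 1.0, 2.0, 3.0, 4.0]
print(repeats_or_goes_backwards(stepped))  # True
print(repeats_or_goes_backwards(slewed))   # False
```

In the stepped case the reading 2.0 appears twice, which is the kind of repeated timestamp that could lead to a duplicate identifier. The slewed clock never repeats a reading, at the cost of taking much longer to become correct: at this rate, removing a 2 second error takes 4000 seconds.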
NTP adjustments are typically minor enough that the other time difference and interval-related issues discussed here are not noticeable.
Implications for non-distributed platforms
Please note that I am referring to distributed platforms in the discussion in this post.
It is worth highlighting that, unlike distributed platforms, there are in fact specific problems when making time changes on iSeries systems running WebSphere MQ due to the use of the journal. This is explained in the WebSphere MQ System Administration Guide for iSeries.
Summary
In this post, I have outlined a number of implications of changes to the time on a system with WebSphere MQ running. This should not be taken as a definitive list of all possible implications, and there may be other issues that I have not considered.
With one exception (potential problems in WebSphere MQ Publish/Subscribe in the event of duplicate MsgId values), none of these implications are issues which would cause errors or problems in the queue manager operation. And most of the issues can be resolved by restarting the queue manager or channels in use.
The implications outlined are generally issues which may cause confusion in the logic of applications connecting to WebSphere MQ if the unexpected behaviour is not handled correctly. For example, a change to system time which causes messages to be got later than intended is not going to cause errors within the queue manager, but may cause problems for the application concerned.
As such, it is worth considering the design of your WebSphere MQ applications in the light of such implications if significant time changes are required on production servers.

3 comments
August 22, 2007 at 4:39 am
Glenn
A very good post.
The UTC PutDate/PutTime is visible in the Message Descriptor and applications have been known to take these values to record timing information or calculate intervals. If the UTC system clock changes it can cause problems for such applications. I always recommend that applications include date/time stamps in the message data so that issues with UTC, time zone and DST changes are internalised to the application.
Some customers’ systems have their system clock set to the local time rather than UTC (i.e. the time zone offset is zero), therefore the PutDate/PutTime appears to applications on local and remote systems to be the local time and they code reliance on this into their applications. This can cause application problems when the system clock is changed twice a year for DST. The above recommendation also applies in this case.
A big disadvantage of setting the system clock to local time is that at the end of daylight saving the system has to be shut down for at least an hour to avoid duplicate system times upsetting logs, journals etc.
Glenn, IBM Australia
August 22, 2007 at 9:07 am
Dale Lane
@Glenn - Thanks very much
You mention a good precaution that I should have highlighted: if users cannot avoid moving the system clock backwards and want to protect against the risk of duplicate unique identifiers, then the safest precaution is to shut down the queue manager while the timestamps are repeated.
For example, if you need to move the clock back an hour, then end the queue manager for an hour so that it does not encounter duplicate timestamps.
I’ve updated my post to include this - thanks very much!