Thank you all for responding to my previous question; I greatly appreciate the help.
I do have another strange situation that I could use some assistance with.
One of the environments that I manage has a monitoring server with an event generator that checks the connection to every server in the network (and to the DDM.nsf replica on each of them) every 3 minutes. If a connection attempt times out, the event generator is configured to send out an alert email. I should mention that the servers are in different domains, but the monitoring server has valid connection documents to all of them.
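To make the behavior I expect concrete, here is a minimal sketch of the logic in Python. This is not Domino code and not how the event monitor is actually implemented; the function names (`run_check_cycle`, `probe`, `send_alert`) and the server names are hypothetical and only illustrate the check-then-alert flow described above.

```python
from typing import Callable, Iterable, List

CHECK_INTERVAL_SECONDS = 180  # the 3-minute probe interval described above


def run_check_cycle(
    servers: Iterable[str],
    probe: Callable[[str], bool],
    send_alert: Callable[[str], None],
) -> List[str]:
    """Probe every server once and alert immediately on each failure.

    `probe` (hypothetical) returns True when the connection succeeds and
    False when it times out. Returns the servers that failed this cycle.
    """
    failed = []
    for server in servers:
        if not probe(server):
            # Expected: the alert goes out in the same cycle as the failure,
            # not later when the monitoring task is restarted.
            send_alert(f"Connection to {server} timed out")
            failed.append(server)
    return failed


if __name__ == "__main__":
    sent: List[str] = []
    down = {"ServerB/DomainTwo"}  # hypothetical server that is offline
    failed = run_check_cycle(
        ["ServerA/DomainOne", "ServerB/DomainTwo"],
        probe=lambda name: name not in down,
        send_alert=sent.append,
    )
    print(failed)
    print(sent)
```

In other words, a failed probe and its alert should belong to the same cycle, with nothing queued in between.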
The problem is that when I bring down one of the servers, the event generator does not send out the alert email. By watching the JConsole on the monitoring server, I can see that the event generator runs its check and notices the failed connection, but it does not follow up by sending out the message as it should.
Now, here is where it gets weird. If I forcibly restart the event monitor task on the monitoring server, it then sends out the alert about the failed connection, and this can be hours after the event actually happened. It is as if the event generator and/or the event handler message is somehow getting stuck, and only when I restart the event monitor task does it release the message that it was supposed to send when the event first happened.
For some reason, this doesn't happen with every server that the generator is set up to watch. The event generator is the same one in every case; we simply added every server that it needs to check to that one event generator (it's only about 16 servers). The events4 and DDM databases are all replicas across the environment, and they are all replicating properly, so I cannot figure out what is causing this message lockup.
The other weird issue that I have noticed (and I'm not sure whether it is related to the event monitor issue) is that the monitoring server constantly logs a message that states "Suppression interval = 0", and it seems to throw 3-4 of them in a row right when it should be sending out one of those event notifications. I cannot find any reference to this specific message anywhere, and it does not seem to correspond to any ini setting on the server: none of the settings on the monitoring server have anything to do with any type of suppression interval.
If anyone has any insight into this problem or the console message, please let me know. I have tried contacting IBM support about this, but that was an exercise in futility.
Thank you all very much.