Service Broker Monitoring Routine

 

UPDATE:  Please see [[Monitoring Service Broker]] for an updated my updated monitoring routine.  

There are almost no GUI tools for managing Service Broker.  This isn't a problem for me, I rarely use GUI tools, but it is a problem for our support people.  In the previous post I covered my [[Service Broker Setup Routine]].  This routine enabled SB on a given database in a reliable, repeatable manner.  This helped our DBAs immensely.  

The next problem to overcome is how our DBAs can monitor Service Broker and its Queues, Services, etc.  In my experience SB is very "lights out", just like anything else in SQL Server, if done correctly.  But it's difficult to sell this to DBAs who really do not understand what Service Broker is.  So I created a monitoring script to put their minds at ease.  This monitoring routine is generic enough that it can be easily modified to suit any setup or configuration.  There are lots of SB monitoring scripts available on the web.  I've tried to compile them into one simple script that you can run that checks your configuration from the top down.  You can download the script here.  

Let me point out some interesting pieces of code:  

My monitoring procedure is actually a combination of an installation, monitor, and teardown script.  I do this so all logic is contained in one little place.  Here are the options: 

  • SETUP:  sets everything up.  It does a "properties-based" setup.  This means that SETUP can be rerun multiple times without throwing an error and will ensure that the net result of each run is that we have a running system.  
  • TEARDOWN:  the opposite of SETUP.  This is useful for unit testing the SETUP process.  
  • TEARDOWN FORCE OVERRIDE:  TEARDOWN will *not* do anything if there are unprocessed items in any queue, items stuck in the transmission q, etc.  This override option tears everything down regardless of state.  YOU WILL LOSE DATA using this option.  
  • CHECK:  this is the simplest monitoring tool, it ensures all Qs, metadata, and DMVs are showing everything as being good.  I'll cover this in-depth below.  
  • BROKER HELP:  This checks low level components of Service Broker.  I'll cover this in-depth below.  
  • TRACER TOKEN:  this works just like a tracer token in replication.  It sends a test transaction to our Svc and makes sure it properly reaches its destination.  
  • STALLED MESSAGE CLEANOUT:  I'll cover this below.  
This is part of BROKER HELP.  I am echoing the status of various SB components by looking at DMVs.  There is far more in the script.  
Here we are looking at PerfMon statistics for various SB-related counters.  
Activator procedure errors are echo'd to the SQL Error Log, so we display the last hour of that data.  We also build a SSBDIAGNOSE command for each service and then interrogate it for errors.  
STALLED MESSAGE CLEANOUT runs this code.  We are looking for conversations that are not closed and END them WITH CLEANUP.  

 

Other Things I Monitor

This is all in the script:  

  • Ensure SB is enabled in the db
  • We look for DROPPED queue monitors on activated queues
  • We look for queues in the NOTIFIED state for more than 10 seconds.  This could mean a really busy queue, or it could mean a problem with an activator throwing errors.  
  • Activator errors (we use PerfMon for this)
  • activated queues in a DISABLED state
  • "Poison Message" detection (queues that are no longer is_receive_enabled)
  • Conversation Population Explosion...check for an excessive amount of conversations not in a CLOSED state.  This may mean we are not issuing a CLOSE CONVERSATION correctly in our setup.  Or we have a "fire and forget" pattern, which is not right.  
  • conversations "stuck" in sys.transmission_queue

I hope you find this useful