
Welcome

You've found davewentzel.com.  I am an independent consultant and principal with David Wentzel LLC, focusing on anything related to data.  Here you'll find a collection of my thoughts, mostly on data architecture and SQL Server.  I host a blog as well as a Features section with searchable content.  I also showcase a Portfolio of open source projects I am working on, especially Drupal integration.  I host some family media to showcase some of those skills.  Lastly, I have done quite a few presentations on SQL Server topics that may interest you.  

Data is one of your company's most valuable assets.  We can help you make the most of it.  

Service Broker Demystified - Why is there no ALTER CONTRACT statement?

People tell me all the time that they don't want to use Service Broker because it is too confusing.  I started a blog series called Service Broker Demystified because SB really isn't that tough if you don't get lost in the weeds.  Today we are going to cover how you alter a Service Broker (SB) contract. 

Remember from the last post that a SB contract merely lists which services are allowed to use which message types.  Message types and contracts are both totally optional and your SB implementation will run fine without them.  When I'm first learning a new technology I try to avoid overloading my brain with optional features to limit potential confusion.  

A contract can specify multiple message types and who may use them (valid values are ANY, INITIATOR, and TARGET).  Let's start with a little demo.  If you want to follow along the full source code is available here.  Let's say that we've been tasked with designing an asynchronous inventory management system with SB.  Here are some message types that we may decide to create.  Hopefully the names are sufficient to understand the intent of the system.  
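Something like this, perhaps (only CancelInventoryItemRequest is referenced later in this post; the other names are hypothetical stand-ins for the missing screenshot):

CREATE MESSAGE TYPE [InventoryItemRequest] VALIDATION = WELL_FORMED_XML;  --hypothetical
CREATE MESSAGE TYPE [InventoryItemReply] VALIDATION = WELL_FORMED_XML;    --hypothetical
CREATE MESSAGE TYPE [CancelInventoryItemRequest] VALIDATION = EMPTY;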

And here is the associated contract that handles our inventory workflow (very rudimentary):

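A hedged sketch of that contract (the contract name is hypothetical; the SENT BY values reflect the original intent described below):

CREATE CONTRACT [InventoryContract]
(
    [InventoryItemRequest] SENT BY INITIATOR,
    [InventoryItemReply] SENT BY TARGET,
    [CancelInventoryItemRequest] SENT BY ANY
);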

Initially the workflow's intention was to have either the inventory requestor or the inventory system be able to cancel an inventory request.  But let's say you decide later that only the sender can cancel an inventory request...the inventory system can never cancel a request, rather it can only delay it.  In this case you must drop and recreate the CONTRACT...you can't alter it.  This command clearly will fail:
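For example, this hypothetical attempt fails with a syntax error, because ALTER CONTRACT simply does not exist in T-SQL:

ALTER CONTRACT [InventoryContract]
    ([CancelInventoryItemRequest] SENT BY INITIATOR);  --syntax error: there is no ALTER CONTRACT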

Instead we must DROP and reCREATE the CONTRACT.  Something like this:  
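A sketch of the drop/recreate, using the hypothetical names from above:

DROP CONTRACT [InventoryContract];
CREATE CONTRACT [InventoryContract]
(
    [InventoryItemRequest] SENT BY INITIATOR,
    [InventoryItemReply] SENT BY TARGET,
    [CancelInventoryItemRequest] SENT BY INITIATOR  --only the sender may cancel now
);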

Note that we have changed the contract such that the INITIATOR is the only party that can post messages of type CancelInventoryItemRequest.  

Why is there no ALTER CONTRACT?

In the real world you don't alter a legal contract.  You generally make a new version with a new set of signatures.  Certainly there are exceptions, but contracts aren't meant to be broken.  It's no different in the Service Broker world.  CONTRACTs are not meant to be broken.  Contracts are immutable.  However, situations such as my inventory example are commonplace.  When requirements, and hence CONTRACTs, must change BEFORE the system is released into the wild, it is certainly reasonable to simply DROP/CREATE the contract as shown above.  

However, this is frowned upon once other systems begin to send messages to your service using that contract.  Altering the contract could break other systems that rely on its functionality.  Since there may be a lot of moving parts and different systems with different release schedules, you can't guarantee that everyone will be able to adjust their functionality to the change in your contract.  Instead you should consider versioning your contracts and supporting multiple versions of a contract for a certain amount of time.  A simple naming convention like CancelInventoryItemRequest_v1 is usually sufficient.  

So if we should version our CONTRACTs, why not our SERVICEs...or...why is there an ALTER SERVICE command?

Versioning your SERVICEs is probably a best practice as well, but the fact is SQL Server does give us an ALTER SERVICE command.  Why?  

  • It may be necessary to temporarily alter a service to point to a different queue with different activation semantics.  
  • You can always ADD CONTRACT to an existing SERVICE...meaning that your service can adjust to support new features without service interruption.  
  • At a later date you can also ALTER SERVICE...DROP CONTRACT to deprecate the old functionality (both alterations are sketched below).  
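A sketch of both alterations, assuming a hypothetical InventorySvc and a versioned contract:

ALTER SERVICE [InventorySvc] (ADD CONTRACT [InventoryContract_v2]);
--...and once all callers have migrated to v2:
ALTER SERVICE [InventorySvc] (DROP CONTRACT [InventoryContract_v1]);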

Summary

So based on how Microsoft architected services, contracts, and message types, you can see that serious thought was given to versioning the critical objects to help you avoid service disruptions when your design changes.  Once you understand why Microsoft gives you these options, you start to better understand how Service Broker really works.  

 


You have just read "Service Broker Demystified - Why is there no ALTER CONTRACT statement?" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

Service Broker Demystified - Contracts and Message Types

People tell me all the time that they don't want to use Service Broker because it is too confusing.  There are too many moving parts.  What's the difference between a service and a queue?  How do I monitor Service Broker?  Where's the GUI?  I started a blog series called Service Broker Demystified because SB really isn't that tough if you don't get lost in the weeds.  

I was helping a customer with a new Service Broker implementation of my tickling pattern and I briefly said, "We'll worry about the Message Types and Contracts later...they really aren't required."  The customer immediately stopped me and wanted to know why.  

Many SB experts will disagree with me and state that contracts and message types are not optional...but they are.  You don't need either for any Service Broker implementation.  However, their use will help you avoid problems later.  I consider contracts and message types to be the "foreign keys" and "check constraints" of the Service Broker world.  You don't need FK and check constraints...but when you goof up your code you'll be glad you had them. 

What is a Message Type?

When you send a message to a service by default it will have no message type.  It's up to you, therefore, to "do the needful" with a given message.  Maybe you have a tickler where the message is merely asking the target service to "wake up" and do something.  In this case no message type is needed.  However, for clarity I would still create a message type and call it Tickler with VALIDATION = EMPTY...meaning I'm expecting it to have no payload.  

Perhaps your service is simple enough that you only ever expect one type of message with a little bit of XML to traverse from service to service.  You are fine with not validating your XML...you are a trusting soul.  Again, no message type is really needed.  But if you believe in "sanitizing your sql inputs" then you really do want to validate your XML.  In that case, you need a message type that is either WELL_FORMED_XML or, even better in most cases, VALID_XML with SCHEMA COLLECTION.  
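Hedged examples of each validation option (Tickler comes from the paragraph above; the other names and the schema collection are hypothetical):

CREATE MESSAGE TYPE [Tickler] VALIDATION = EMPTY;
CREATE MESSAGE TYPE [SimplePayload] VALIDATION = WELL_FORMED_XML;
CREATE XML SCHEMA COLLECTION ExpenseSchemas AS
N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="ExpenseReport" type="xsd:string"/>
  </xsd:schema>';
CREATE MESSAGE TYPE [ValidatedPayload] VALIDATION = VALID_XML WITH SCHEMA COLLECTION ExpenseSchemas;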

So, a message type is a lot like various constraints on a table.  

Type of "constraint" Message Type Equivalent Notes
no constraint on a NVARCHAR(MAX) column NONE With NONE you can do whatever you want with the Message Type payload.  Just like you can do just about anything with an NVARCHAR(MAX) column
CHECK and DEFAULT on a bit column EMPTY EMPTY means the message payload must not contain anything.  This is kinda/sorta like a NOT NULL specification on a bit column with a DEFAULT of 1 and CHECK CONSTRAINT of 1.  The column will be automatically set to the only possible value of 1.  EMPTY is used to flag that an event occurred.  The target service will have defined workflows that will respond to that event.  
a datatype declaration on a column WELL_FORMED_XML When you declare a datatype on a column you are specifying that only certain valid data will be allowed.  In this case only text that can be converted to the xml datatype will be accepted.  
CHECK constraint to enforce a mask on a SSN col VALID_XML WITH SCHEMA COLLECTION This states that the XML must be in a given format, much like a SSN is in the format "123-45-6789".  

A CHECK constraint limits the allowable values in a column to a defined domain.  A Message Type does the same thing for your XML message payloads.  As with a CHECK constraint, a message type is not needed, but you'll find you have better quality data, and fewer support issues, when you have them properly implemented.  

Whenever I design a new SB implementation I don't concern myself with the message types in the beginning.  I want to get a prototype up and running quickly.  After I have some stability I then add in these constraints.  

What is a Contract?

A contract is a binding agreement between a message type and who can utilize it.  If you don't have message types then you don't have contracts.  Period.  So again, I save contract implementation for a point in the project when I have a stable design.  

Service Broker conversations are always dialogs (that's why the syntax is BEGIN DIALOG CONVERSATION).  A dialogue in the English language, according to google, is a conversation between two or more people.  In SQL Server it is a little more strict...a dialog is EXACTLY two parties...an Initiator and a Target.  You can model monologs in Service Broker just like in other messaging systems, but it requires some extra effort and I'll cover that in a future blog post.  (If you can't wait for that post...for a monolog in SB you really need to have an activator on the initiator side that is ENDing CONVERSATION).  Again, we'll cover this in a future post since it is confusing and most noobs model their SB implementations as a series of monolog workflows, not dialogs (just my opinion).  

We got off subject...a contract merely says who can SEND the given message type.  Let's say you have SB services called EmployeeSvc and AccountingSvc.  Let's say you have a message type of ExpenseReportSubmission.  The goal is to allow the EmployeeSvc to submit an expense report to the AccountingSvc.  Let's model this as simply as possible with NO contracts or message types.  You can follow along by downloading the repro script.  

Here is the most basic setup...there are NO contracts or message types defined.  The [DEFAULT] contract is specified and the "contract clause" is necessary whenever you want to create a Service capable of receiving messages.  See my last post for some clarification on that bit of confusion.   If we don't at least provide the [DEFAULT] contract we'll get this error in the sending queue:  

<Error>
  <Code>-8408</Code>
  <Description>Target service 'AccountingSvc' does not support contract 'DEFAULT'.</Description>
</Error>
 
Note on Line 35 that our message payload is empty.  That's fine because the [DEFAULT] contract on the receiving service is set to VALIDATION=NONE.  Hence we get no error and the message is properly enqueued in the AccountingQ with a NULL message_body.  
 
Of course this design is silly...you certainly wouldn't want to allow NULL messages in an expense submission workflow...but you can when you don't properly declare your "constraints".  You can send just about any message and the AccountingSvc will accept it without validating it.  Here is an expense submission using simple text and note that it too does not fail.  

At this point it would be the responsibility of whatever program or activator is processing the AccountingQ to perform any message validation and send any errors back to the sending service.  But it's probably better design to implement those constraints as message types and contracts.  We do that by creating a MESSAGE TYPE that specifies that we will only accept WELL_FORMED_XML.  We can of course be even more precise, but this should be sufficient to illustrate the point.  We then create a contract that specifies that our new message type can only be sent by the initiator.  Finally we bind the contract to the existing service.  
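A hedged sketch of those three steps (the contract name is hypothetical; the repro script is the authoritative version):

CREATE MESSAGE TYPE [ExpenseReportSubmission] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [ExpenseReportContract]
    ([ExpenseReportSubmission] SENT BY INITIATOR);
ALTER SERVICE [AccountingSvc] (ADD CONTRACT [ExpenseReportContract]);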

Let's run a little test.  This time we are going to send some XML (Line 81) and ensure that our message made it to the target service.  Note that the previous two messages, prior to us adding the "constraints" are still available to be processed.  

Let's see what happens when we attempt to send some garbage now that we have proper "constraints":

What?  That actually did succeed!  Why?  Because we did not specify the contract clause of the BEGIN DIALOG statement.  Look around Line 97.  The ON CONTRACT clause is missing.  Note that we did include it on Line 78 in the previous screenshot above.  

So, you are probably asking yourself, even with a proper constraint at the target service we can still send garbage?  What's the point of contracts and message types?  With a database constraint the constraint is enforced regardless of whether you specifically ask it to be enforced (yes, I know you can disable a constraint and force it to go untrusted).  

The problem is easily explainable.  Although we bound a contract to our service we did not "unbind" the DEFAULT contract...so it is still valid.  We can query the SSB metadata to prove this:
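A query along these lines shows both contracts still bound to the service:

SELECT s.name AS service_name, c.name AS contract_name
FROM sys.services s
JOIN sys.service_contract_usages u ON u.service_id = s.service_id
JOIN sys.service_contracts c ON c.service_contract_id = u.service_contract_id
WHERE s.name = 'AccountingSvc';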
 
So, we need to specifically DROP the [DEFAULT] contract.  Let's do that and retest.  
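The drop is a one-liner:

ALTER SERVICE [AccountingSvc] (DROP CONTRACT [DEFAULT]);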
<Error>
  <Code>-8408</Code>
  <Description>Target service 'AccountingSvc' does not support contract 'DEFAULT'.</Description>
</Error>

 

So now the error is being thrown correctly and we can no longer use the [DEFAULT] contract.  

Here's one place where the "constraint" analogy for contracts and message types breaks down.  Note that the message did not ERROR for the sender; rather, it was simply put in the sender's queue as an error, as well as being noted as an error in sys.transmission_queue.  This is another case where SSB confuses noobs.  As long as your BEGIN DIALOG syntax is correct you can always send a message without the sender seeing an immediate error.  Instead, the error causes the message to be retained in the sender's queue and sys.transmission_queue.  
 
Summary
We covered a lot of ground today around Contracts and Message Types.  Both can be confusing until you remember that they are optional Service Broker objects.  Everything will work just fine without them, but will work better with them.  

 


You have just read "Service Broker Demystified - Contracts and Message Types" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

Service Broker Demystified - [DEFAULT] is not the DEFAULT

People tell me all the time that they don't want to use Service Broker because it is too confusing.  I started a blog series called Service Broker Demystified to dispel these misconceptions.  Today we are going to cover the [DEFAULT] objects, which are not really default objects...sometimes. 

Quick review...a contract binds a message type (defined as the name of a message and how it is validated) to who may send it during a dialog.  A service is defined as a set of tasks associated to a queue with a defined contract.  Looking at the syntax of each SSB object might make it easier to comprehend.  
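Paraphrasing (and abridging) the Books Online syntax:

CREATE MESSAGE TYPE message_type_name
    [ VALIDATION = { NONE | EMPTY | WELL_FORMED_XML
                   | VALID_XML WITH SCHEMA COLLECTION schema_collection_name } ]

CREATE CONTRACT contract_name
    ( { message_type_name | [DEFAULT] } SENT BY { INITIATOR | TARGET | ANY } [ ,...n ] )

CREATE SERVICE service_name
    ON QUEUE queue_name [ ( contract_name | [DEFAULT] [ ,...n ] ) ]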

Message types and contracts are always optional SSB objects...we'll cover why that is in the next post.  But you should also note in the syntax above that the word DEFAULT is placed in square brackets.  That is because DEFAULT is not a keyword in SSB; it is the actual name of objects that are, by default (...couldn't resist), shipped by Microsoft.  

I ran the following queries on a database without SSB enabled.  Here is the repro code if you want to try for yourself:
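The queries are simply lookups against the SSB catalog views:

SELECT name, validation_desc FROM sys.service_message_types WHERE name = 'DEFAULT';
SELECT name FROM sys.service_contracts WHERE name = 'DEFAULT';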

So, the syntax for CREATE MESSAGE TYPE and CONTRACT indicates that there are message types and contracts named DEFAULT...and we confirmed that by querying the SSB metadata.  

But what is the DEFAULT...what does it mean?

  • The DEFAULT message type, as you can see in the screenshot above, means there is no validation done.  By the way, this actually *is* the default behavior for a message type that does not specify the optional VALIDATION clause.  
  • The DEFAULT contract uses the DEFAULT message type (meaning there is no validation done).  However, in this usage the [DEFAULT] object is not the default and MUST BE SPECIFIED to actually be used.  If you don't believe me just look at the syntax again. Optional syntax is in square brackets and the message_type_name clause is NOT in square brackets.  

Let me repeat: the name DEFAULT is a misnomer...these objects are not actually the defaults.  This leads to confusion and errors for the uninitiated.  Let's create the simplest possible SSB solution to illustrate this fact...a simple employee expense submission.  Here's the code to follow along:  

Here we avoid all of this DEFAULT confusion by simply not using contracts and message types...they are optional anyway (see next post).  We create two services and queues...the EmployeeSvc will enqueue a message to the AccountingSvc.  DEFAULT is nowhere to be found in the code.  On Line 33 we start the code to send a message.  Note on Line 38 that we turn off encryption to avoid that headache (see the last post).  Note that the message has no payload (Line 40).  
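Condensed, the setup and send look roughly like this (a sketch; the downloadable script is the authoritative version):

CREATE QUEUE EmployeeQ;
CREATE QUEUE AccountingQ;
CREATE SERVICE EmployeeSvc ON QUEUE EmployeeQ;
CREATE SERVICE AccountingSvc ON QUEUE AccountingQ;  --note: no contract clause

DECLARE @h UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [EmployeeSvc]
    TO SERVICE 'AccountingSvc'
    WITH ENCRYPTION = OFF;  --avoid the encryption headache
SEND ON CONVERSATION @h;    --no payload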

It seems to me that this should work.  But it doesn't.  

The actual transmission error is "Service Broker received an error message on this conversation. Service Broker will not transmit the message; it will be held until the application ends the conversation."  The message is "stuck" in the sending Q (EmployeeQ).  The error embedded in the XML is "Target service 'AccountingSvc' does not support contract 'DEFAULT'.".  

So, there you have it...we did not specify the [DEFAULT] contract, therefore we cannot use the [DEFAULT] contract and it generates an error.  If you haven't fallen asleep yet you are probably thinking back to the syntax for the CREATE SERVICE command (screenshot to the right).  The "contract clause" is indeed listed as optional...and we were able to create the AccountingSvc without it...so what is going on?  

When you create a service and do not specify the "contract clause" then the resultant service is a "send only" service.  It cannot receive messages.  This is because a sending service (or any service really) may potentially send to many different services each with its own contracts and message types.  So we can't really restrict a sending service at all.  All validation occurs on the receiving service.  So when we created our two services we really created two sending services.  SQL Server assumes we knew what we were doing.  

So, the next question you probably have is how do we get our SSB design fixed and working as desired.  All we really need to do, minimally, is declare that the [DEFAULT] contract should be used for the AccountingSvc, thus making it a target with the [DEFAULT] validation (which is VALIDATION = NONE).  Let's see if that works.  You can download the code to try it yourself:    
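A sketch of the minimal fix (either form should work):

DROP SERVICE AccountingSvc;
CREATE SERVICE AccountingSvc ON QUEUE AccountingQ ([DEFAULT]);
--or, without dropping the service:
ALTER SERVICE AccountingSvc (ADD CONTRACT [DEFAULT]);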

It does work.  So merely adding ([DEFAULT]) was enough to allow the Svc to receive messages.  

Of course this isn't the most ideal setup since [DEFAULT] implies that there is NO VALIDATION.  This means we can send just about anything and AccountingSvc will accept it.  

The first message is the "no payload" message.  The second is some meaningless text.  

Summary

So the takeaway is that the [DEFAULT] contract is not really the default contract for a service.  You must still declare it.  If you don't, you've created a send-only service.  In the next post we'll cover why contracts and message types are optional, which builds on the concepts from this post.  

 


You have just read "Service Broker Demystified - [DEFAULT] is not the DEFAULT" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

Service Broker Demystified - SET ENABLE_BROKER vs NEW_BROKER

Do you think Service Broker is confusing?  If so, I started a blog series called Service Broker Demystified because SSB really isn't that difficult.  Today we are going to cover what happens when you restore a database that contains Service Broker objects.  When you run ALTER DATABASE...SET NEW_BROKER (in other words, when you enable SB for the first time), SQL Server creates a service_broker_guid for your database as well as a flag that indicates Service Broker is enabled.  You can also configure your SQL Server so that all new databases automatically have SSB enabled.  

In the screenshot you'll see that I have a number of databases that have SSB GUIDs, but SSB is not enabled on any of them.  
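The screenshot came from a query along these lines:

SELECT name, service_broker_guid, is_broker_enabled
FROM sys.databases
ORDER BY name;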

First, let's cover what the GUID is for.  There is no requirement that the service_broker_guid be unique on a given instance.  This is easy to prove.  I have a repro script that you can download and try for yourself.  Or you can just follow along with the screenshots.  

First I create a database called SBTestOriginal and enable SSB.  Then I verify that SSB is enabled and determine the GUID.  We take a backup and restore it on the same instance as a copy with a new name of SBTestNew.  

The process of restoring the database did NOT change the GUID...but it did disable SSB.  So, we'll need to fix it if we want to use our services and queues in that database.  

Let's assume your SSB infrastructure runs totally intra-database, which is the case for most people.  After restoring a db you probably just want to run ENABLE_BROKER.  But you can't...

The reason is that the "assigned GUID" for the copied db is the same as an existing GUID on the instance.  So, the correct process is to SET NEW_BROKER.  
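In other words (SBTestNew being the restored copy from this demo):

ALTER DATABASE SBTestNew SET ENABLE_BROKER;  --fails: another db on the instance already owns this GUID
ALTER DATABASE SBTestNew SET NEW_BROKER;     --works: resets SSB with a brand new GUID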

Note that now SSB is "reset" and we have a brand new GUID.  

But this raises another question...if I am forced to run SET NEW_BROKER in a restored db, what happens to "in-flight" messages in existing queues?  Are they destroyed or do they restart themselves automatically once SSB is again enabled?  Before we run a test to see the behavior let's think about what should happen.  When you ENABLE_BROKER then everything should just pick up where it left off.  But NEW_BROKER implies that everything should be cleaned up.  This makes sense from a safety perspective too.  If you restore your prod db to a dev system you want your in-flight messages to be destroyed so you don't accidentally continue processing them on a dev instance.  Let's see if I'm right:  

We are going to take the existing example and merely create a test service and queue and then enqueue a message before we take the database backup and restore it. 

Note that our message was successfully enqueued and is waiting for processing.  Now we backup and restore the db as a COPY to the same instance and we SET NEW_BROKER.  And then we run some code to see if our message is still awaiting processing in the new db:  

The contents of the queue are destroyed when we SET NEW_BROKER, as expected.  Nothing is in the queue and nothing is "stuck" in sys.transmission_queue.  Note also that our queue is ready for processing.  The message is gone.  It isn't logged anywhere; it is just destroyed.  

In general I think this behavior is the desired behavior in most use cases.  But this behavior is important to understand.  Depending on your messaging infrastructure and design you may need to apply some "compensating" logic on your queues before you simply SET NEW_BROKER.  For instance, it is common to design workflows in SSB that entail many messages going back-and-forth between services.  In those cases you may have transactions in a goofy state when you re-enable SSB.  You need to think about this and work with your DBAs to determine what the correct process should be when you restore databases with SSB enabled.  

This doesn't necessarily mean manual intervention by a DBA.  I'm a strong advocate of automation and I've always used a "SSB enable" procedure that I tell my DBAs to run WHENEVER they restore a db.  A script like this should be responsible for:

  1. If needed, handling compensating logic BEFORE SET NEW_BROKER is run.  Once it is run, any old messages are gone forever.  
  2. Smartly determining whether ENABLE_BROKER can be run, or whether NEW_BROKER is required.  
  3. Ensuring the db is in SINGLE_USER mode (or equivalent) when these commands are run.  
  4. Optionally setting the TRUSTWORTHY flag and dealing with encrypted database master keys, depending on your requirements.  

You can download the script I use to enable Service Broker on a restored db.  It doesn't handle every scenario above, but it gives you a framework to plug in values specific for your needs.  
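A minimal sketch of the decision in step 2 (not the full downloadable script; the db name is hypothetical):

DECLARE @db sysname = N'SBTestNew';  --hypothetical restored db name
DECLARE @sql nvarchar(max);
IF EXISTS (SELECT 1
           FROM sys.databases d
           JOIN sys.databases other
             ON other.service_broker_guid = d.service_broker_guid
            AND other.name <> d.name
            AND other.is_broker_enabled = 1
           WHERE d.name = @db)
    SET @sql = N'ALTER DATABASE ' + QUOTENAME(@db) + N' SET NEW_BROKER WITH ROLLBACK IMMEDIATE;';
ELSE
    SET @sql = N'ALTER DATABASE ' + QUOTENAME(@db) + N' SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;';
EXEC (@sql);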

Whether you are a DBA or a SSB designer you need to be aware of what happens when you restore your database.  It can be a bit tricky and you don't want to lose important message data.  


You have just read "Service Broker Demystified - SET ENABLE_BROKER vs NEW_BROKER" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

Service Broker Demystified - Encryption Weirdness

Do you find Service Broker confusing?  I started a blog series called Service Broker Demystified because SB really isn't that tough if you don't get lost in the weeds.  Today we are going to cover the first thing that trips up noobs...issues with encryption.  

Service Broker requires communications to be encrypted by default.  This makes perfect sense to even the newest noob.  When transmitting data over the wire it should always be encrypted.  Of course if I'm communicating with other services I should encrypt those communications.  However, most noobs model their first SB designs as simple intra-database/intra-instance asynchronous workflows.  For example, most people first learn of SB when they want to model an asynchronous trigger to do some near real-time ETL processing.  In this case no data travels over the wire, so why should encryption be needed?

Before answering that question, let's see the problem in action.  You can download the repro script to play along at home:

We first create a database and SET NEW_BROKER.  We then build two services and two queues; one will be the Initiator and the other the Target.

If you are experienced with SB you can probably just look at the code and, knowing the setup, realize that it will generate an error.  But noobs will look at this and think it is valid code...and it is, but it will still error.  We can query the queue and see the message failed and then look in sys.transmission_queue and see the error:

"The session keys for this conversation could not be created or accessed. The database master key is required for this operation."

So, in the case of a simple intra-database/intra-instance SB design, you still need to setup encryption.  

The fix is to run code similar to the following:  


IF EXISTS (SELECT *
           FROM sys.databases
           WHERE name = DB_NAME()
             AND is_master_key_encrypted_by_server = 0)
BEGIN
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '8uY75%jj5k5#%#';
END

This creates an encrypted database master key.  If you run the demo again you should see the message properly enqueued.  

A Workaround

There is a workaround that's a bit easier.  You can download the repro script here.  

All I did was add WITH ENCRYPTION = OFF.  
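The relevant bit of the BEGIN DIALOG, sketched with hypothetical service names:

DECLARE @h UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [InitiatorSvc]  --hypothetical names
    TO SERVICE 'TargetSvc'
    WITH ENCRYPTION = OFF;       --the workaround
SEND ON CONVERSATION @h;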

There may be other workarounds but this is the workaround I use when I don't have the permissions available to create encrypted database master keys and I know my SB design will ALWAYS be intra-db/intra-instance.  

In the case where no data travels over the wire, why is encryption required?

This starts to get REALLY confusing.  Here are some rules:  

  • The default ENCRYPTION option is ON, which I think is wise...better to be safe than sorry.  The side-effect is you have to deal with the encrypted database master key errors.  
  • If the two services are intra-db then communications are NEVER encrypted, regardless of the default setting.  Why?  Because the "communication" occurs as an IPC (in-memory) and no database traffic traverses the wire.  So why is encryption needed in this scenario?...see the previous bullet point...that's the default!!!
  • If ENCRYPTION is OFF but you are communicating with a remote service and the proper certificates are configured, encryption will be used anyway.  So ENCRYPTION = OFF simply avoids the error message when the proper certificates are not available.  Very wise.  

Hopefully encryption is a little less baffling now.  


You have just read "Service Broker Demystified - Encryption Weirdness" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

Service Broker Demystified Series

Hypothetical questions:  

  • What are the three most important features added to SQL Server in the past 15 years (since SQL 7)?  
  • What will be the most important feature of SQL Server in the coming years?  

My answers:

  1. DTS/SSIS (especially the wizards): the ability to extract and load data without writing any "data marshalling" code allowed companies to migrate their old Access applications to SQL Server quite easily. For years I had tons of side work migrating applications from Access or FileMaker to SQL Server. The tool of choice was DTS/SSIS.
  2. The DMVs: I wrote about this recently. The ability to see what the server subsystem is waiting on is the key to understanding the performance characteristics of your application.
  3. Service Broker: The ability to perform asynchronous data eventing is going to be HUGE in the coming years. Also, the ability to let your app do asynchronous data work without the user waiting on an hourglass, and without additional asynchronous client-side coding, is the key to the "perception" of good application performance. Further, I'm convinced that Service Broker is going to be the tool to bridge the divide between SQL Server and NoSQL solutions. Some of the biggest architectural decisions with NoSQL solutions involve sharding data and the asynchronous nature of mappers and reducers. Service Broker (SB) will likely be the underlying technology that allows Microsoft to finally give us true sharding in SQL Server. Also, how better to implement a mapper than as an asynchronous SB message to another SQL instance?
Everyone's favorite features list will be different.  Perhaps you like the HADR features of the last few releases or even the in-memory table offerings.  To each his own.  We all have our own unique experiences.  This is my list.
 
I feel that not only has Service Broker (SB) been revolutionary in the past 9 years or so, but it is obviously only going to get more important in the future.  Anecdotally, SB seems to scare people.  Perhaps the syntax is too esoteric and not very "TSQL-like".  Or the lack of a GUI scares people away.  Or maybe people feel asynchronous eventing is best left to JMS/MSMQ or another tier and not in the db (if so I suggest you google "SODA", which is SOA with Data in the middle).   I've found two issues that scare people the most: there is a lot of new jargon that is unfamiliar to data people (contracts, messages, services), and people don't understand the little idiosyncrasies of SB.  Once you start to understand those idiosyncrasies you'll find SB is actually quite intuitive.  
 
This post is the introductory post for a series I'm going to do called Service Broker Demystified.  This isn't a SB tutorial.  You won't learn the basics here.  I assume you know the basics already and just want to understand better some of the esoteric stuff that trips people up with SB.  Here are the topics we'll cover (these links may be bad until I get the posts finalized, check back often):  
 
Service Broker Demystified - Encryption Weirdness covers the first problem most noobs run into...they set up two services and try to communicate and it fails with an encryption error.  Usually the noob will google the problem, fix it, and never really understand why the encryption error is occurring.  In this post I'll try to make it clearer.  
 

Service Broker Demystified - SET ENABLE_BROKER vs NEW_BROKER covers what happens when you restore a database.  Most people have run into this problem but if you haven't, this post is for you.  Regardless, I'll explain the problem, why it occurs, and how I auto-magically fix it.  

Service Broker Demystified - [DEFAULT] is not the DEFAULT covers some goofy behavior when you do and do not specify the [DEFAULT] contract for a service.  

Service Broker Demystified - Contracts and Message Types: people get confused as to what the difference is between a contract and message type. Then they don't understand why both are needed. The fact is, like foreign keys, neither is REQUIRED, but both will help you find bugs and ensure accurate workflows if you use them. Just like FKs help to ensure accurate data.  
 
Service Broker Demystified - Why is there no ALTER CONTRACT statement? There isn't, get over it. DROP and CREATE if you must change a contract. We'll cover WHY it works this way.  This is a good thing.  

Service Broker Demystified - Why do we need Services and Queues? Why do we need both?  There always seems to be a one-to-one relationship with services and queues. So, why did Microsoft give us both?  To just confuse us?  No, there are cases, albeit rare, when it is helpful to have something other than a 1:1 relationship.  

Service Broker Demystified - Services: How can you determine if a service is an initiator or target? How does the DEFAULT contract change all of that?  Why should we even care if a service is an initiator or target?

Service Broker Demystified - [Initiators] and 'Targets': Why are initiators always placed in identifier brackets and Targets are specified as quoted string literals?  Once you understand this you'll start to see the bigger possibilities of Service Broker.  

Service Broker Demystified - Case-Sensitivity: Some SB objects are case-sensitive. Which ones?  Shouldn't they follow the case sensitivity of my server or database?  No.  Only "exposed" SB objects are case-sensitive.  This post will cover why.  

Service Broker Demystified - CLOSED conversations: yeah, they seem to hang around for awhile. Why is that?  

Service Broker Demystified - Must my SQL Server be wired to the internet? I've heard this question at least 3 times. People think their SQL Servers must have internet connectivity in order to use Message Types. Huh? It's the goofy SB syntax that is confusing them.  No internet access required.  This post will cover this confusing topic.  

Service Broker Demystified - Can I model monologs? Yes you can! Most people actually want to model "fire-and-forget" monologs.  But this is seriously frowned upon.  In this post I'll show you how I do it, safely.  

Service Broker Demystified - How to Radically Simplify SB: Service Broker can be confusing. I've found that most implementations of SB I've worked on followed a handful of design patterns. If you create a "harness" that handles YOUR unique design patterns, you can simplify your life greatly.  I'll show you my harness that has saved me countless hours and headaches.  

Upgrading Vertica

In my blog posts Installing Vertica 7 and Vertica Installation Prereqs I covered installing Vertica 7.0.1 Community Edition.  Since then Vertica 7.0.2 has been released as well as 7.1 EE (Dragline) which has some compelling features like Live Aggregate Projections which are great for "pre-materializing" queries that rely on aggregations (SUM, COUNT, MIN/MAX) that can cause excessive IO under Vertica.  Unfortunately there is no CE edition of 7.1 yet.  Regardless, you can still upgrade to 7.0.2.  
 
I have been tasked with upgrading our 7.0.1 installations to 7.0.2 so I thought I would document the process.  I'm going to give the abridged version of the upgrade steps; please refer to my previous posts here and here to fill in the gaps if anything is unclear.  If you are not sure what version of Vertica you are running you can run select version().  
 
Prerequisites
  1. Download Vertica 7.0.2.1 (which is the latest CE as of this writing) from my.vertica.com.  You will need to do this from your laptop.  wget and curl likely won't work.  
  2. Copy the .deb file (assuming Ubuntu) to your sudoer's (root or equivalent) home directory.  Here's an example command using pscp: pscp <location of .deb file> <user>@<ip>:<path on ubuntu>
  3. Take a backup.  
  4. Plan for 2-4x as much catalog space as you currently utilize during an upgrade.  du -h /home/dbadmin/<db>/<node>_catalog/Catalog.  
  5. Find the options you used when you last installed or upgraded Vertica.  nano /opt/vertica/config/admintools.conf.  Search for "install_opts" which will detail the options you used previously.   This is an important step...if you accidentally choose a different set of options when you upgrade then you may find that Vertica stops working due to mismatched components.  Note the install_opts for later.  You do not need to note the -s or --hosts parameters (more on why later).  

Upgrade Process

  1. As dbadmin, stop the database.  One way is the following:  /opt/vertica/bin/admintools --tool stop_db -d VMart -p <password> 
  2. As a sudoer, from any host in the cluster run the new .deb file (Vertica will take care of ensuring it is copied and installed to every other host for you):  sudo dpkg -i /home/dbadmin/<file>.deb 
  3. As a sudoer, run update_vertica with the install_opts noted earlier, substituting the new .deb name for the old one.  You do not need to pass the --hosts or -s parameters...those are automatically determined by the installer based on the current hosts in the cluster, which may differ from the hosts that existed when you last installed/upgraded Vertica.  Here's an example command:  sudo /opt/vertica/sbin/update_vertica --hosts <IPs> --deb /home/<deb file> -L CE -accept-eula
  4. As dbadmin, start your database /opt/vertica/bin/admintools --tool start_db -d VMart -p <password>

Similar to SQL Server, a database is upgraded when it is first started in Vertica...so expect the initial startup to take a bit longer than normal.  Remember, this needs to upgrade the data files and catalog on ALL nodes in your cluster.  At this point you may want to verify that you actually DID upgrade your database...note that my db is now reporting 7.0.2-1.  

Now would be a good time to take another backup.  

Additional Notes

If you encounter errors that you feel you can safely skip then you can always use the --failure-threshold option.  I recently wrote a blog post on this.  

Lastly, you may have to upgrade supporting packages as well, such as MC.  Check your documentation.  

 


You have just read "Upgrading Vertica" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  


Workaround: Vertica Error: This version of Debian or Ubuntu is unsupported

I googled around for hours and couldn't find an answer to this problem, so hopefully this post will help someone else.  

If you use Ubuntu and Vertica then you probably know that the only supported release of Ubuntu is 12.04 LTS.  I have no idea why 14 isn't supported or when it will be supported.  This is problematic if your company has standardized on 14, if you have Puppet scripts geared toward 14, or if you just have an affinity for 14.  If you try to install/upgrade Vertica on a 14 box you'll see the following error: 

Error:  failed to get system information for all hosts
This version of Debian or Ubuntu is unsupported.  
Hint: additional failures may be hidden.
Installation FAILED with errors.

I've yet to find a good way to get around this on google.  Here are the two methods I've used.  Hint: use the second method...it's safer.  Do not use either method on a production Vertica installation.  That wouldn't be very wise.  

Use --failure-threshold NONE

If you use this switch then install_vertica (or upgrade_vertica...which calls install_vertica under-the-covers) will not fail for ANY errors.  You'll get a nice list of failures that you can then go and fix manually if desired:  

If you use --failure-threshold NONE then don't be surprised if Vertica won't start.  (Take your backups first if upgrading).  

Change SystemProfileFactory.py to ignore the Ubuntu OS check

This is much safer, but still...don't do it on your prod cluster.  Do all of this as root or equivalent.  

  1. Determine your OS version as reported by Ubuntu.  nano /etc/debian_version .  You'll see something similar to the screenshot.  Note the value returned.  It should be "jessie" for 14.  
  2. Modify the python script.  sudo nano /opt/vertica/oss/python/lib/python2.7/site-packages/vertica/network/SystemProfileFactory.py 
  3. Ctl+W (for WhereIs), then "wheezy" (wheezy is the Debian codename that Ubuntu 12.04 LTS reports in /etc/debian_version).  
  4. Replace "wheezy" with "jessie".  See the screenshot.  
  5. Ctl+O (for Writeout)
  6. Ctl+X (to exit)
  7. Rerun update_vertica or install_vertica as required.  

Again, these tips should not be used on prod.  I like all of my Ubuntu boxes to be on 14, which makes my Puppet scripts easier to control.  I've found no issues using Vertica with Ubuntu 14 (in testing envs).  I've also found no issues using LVM or ext3, which likewise cause the profile checks to fail.  See above.  

YMMV.


You have just read "Workaround:  Vertica Error:  This version of Debian or Ubuntu is unsupported" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  


Basic Vertica DBA Tasks

In this post in my Vertica series I'm going to cover some basic DBA tasks that you need to handle.  This is not a comprehensive list, just the absolutely critical items.  I've noticed that in many shops running "newer" database managers (especially NoSQL variants) that the developers are also responsible for administration tasks.  There just aren't that many full-time DBAs for these newer products.  The mindset of many NoSQL vendors is that you don't need a DBA...the system is self-healing and auto-tunable.  Vertica doesn't spew that mantra (that I've seen anyway)...however, there is no good reading material geared solely for the administration of Vertica either.  

Basic DBA tasks for any DBMS

These are the tasks that I feel are critical for any DBA, regardless of database platform:  

  1. Installing and upgrading the DBMS.  I already covered installing in Installing Vertica 7.  We'll cover upgrading Vertica and the underlying OS in a future post. 
  2. Backup/Restore.  I handled this in my post Vertica Backups.  
  3. Recovery.  This is slightly different than backup/restore in my mind.  When a DBMS comes up after a crash it will likely need to perform some sort of "recovery".  Every DBMS handles this differently.  Some are completely automated, some require manual intervention.  Vertica handles this automatically.  If you've handled k-safety properly then whenever a node comes back online (for whatever reason) Vertica will automatically recover it by copying data from other nodes.  This really eliminates the need for restoring data due to hardware or software failure...you only really need to restore a db when you need to create a new, copied environment or when you've accidentally committed bad data.  
  4. Reviewing the Logs.  Any good DBA should be reviewing the available DBMS logs periodically.  You can see partial logs in the Message Center in MC.  However, if you read the logs directly you can begin to think up methods to automate log management.  The two most important logs are vertica.log and dbLog.  To view these in Linux you use the tail command.  tail -f <catalog_path>/vertica.log and tail -f <cdb_path>/dbLog.  vertica.log will log almost any activity that the database manager performs.  dbLog will list just those messages specific to the database.  The former is almost unusably verbose (automation is your friend)...so the latter is what you really should be checking frequently.  
  5. Authentication and Authorization.  Vertica follows the DCL syntax of the ANSI standards and most DBMS vendors.  You can always use MC to handle your authentication and authorization needs.  
  6. Auditing.  Auditing is very specific to every organization.  Some have strict requirements, others don't care.  Good DBAs, even if auditing is not a functional requirement, audit various events if for no other reason than trending.  Vertica has rich auditing facilities...I simply know very little about them.  Sorry.  
  7. Performance Management.  I'll cover this in the next blog post Basic Vertica Performance Management Tasks.  
  8. Resolving connectivity issues.  Vertica is similar to SQL Server in this regard.  You must know the servername (clustername), database name, port, and user/password (if not using OS authentication).  Vertica clients connect to a built-in load balancer via a VIP (virtual IP) that is shared across Vertica cluster nodes.  This ensures that if a node fails clients can still work.  This is similar to a SQL Server cluster.  The IP Virtual Server (IPVS) is the load balancer in Vertica and it even allows adding and removing nodes on a running cluster without interrupting users.  This type of load balancer is akin to "round robin" which means that any single node can be handling inordinately more activity just through dumb luck.  Vertica also has "native connection load balancing" that is disabled by default but can actually ensure that CPU and memory overhead is honored across nodes too.  The caveat is that you have to use Vertica's client libraries and it takes a bit more work to set up.  
  9. License Management.  HP licenses Vertica by the size of the raw (uncompressed) data in the tables.  This doesn't include projection data, temp tables, the WOS (we'll cover that in a future post), or data not yet purged by TM (covered in the future too).  If your db's size nears the licensed limit you'll see warnings in admintools and vsql.  You need to then either upgrade your license or purge some data.  The beauty is that Vertica never just shuts down.  The most visible problem is that your queries will return a status code of SUCCESS_WITH_INFO instead of just SUCCESS.  If you code for that scenario then you can save yourself some grief.  You can run SELECT get_compliance_status(); to determine your licensing compliance.  

There are certainly other DBA tasks that could be added to this list, but this list comprises the absolute basics to be an entry-level Vertica DBA.  


You have just read "Basic Vertica DBA Tasks" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  


Vertica Concurrency and Isolation Semantics

This is a brief blog post in my Vertica series covering how Vertica handles concurrency and isolation.  

Vertica supports the four standard SQL isolation levels (READ COMMITTED, READ UNCOMMITTED, REPEATABLE READ, and SERIALIZABLE).  However, it really only honors READ COMMITTED and SERIALIZABLE.  READ UNCOMMITTED is mapped to READ COMMITTED and REPEATABLE READ is up-sized to SERIALIZABLE.  So, if your app is, for some reason, sensitive to isolation levels, beware. This shouldn't affect the query workloads that Vertica targets, which is read-intensive data warehouse workloads (with a high amount of concurrent data loading). In a future post I'll cover how Vertica is architected such that it can handle massive data loads without affecting online query concurrency.  

Vertica supports SQL92-style implicit transactions.  This means that you don't really need a BEGIN TRAN clause, although you can certainly do that.  It supports SAVEPOINTs and ROLLBACKs in a manner that is semantically identical to SQL Server.
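For example (a sketch against a hypothetical table):

BEGIN TRANSACTION;
INSERT INTO public.t1 VALUES (1);
SAVEPOINT before_risky_insert;
INSERT INTO public.t1 VALUES (2);
ROLLBACK TO SAVEPOINT before_risky_insert;  --undoes only the second INSERT
COMMIT;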

Snapshot Isolation
Vertica has the ability to run queries in snapshot isolation mode similar to SQL Server. It is still READ COMMITTED under the covers. The trick is to preface your queries with AT EPOCH LATEST (or AT a given time). When you do this you instruct Vertica to fetch the committed data as of the time the query started. Essentially you are querying a snapshot, although you don't need to create or declare the snapshot like you do in SQL Server. These queries should have the fastest possible execution: you hold no locks and won't block write ops.  EPOCHs are advanced when new data is committed.  

Queries written with AT EPOCH clauses are called "historical queries".  
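For example (t1 is a hypothetical table):

AT EPOCH LATEST SELECT * FROM public.t1;                --latest committed snapshot
AT TIME '2014-10-01 09:00:00' SELECT * FROM public.t1;  --snapshot as of a given time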

More on Epochs

Let's take a deeper dive into epochs.  

Vertica's SQL dialect doesn't have a name (such as T-SQL or PL/SQL) that I am aware of.  Vertica's SQL is based on Postgres but it is a very limited subset of functionality.  It's syntactically similar to T-SQL and follows most of the ANSI SQL standards.  It has CTEs (although not recursive ones), derived tables in FROM clauses, subqueries (correlated and non-correlated), etc.  You terminate all SQL statements with a semicolon.  

Vertica's SQL does not support variables.  The vsql utility supports host-like environment variables but that is useless if using DBeaver or embedding SQL in Java.  Instead you have to use a native programming language and handle your variablization there.  Because there are no variables I have this script mostly hardcoded for my test environment.  

select current_epoch from system; will show you the equivalent of the LSN for your system.  Whenever the next commit occurs it will, in this case, be 957.  This is a system-wide, not connection-specific, value.  

The epochs table will show you the current epochs that you can query against.  The oldest listed, in this case 956, is the oldest that can be queried and is called the Ancient History Mark (AHM).  

Note that creating a table does not advance the current_epoch.  However, when we INSERT a row into a table, and commit it, we do advance the current_epoch.  

The epochs table now shows that we are able to query two different epochs.  The AHM has not advanced.  
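The walkthrough so far, condensed (the epoch numbers match this demo and yours will differ; t1 is a hypothetical table):

SELECT current_epoch FROM system;  --957 in this demo
SELECT * FROM epochs;              --queryable epochs; the oldest is the AHM
CREATE TABLE public.t1 (c1 int);   --DDL alone does not advance current_epoch
INSERT INTO public.t1 VALUES (1);
COMMIT;                            --the commit advances current_epoch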

Just a quick note: similar to the bash shell or the autocomplete feature in SSMS, if you type the first few letters of a command or object and then hit Tab, vsql will either autocomplete the command or give you a list of suggestions to continue typing if there are multiple possibilities.  In this case I typed "pub" and then Tab, and vsql showed me all contextual values available for autocompletion...in this case objects in the public schema.  

After entering another row my current_epoch has again advanced and I can now query 3 different epochs.  

A basic SELECT against our table will show two rows.  

Now I can use the AT EPOCH statement, prepended to the SELECT to see the data at various snapshots in the past.  

If I choose EPOCH 955 I generate an error since that EPOCH is no longer available.  

EPOCH 956 shows me no rows.

EPOCH 957 shows me one row.

EPOCH 958 shows me two rows.  

If I try to query the current_epoch (959) I also generate an error.  

I forgot to grab a screenshot, but AT EPOCH LATEST in this example would be functionally equivalent to EPOCH 958.  AT EPOCH LATEST is assumed when the AT EPOCH clause is omitted.  It is functionally equivalent to saying, "show me all committed data as of the start of this query".  
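Condensed, the historical queries above look like this (epoch numbers from this demo):

AT EPOCH 955 SELECT * FROM public.t1;     --error: older than the AHM
AT EPOCH 956 SELECT * FROM public.t1;     --no rows
AT EPOCH 957 SELECT * FROM public.t1;     --one row
AT EPOCH 958 SELECT * FROM public.t1;     --two rows
AT EPOCH LATEST SELECT * FROM public.t1;  --same as 958 here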

To reiterate, I can now query 3 epochs.  If you try this on your own you have to run your queries quickly...

...because if you wait about a minute you'll notice that your list of available epochs is automatically reduced.  I show the current_epoch to prove that this is a quiet system: I have done no backups, delete statements, or anything else to advance it.  

Epochs, by default, will be available for about a minute until the Tuple Mover process comes along and cleans up older epochs.  This is totally configurable and we'll cover it more in a future post.  

Epochs are what give Vertica its READ COMMITTED SNAPSHOT-like behavior.  You are always querying against the latest epoch if you don't specify one.  Any currently executing INSERTs, UPDATEs, or DELETEs will be in the current_epoch (and there may be multiple if you are running concurrent data modifications) and are not available for querying outside of the current session scope.  

If you think about it, epochs are functionally equivalent to READ COMMITTED SNAPSHOT; it's simply a different implementation.  This is one reason why I enjoy Vertica...it helps me understand other database managers and their implementations of things such as concurrency and isolation management.  


You have just read "Vertica Concurrency and Isolation Semantics" on davewentzel.com. If you found this useful please feel free to subscribe to the RSS feed.  

