Tuesday, October 20, 2009

Problems with Parameters and the Command Object in ODBC and OLE DB

In testing a web service that retrieves a set of records a member of my project team came across an odd behavior.  The function that selected a list of records was slowing down when selecting a high volume of records 800-2000.  At the time we were using the ODBC Command object and placing parameters in to narrow the selection of records to the set that was to be returned.  By instrumenting the code with timestamps he was able to determine that the unexpected slowdown was in the retrieval of records.  The odd part about the slow down was that it was occurring somewhere around 300 records in when he was selecting 800 records. Somewhere around the 300th record there would just be about a 90 second pause in the retrieval. This was an odd boundary.  My first inclination was that there was something bad with a specific record that caused this.  However, on running multiple iterations it was not always the same record that was causing the slow down.  What was being selected was also only two fields.  One field was an int and the other was a string no larger than 128 characters. 

Having this information made the problem even stranger.  I suggested that he change the ODBC objects over to be the newer OLE DB objects figuring it might be a bug with the ODBC Command or with something with how ODBC Command was retrieving records.  The same behavior was demonstrated using OLE DB.  This made the problem even more puzzling.  My next thought was that there was some sort of locking taking place somewhere.  This seemed unlikely given the simplicity of the database accesses and the lack of long running transactions.  The repeatability also make this unlikely as well.  Again, my team member came back to me with a negative report. 

At this point I had him test it with retrieving more records.  The results of this were odd as well.  When he tested it with 1500 records the problem would happen at about 1000.  When he used 2000 records the pause happened at about 1500.  He couldn't test it with many more than 2000 records because that is the maximum number of parameters that can be put on a command object.  This was a very odd pattern to be exhibiting.  Why approximately 500 records from the end of the result set was a pause occurring.

It was at this time that the maximum of approximately 2000 records retrieved was realized because of the parameter limit.  The reason that this issue was first discovered was that we were looking to be able to run the method in the Web Service to retrieve 5000 records efficiently.  Obviously we could not have a 2000 record limit or we'd have to have multiple queries.  We however were not really using parameters for their intended purposes really.  One of the fatal flaws with all database connection objects I've seen in Java and in C# are that there is no way to easily push a list into an IN clause.  Instead you wind up writing value = 1 OR value = 2 etc.  What we were doing was just adding  where the values were and modifying the query string and then adding a parameter for each '?'.  This strikes me as bad to begin with.  If you are already modifying the query string you are not getting any benefits from having the compiled version cached.  Instead of entering the  '?' we should have just been using a StringBuilder and creating the query on the fly.  We really weren't using parameters the right way anyway. 

After some discussion we decided that making this change made sense in the code.  It was implemented building the condition dynamically and tested again.  Magically the 90 seconds of delay disappeared.  That meant we killed two problems at once.  Our delay that caused the Web Service to time out was gone and we were no longer capped at about 2000 records.  What did this mean about our problem though.  It appears that there is a bug with both ODBC and OLE that if you try to have more than 500 parameters or so, they will freeze for an undetermined amount of time, for no good reason.  You must therefore heavily consider what you are doing if you have 500 or more parameters.  Analyze your code and you probably will find that there is some other change that can be made to eliminate that many parameters and you should no longer have that problem.

No comments:

Post a Comment