Thursday, February 16, 2012

being able to stop parallelism in a C++ Program calling SQL Server

Hi All,

I have written ETL software that runs on SQL Server. We are running it for the first time on a 4cpu (2 x dual core) machine on sql server 2005.

One of the things this software does is perform a 'select * from tablename' to validate that the tables passed to it as parameters exist. This has worked fine on previous releases and on single cpu machines because what the optimiser decides to do is to return just the first page of data and then fetch more. I guess it even works in 2005 standard edition.

However, 2005 enterprise edition allows parallelism. And what the optimiser is deciding to do with such a query is to parallelise it and fetch all rows and then give the result back to the program. So, instead of seeing a fraction of a second to return the first page of data we are seeing up to 90 seconds and the database goes and fetches 15M rows in parallel.

Obviously, what we would like to do is to somehow tell the optimiser that this set of programs should not perform any parallel queries. Or, we would like to turn parallelism off on the specific tables we are dealing with for the period of running these ETL programs....they have no need of parallel processing at the database level for virtually all the calls that are performed.

Would someone please be so kind as to advise us if we can do something like pass a parameter to ODBC to stop parallelism or if we can issue commands against specific tables to stop parallelism for a period and then turn it back on?

Thanks in Advance.

Peter Nolan

www.peternolan.com

select * from tablename OPTION (MAXDOP 1)

That should do it. MAXDOP stands for Max Degree of Parallelism, and allows you to limit the number of parallel execution threads for a query. Setting it to 1 essentially disables parallelism for that query.
|||

Hi David,

thanks very much for that....I did not know the syntax of how to do this at query level....

We turned off parallelism at the server level and it turns out the optimiser will still perform a table scan....we have many logical tables in on physical table and even when we query the underlying table with a select * from with a where clause it is scanning the table and taking about 40 seconds to do so....even when we say 'top 1' to make it get the first row and put an index on the field that logically partitions the table.

It looks like we will have to introduce partitioning so that the optimiser can know to scan just the partition in question. We also had a feature we planned to introduce which was to turn off table validation....if an etl subsystem is in production there is little need to validate tables....we never put it in...and now there seems like a need to do so.

Again, thanks for the tip. Much appreciated.

Peter

www.peternolan.com

|||

Hi All,

one of our guys here came up with a better idea....

What the tool is doing is validating the existence of the taget table/views passed to the program....so in the case of a typo a specific message saying the table/view not found is issued.

However the optimiser is thinking the program really wants all the rows.

One of the guys here suggested if the constraint where 1 = 0 is added to the select * then all optimisers will be smart enough to know no rows will be returned yet the database will still check that the table/view exists.

So we will also implement this test and that will make sure all programs test for existence of a table quickly.

If we feel industrious one day we can actually ask ODBC if the table exists.

One reason why I like the select * from tablename as a test for existence of a table over just asking ODBC is that it forces the preparation of the plan and if the view has been invalidated then it will also be caught at this stage of processing where ODBC will return a positive answer to does the view exist...

Again, thanks for your responses...it is much appreciated...

Just by the way....we installed the new performance dashboards just released and it was these that managed to show us the statement that was in a wait state and how we detected the problem...so if you have not installed the new performance dashboards yet....you might want to do so...

Best Regards

Peter

No comments:

Post a Comment