I just had a discussion with my boss, we are running a query that fetches
about 5000 objects from the database using MSSQL and DotNet 2.0.
My boss thinks that the recommendation of keeping the connections open for
as short a time as possible means that one shoule open a connection, fetch
one row, close the connection, open another connection, fetch one row and
close the connection and so on.
I belive that the extra overhead of opening and closing connections all the
time, of generating execution plans for each query, and running the queries
many times gives his solution a large performance hit. My recommendation
will be to fetch all the rows needed in one large SQL query.
I've done several other projects where I have proven this to be true, there
is indeed a large performance hit from generating many small queries instead
of one large. But my boss just says "no" and disagrees.
Can you give me some good arguments and/or point me to some best practice
documents that describe this so I can convince my boss he's wrong?
Kind Regards,
Allan Ebdrup
FUT: microsoft.public.sqlserver.programming"Allan Ebdrup" <ebdrup@.noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@.TK2MSFTNGP02.phx.gbl...
>I just had a discussion with my boss, we are running a query that fetches
>about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
> the time, of generating execution plans for each query, and running the
> queries many times gives his solution a large performance hit. My
> recommendation will be to fetch all the rows needed in one large SQL
> query.
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small queries
> instead of one large. But my boss just says "no" and disagrees.
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
Don't argue with your boss. Implement it both ways and show him.
David|||"Allan Ebdrup" <ebdrup@.noemail.noemail> wrote in
news:#kIhprXWGHA.3332@.TK2MSFTNGP02.phx.gbl:
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small
> queries instead of one large. But my boss just says "no" and disagrees
I would have submitted one large bulk query as well.
Perhaps multiple small queries would reduce the time a table is locked?
Any other solutions? Maybe open a connection - execute a long running SP...
and fetch the results after the fact?|||> Don't argue with your boss. Implement it both ways and show him.
Hi David
Unfortunately it's him who's doing the implementing.|||My post just now seems to have got lost (wonder where it went).
To summarise what I said:
your boss is making a massive mistake. I've seen this tactic destroy a
large application.
it's fine at first, your queries all run nearly instantaneously, and
you'll load test it later, no need to worry about that right now.
then a table gets big. suddenly the instant query takes 50 milliseconds
to get the data. not a problem if you're only doing one select. if
you're doing 5000 separate ones it takes 5000x50 milliseconds.|||IF your query is executing 5000 times against the same tables and retrieving
the same columns with selection criteria against the same columns, then
running a single query to retreive all 5000 rows would be faster, without
question.
i.e.
select employee_id, employee_name from employees
Will have a connection open for much less time than cycling through each
employee with:
select employee_id, employee_name from employee where employee_id =
'SOMEID'
The first one will also use less CPU on the database, less IO, less network
traffic, etc. You can play with the code to control how you open the
dataset once you have retrieved it, but the first SQL will almost always be
faster. The only exception I can think of is if your client does not have
enough memory to handle the entire dataset.
If you have more complicated logic, and you are running against all
different tables, and combining this into one query means extensive joins
and business logic in the where clause, then you need to test each approach
and see what the performance gain is with the view.
Opening and closing connections for each communication wiht the database is
usually most efficient in my experience. Connection pooling manages the
resources really well. The connections do not really get opened and closed,
rather they stay open and available, but the client is able to free up some
of the resources associated with them between each call. This assumes that
you actually need to execute multiple queries to begin with.
Making 5000 seperate calls to the database will use much more network
resources than making a single call to the database to retreive the same
data. That part is a no brainer. Dependign on what you have for SQL, it
will not necessarily need to generate a plan every time (one would hope you
are using stored procedures with parameters and the plans are cached).
However, there would be some small overhead associated with running each
command.
"Allan Ebdrup" <ebdrup@.noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@.TK2MSFTNGP02.phx.gbl...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
> Kind Regards,
> Allan Ebdrup
> FUT: microsoft.public.sqlserver.programming
>|||SQL Server is quite capable of handling long connections, and as long as
they're actually doing something they make perfect sense. Many short lasting
connections may seem to put less stress on the server, but in fact the
overhead of many connections versus a single connection, when calls to the
server constitute a single business operation, must be multiplied with the
number of connections to give a true estimate of the actual (and expected)
stress on the server.
Will puts it very simply (and very true): 5000 connections where each lasts
50ms might kick the server pretty hard. 2nd grade mathematics.
ML
http://milambda.blogspot.com/|||If the calls to the database are made against isolated objects (i.e. against
different tables or to procedures that target different tables), then
individual connections may benefit from being initiated asynchronously. But
it would make more sense in issuing a single call to related objects for a
single business operation. At least IMHO.
ML
http://milambda.blogspot.com/|||All posts made a good point and I agree that your boss is wrong. It's a no
wonder he is a not a SQL Programmer but he is your boss. He signs your
paycheck so you may not have a choice. I often had to bow to what my boss
wants, hate it and it sucks but what can I do.
Grant
Who gives a {censored} if I am wrong.
"Allan Ebdrup" <ebdrup@.noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@.TK2MSFTNGP02.phx.gbl...
>I just had a discussion with my boss, we are running a query that fetches
>about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
> the time, of generating execution plans for each query, and running the
> queries many times gives his solution a large performance hit. My
> recommendation will be to fetch all the rows needed in one large SQL
> query.
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small queries
> instead of one large. But my boss just says "no" and disagrees.
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
> Kind Regards,
> Allan Ebdrup
> FUT: microsoft.public.sqlserver.programming
>|||If responsiveness of your application is important, you might consider to
fetch a certain amount of objects per fetch (say 100 objects, depending on
the size of your objects), instead of fetching 5000 objects together.
If you are going to get 5000 objects at once, while responsiveness is not a
problem, you should fetch all rows using one query.
Martin C K Poon
Senior Analyst Programmer
====================================
"Allan Ebdrup" <ebdrup@.noemail.noemail> bl
news:%23kIhprXWGHA.3332@.TK2MSFTNGP02.phx.gbl g...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
> Kind Regards,
> Allan Ebdrup
> FUT: microsoft.public.sqlserver.programming
>sql
No comments:
Post a Comment