Re: Seemingly identical queries run at different speeds

From: Arguile <arguile(at)lucentstudios(dot)com>
To: VanL <vlindberg(at)verio(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Seemingly identical queries run at different speeds
Date: 2003-07-23 01:06:11
Message-ID: 1058922372.4970.16763.camel@broadswd
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 2003-07-22 at 19:27, VanL wrote:
> I have three queries that are essentially identical. Two of them run in
> fractions of a second; one of them takes longer than 15 minutes to
> return. (Not sure how long it totally takes, that has been the limit of
> my patience.)
>
> The only difference between these queries is the use of table aliases in
> the sql query. What is happening in postgres that this makes such a
> difference?

You're getting bit by a 'feature' that is supposed to make life easier.
PostgreSQL automatically adds tables you reference to the FROM clause if
they're not listed there already, it then sends a NOTICE saying it did
so. This is supposed to be helpful, here it certainly is not.

What's happening in the below is you reference mm_batch, mm_domain, etc.
which are _different_ than B, D, etc. because aliasing changes what
query processor refers to them as (allowing stuff like self-joins).

So Pg is appending them to the FROM clause, which now looks like this:

FROM mm_batch B, mm_domain D, mm_management_unit M, mm_customer C,
mm_legacy_account LA, mm_target_account TA, mm_batch, mm_domain,
mm_management_unit, mm_customer, mm_legacy_account,
mm_target_account

Yikes. It's now generating an absolutely huge cartesian product between
the result set you want and all those extra tables.

I believe in the upcoming 7.4.x release this won't happen unless you
enable ADD_MISSING_FROM in your conf. Something I certainly won't be
enabling due to this type of situation.

> SLOW ( > 15 minutes):
> select
> mm_batch.name as batch_name,
> mm_domain.name as domain_name,
> mm_management_unit.name as management_unit_name,
> mm_customer.firstname as customer_name,
> mm_legacy_account.username as old_username,
> mm_target_account.username as new_username
> from
> mm_batch B,
> mm_domain D,
> mm_management_unit M,
> mm_customer C,
> mm_legacy_account LA,
> mm_target_account TA
> where
> mm_domain.bid = mm_batch.id
> and mm_domain.mid = mm_management_unit.id
> and mm_domain.cid = mm_customer.id
> and mm_domain.lid = mm_legacy_account.id
> and mm_domain.tid = mm_target_account.id
> and mm_domain.name = 'example.com';

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2003-07-23 01:43:50 Re: New Poll @ Codewalkers
Previous Message VanL 2003-07-22 23:27:37 Seemingly identical queries run at different speeds