Re: Query 4-5 times slower after ANALYZE

From: "Philippe Lang" <philippe(dot)lang(at)attiksystem(dot)ch>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query 4-5 times slower after ANALYZE
Date: 2009-03-18 14:46:00
Message-ID: E6A0649F1FBFA3408A37F505400E7AC215CDAB@email.attiksystem.ch
Lists: pgsql-general

pgsql-general-owner(at)postgresql(dot)org wrote:
> In response to "Philippe Lang" <philippe(dot)lang(at)attiksystem(dot)ch>:
>>
>> I'm using Postgresql 8.3.6 under Freebsd 7.1.
>>
>> After a fresh restore of a customer dump (the customer is running
>> version 8.2.7 at the moment), a rather big query executes in about
>> 30 seconds. As soon as I run ANALYZE, it is instantly 4-5 times
>> slower. I could verify this multiple times.
>>
>> Here is the EXPLAIN ANALYZE before the ANALYZE:
>>
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/before.txt
>>
>> And here is the EXPLAIN ANALYZE after the ANALYZE:
>>
>> http://www.attiksystem.ch/postgresql/query_slower_after_analyze/after.txt
>>
>> Any idea what could be turned on/off in order to avoid this
>> slowdown after ANALYZE?
>
> I opened one of those links figuring I'd take a few minutes to see if
> I could muster up some advice ... and just started laughing ...
> definitely not the type of query that one can even understand in just
> a few minutes!
>
> Anyway, the real reason I posted -- I doubt if anyone will be able to
> make sense of a query plan that complex without the actual query, so
> you'll probably want to post it as well.

:) What? I thought you would read that like Neo was reading the
Matrix... :)

Bill, Gregory, Tomas and Sebastian, thanks for your answers. I have
tried everything you mentioned:

- increase seq_page_cost
- increase geqo_threshold
- increase join_collapse_limit
- increase from_collapse_limit
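
For the record, here is how I adjusted them, session by session (the
values are just examples, not the exact ones I tried):

----------------
-- All of these can be set per session; a permanent change would go
-- into postgresql.conf instead.
SET seq_page_cost = 2.0;        -- make sequential scans look costlier
SET geqo_threshold = 14;        -- raise the point where GEQO kicks in
SET join_collapse_limit = 12;   -- let the planner reorder more joins
SET from_collapse_limit = 12;   -- same, for FROM-list subqueries
----------------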

But it did not help (except for completely disabling sequential
scans), for a reason I think I understand better now: part of the
query looks like this:

----------------
SELECT

c.id AS customer_id,
c.name AS customer_name,
d.id AS document_id,
d.number AS document_number,
d.vref AS document_vref,
dt.name AS type,
d.creation_date AS value_date

FROM documents AS d

LEFT JOIN payment_terms AS pt
ON d.payment_term_id = pt.id

INNER JOIN reminder_levels AS rl
ON d.reminder_level_id = rl.id

INNER JOIN document_types AS dt
ON d.document_type_id = dt.id

INNER JOIN projects AS p
ON d.project_id = p.id

INNER JOIN customers AS c
ON p.customer_id = c.id

WHERE d.reminder = 1

AND solde_po(CURRENT_DATE, c.id) > 0

AND d.creation_date <= CURRENT_DATE
----------------

The heavy part here is the "solde_po" call (at the end), which takes
up most of the CPU time. That's why scanning the customers table is so
expensive. I imagine a small change in the way this table is scanned
can have enormous effects on the overall execution time, for example
when a sequential scan is preferred over an index scan. Does that
sound correct?
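
One thing I am considering trying: telling the planner how expensive
the function really is, so that it evaluates the cheap conditions
first. Something like this (the argument types are my guess from the
call site, CURRENT_DATE and c.id):

----------------
-- The COST attribute (new in 8.3) is a planner hint: a unit of 10000
-- marks the function as very expensive relative to the default of 100
-- for a SQL-language function, so the planner should push it after
-- the cheaper filter conditions where it can.
ALTER FUNCTION solde_po(date, integer) COST 10000;
----------------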

A small question here: solde_po is an SQL function (not PL/pgSQL). Is
it "inlined" into the parent query before the whole execution plan is
calculated, or are the two planned completely separately?
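
From what I have read, inlining only happens when the function is a
single simple SELECT and its volatility declaration does not get in
the way; a sketch of what I believe an inlinable version would look
like (the body and table names below are made up, not my real
function):

----------------
-- Hypothetical sketch: a single SELECT, declared STABLE, no SECURITY
-- DEFINER, no SET clauses. Under those conditions the planner can
-- substitute the body into the calling query and plan everything
-- together; otherwise the function is planned on its own and called
-- as a black box once per row.
CREATE OR REPLACE FUNCTION solde_po(p_date date, p_customer_id integer)
RETURNS numeric AS $$
    SELECT sum(amount)               -- invoices/amount are invented
    FROM invoices                    -- for the sake of the example
    WHERE customer_id = p_customer_id
      AND due_date <= p_date;
$$ LANGUAGE sql STABLE;
----------------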

Philippe

P.S. Thanks for the link to "explain.depesz.com"! Great tool!
