Re: vacuum analyze slows sql query

From: patrick ~ <sidsrr(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Greg Stark <gsstark(at)mit(dot)edu>
Subject: Re: vacuum analyze slows sql query
Date: 2004-11-03 22:22:57
Message-ID: 20041103222257.73682.qmail@web52104.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Here is a fresh run with 'explain analyze' run before and after the
VACUUM statement:

-- begin
% dropdb pkk
DROP DATABASE
% createdb pkk
CREATE DATABASE
% psql pkk < pkk_db.sql
ERROR: function pkk_offer_has_pending_purch(integer) does not exist
ERROR: function pkk_offer_has_pending_purch2(integer) does not exist
ERROR: table "pkk_billing" does not exist
ERROR: table "pkk_purchase" does not exist
ERROR: table "pkk_offer" does not exist
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pkk_offer_pkey"
for table "pkk_offer"
CREATE TABLE
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"pkk_purchase_pkey" for table "pkk_purchase"
CREATE TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
% zcat pkk.20041028_00.sql.gz | psql pkk
SET
SET
SET
SET
% psql pkk
pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
<ommitting output />
(618 rows)

Time: 877.348 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on pkk_offer (cost=0.00..22.50 rows=1000 width=4) (actual
time=1.291..845.485 rows=618 loops=1)
Total runtime: 849.475 ms
(2 rows)

Time: 866.613 ms
pkk=# vacuum analyze ;
VACUUM
Time: 99344.399 ms
pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id )
from pkk_offer ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on pkk_offer (cost=0.00..13.72 rows=618 width=4) (actual
time=3636.401..1047412.851 rows=618 loops=1)
Total runtime: 1047415.525 ms
(2 rows)

Time: 1047489.477 ms
-- end

Tom,

The reason of the extra "case" part in the function is to ensure non-null
fields on the result. I tried your version as well and i get similar
performance results:

-- begin
pkk=# create function toms_pending_purch( integer ) returns bool as 'select
p0.purchase_id is not null from pkk_purchase p0 where p0.offer_id = $1 and (
p0.pending = true or ( ( p0.expire_time > now() or p0.expire_time isnull ) and
p0.cancel_date isnull ) ) limit 1 ' language 'sql' ;
CREATE FUNCTION
Time: 2.496 ms
pkk=# select offer_id, toms_pending_purch( offer_id ) from pkk_offer ;
(618 rows)

Time: 1052339.506 ms
-- end

Right now, I'm studying the document section on PREPARE and will
attempt to play around with it.

I was asked (in a prior post) whether running the statement a second
time after the VACUUM improves in performance. It does not. After
the VACUUM the statement remains slow.

Thanks for your help,
--patrick

--- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> patrick ~ <sidsrr(at)yahoo(dot)com> writes:
> > that if I 'createdb' and populate it with the "sanatized" data the
> > query in question is quite fast; 618 rows returned in 864.522 ms.
> > This was puzzling. Next I noticed that after a VACUUM the very same
> > query would slow down to a crawl; 618 rows returned in 1080688.921 ms).
>
> The outer query is too simple to have more than one possible plan,
> so the issue is certainly a change in query plans inside the function.
> You need to be investigating what's happening inside that function.
> 7.1 doesn't have adequate tools for this, but in 7.4 you can use
> PREPARE and EXPLAIN ANALYZE EXECUTE to examine the query plans used
> for parameterized statements, which is what you've got here.
>
> My bet is that with ANALYZE stats present, the planner guesses wrong
> about which index to use; but without looking at EXPLAIN ANALYZE output
> there's no way to be sure.
>
> BTW, why the bizarrely complicated substitute for a NOT NULL test?
> ISTM you only need
>
> create function
> pkk_offer_has_pending_purch( integer )
> returns bool
> as '
> select p0.purchase_id is not null
> from pkk_purchase p0
> where p0.offer_id = $1
> and ( p0.pending = true
> or ( ( p0.expire_time > now()
> or p0.expire_time isnull )
> and p0.cancel_date isnull ) )
> limit 1
> ' language 'sql' ;
>
> (Actually, seeing that pkk_purchase.purchase_id is defined as NOT NULL,
> I wonder why the function exists at all ... but I suppose you've
> "stripped" the function to the point of being nonsense.)
>
> regards, tom lane


__________________________________
Do you Yahoo!?
Check out the new Yahoo! Front Page.
www.yahoo.com

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message John A Meinel 2004-11-03 23:25:27 Re: Restricting Postgres
Previous Message Martin Foster 2004-11-03 21:25:45 Re: Restricting Postgres