Re: Slow queries in PL/PGSQL function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Jim Crate <jcrate(at)deepskytech(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Slow queries in PL/PGSQL function
Date: 2004-02-20 14:20:24
Message-ID: 27414.1077286824@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Richard Huxton <dev(at)archonet(dot)com> writes:
> On Thursday 19 February 2004 23:00, Jim Crate wrote:
>> explain analyze SELECT DISTINCT i_ip
>> FROM x_rbl_ips
>> LEFT JOIN filter_ips ON x_rbl_ips.i_ip = filter_ips.i_filter_ip
>> WHERE x_rbl_ips.dts_last_modified > '2004-02-18 22:24:15.901689+00'
>> AND filter_ips.i_filter_ip IS NOT NULL
>> AND (i_filter_type_flags & X'02000000'::integer) <> X'02000000'::integer

> I'm guessing that the values in your query are variables/parameters in the
> plpgsql function? The problem is that the plan is compiled when the function
> is first run, so it doesn't know what values you will use. You might tend to
> use values that make sense to index, but it can't tell.

Specifically, the only part of that that looks indexable is the
dts_last_modified constraint. If it's always "dts_last_modified >
some-time-in-the-recent-past" then the range of values scanned is going
to be small enough to make an indexscan worthwhile. But if what the
planner sees is "dts_last_modified > variable" then it isn't gonna risk
an indexscan, because in the general case that could mean scanning a
large part of the table, and the indexscan would be tremendously slow.

What you can do to work around this (I'm assuming dts_last_modified
never contains values in the future) is add a dummy constraint:

WHERE x_rbl_ips.dts_last_modified > variable
AND x_rbl_ips.dts_last_modified <= now()
AND other-stuff

Now what the planner sees is dts_last_modified being constrained to a
range, rather than an open-ended interval. It still has to guess about
how much of the index will be scanned, but its guess in this scenario
is much smaller and it should usually pick the indexscan instead.

BTW, in recent releases you can investigate planner choices involving
queries with variables by using PREPARE and EXPLAIN EXECUTE. For
example

PREPARE myq(timestamptz) AS
...
WHERE x_rbl_ips.dts_last_modified > $1
...

EXPLAIN EXECUTE myq('2004-02-18 22:24:15.901689+00');

This allows you to exactly reproduce the conditions that the planner has
to work under when planning a query from a plpgsql function.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tibor 2004-02-20 15:26:19 How can I delete a primary or foreign key?
Previous Message Richard Huxton 2004-02-20 09:37:17 Re: Slow queries in PL/PGSQL function