Re: Mystery: functions are slow with group by, but...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Allan Kelly <allan(dot)kelly(at)buildstore(dot)co(dot)uk>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Mystery: functions are slow with group by, but...
Date: 2000-04-18 18:02:02
Message-ID: 20624.956080922@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Allan Kelly <allan(dot)kelly(at)buildstore(dot)co(dot)uk> writes:
> I have a big performance problem using a function:

> This takes 5-10 _minutes_ :
> select a4, total_live_a4_properties( a4 )
> from properties group by a4;

> There are 5132 records in the properties table, and 64 distinct a4 values.

> However this next one is < 3 _seconds_!
> select a4, count(*), (100*count(*)/total_live_a4_properties( a4 )) as percentage
> from properties group by a4;

> Can anyone tell me what's going on?

You were right to look to EXPLAIN for a hint about the problem, but
unfortunately EXPLAIN doesn't show you anything about where a function
call is evaluated in the plan tree. If you dig into the EXPLAIN VERBOSE
output, you can see where it's evaluated. It turns out that in the
first query, 6.5.* evaluates the function call in the seqscan's target
list, which means that the function is evaluated separately at each row
of the table :-(. Then, all the duplicate rows are thrown away by the
group step, so you just did 5000+ useless function evaluations, each of
which needed a scan of the table. In the second case, the behavior is
different because of the presence of an aggregate function --- the
planner knows it cannot evaluate that until the top level, so the whole
expression including the user function call is done at the output of the
aggregate step. Net result, only one function evaluation per group.

This is, of course, pretty braindead. It's fixed in 7.0: functions on
group columns are not evaluated until after the group step, with or
without aggregates.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Michael S. Kelly 2000-04-18 18:17:02 RE: Outer joins
Previous Message kaiq 2000-04-18 17:54:14 Re: Outer joins