Re: Optimizer questions

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimizer questions
Date: 2016-01-06 09:03:10
Message-ID: CAKJS1f_ZaYGVBNW=_43Z+fhgvMgCqAX08J0T8Y5dAEHf4RXBMw@mail.gmail.com
Lists: pgsql-hackers

On 6 January 2016 at 13:13, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
wrote:

> On Wed, Jan 6, 2016 at 12:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:
>> > 1. The cost compared in grouping_planner doesn't take in account price
>> of get_authorized_users - it is not changed when I am altering function
>> cost. Is it correct behavior?
>>
>> The general problem of accounting for tlist eval cost is not handled very
>> well now, but especially not with respect to the idea that different paths
>> might have different tlist costs. I'm working on an upper-planner rewrite
>> which should make this better, or at least make it practical to make it
>> better.
>>
>
> Hmm... Besides costing it would be nice to postpone calculation of
> expensive tlist functions after LIMIT.
>

I agree that more than just the costing would need to be improved here.

The simplest demonstration of the problem I can think of is to apply the
following patch:

diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 29d92a7..2ec9822 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -641,6 +641,8 @@ int4pl(PG_FUNCTION_ARGS)
 
     result = arg1 + arg2;
 
+    elog(NOTICE, "int4pl(%d, %d)", arg1,arg2);
+
     /*
      * Overflow check. If the inputs are of different signs then their sum
      * cannot overflow. If the inputs are of the same sign, their sum had

Then do:

create table a (b int);
insert into a select generate_series(1,10);
select b+b as bb from a order by b limit 1;
NOTICE: int4pl(1, 1)
NOTICE: int4pl(2, 2)
NOTICE: int4pl(3, 3)
NOTICE: int4pl(4, 4)
NOTICE: int4pl(5, 5)
NOTICE: int4pl(6, 6)
NOTICE: int4pl(7, 7)
NOTICE: int4pl(8, 8)
NOTICE: int4pl(9, 9)
NOTICE: int4pl(10, 10)
 bb
----
  2
(1 row)

We can see that int4pl() is needlessly called 9 times: only one row
survives the LIMIT, yet b+b is evaluated for all 10 rows. That said, I
think this only applies to queries with LIMIT. I agree that it does seem
like an interesting route for optimisation.

It seems worthwhile to investigate how we might go about improving this so
that the evaluation of the target list happens after LIMIT, at least for
the columns which are not required before LIMIT.
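
To make the difference in evaluation order concrete, here is a toy model
in Python. It is only a sketch of the idea and not PostgreSQL code: the
names project_then_limit and limit_then_project are made up for
illustration, and int4pl here is just a stand-in that records its calls
the way the elog() in the patch above does.

```python
# Toy model only: this is not PostgreSQL code, and the function names
# below are hypothetical.
calls = []

def int4pl(x, y):
    """Stand-in for the instrumented int4pl(): record the call, then add."""
    calls.append((x, y))
    return x + y

def project_then_limit(rows, limit):
    """Observed order: compute b+b for every row, then sort and apply LIMIT."""
    projected = [(b, int4pl(b, b)) for b in rows]  # target list run on all rows
    projected.sort(key=lambda pair: pair[0])       # ORDER BY b
    return [bb for _, bb in projected[:limit]]

def limit_then_project(rows, limit):
    """Suggested order: sort and apply LIMIT first, then compute b+b."""
    survivors = sorted(rows)[:limit]               # ORDER BY b LIMIT n
    return [int4pl(b, b) for b in survivors]       # target list run on survivors

rows = list(range(1, 11))                          # like generate_series(1,10)

calls.clear()
eager_result = project_then_limit(rows, 1)
eager_calls = len(calls)

calls.clear()
late_result = limit_then_project(rows, 1)
late_calls = len(calls)

print(eager_result, eager_calls)
print(late_result, late_calls)
```

Both orders return the same row, but the first makes one int4pl call per
input row while the second makes one call per row that survives the
LIMIT. The model only covers the case from the example, where the sort
key (b) does not depend on the expensive expression; a column needed for
ORDER BY would still have to be computed before the LIMIT.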

Konstantin, are you thinking of looking into this more, with plans to
implement code to improve this?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
