Re: Discussion on missing optimizations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Adam Brusselback <adambrusselback(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Discussion on missing optimizations
Date: 2017-10-07 02:19:54
Message-ID: 7105.1507342794@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2017-10-06 21:33:16 -0400, Adam Brusselback wrote:
>> The article in question is here:
>> https://blog.jooq.org/2017/09/28/10-cool-sql-optimisations-that-do-not-depend-on-the-cost-model/

> That's interesting.

The impression I have in a quick scan is that probably hardly any of these
are cases that any of the DB designers think are important in themselves.
Rather, they fall out of more general optimization attempts, or not,
depending on the optimization mechanisms in use in a particular DB.
For example, reducing "WHERE 1=1" to "WHERE TRUE" and then to nothing
comes out of a constant-subexpression-precalculation mechanism for us,
whereas "WHERE column=column" doesn't fall to that approach. ISTM it
would be really dumb to expend planner cycles looking specifically for
that case, so I guess that DB2 et al are finding it as a side-effect of
some more general optimization ... I wonder what that is?

(edit: a few minutes later, I seem to remember that equivclass.c has
to do something special with the X=X case, so maybe it could do
something else special instead, with little new overhead.)

> (i.e. combining a IN (a,b,c) and a IN (c,d,f) into a IN (c), similar
> with OR)

> I can't remember proposals about adding this, but it seems worthwhile to
> consider. I think we should be able to check for this without a lot of
> planner overhead.

It would be enormously expensive to check that in the general case with
a bunch of entries in each IN list. Perhaps it would be OK to add on
the presumption that few queries would contain multiple IN's on the same
column in the first place, though. Not sure.

> 9. Unneeded Self JOIN

> Can't remember discussions of this.

I can't get very excited about that one either.

In the end, what the article fails to consider is that all of these are
tradeoffs, not unalloyed goods. If you spend planner cycles on every
query to look for cases that only the most unabashedly brain-dead ORMs
ever generate, you're not really doing your users a favor on balance.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-10-07 02:38:55 Re: Discussion on missing optimizations
Previous Message Andres Freund 2017-10-07 01:48:36 Re: Discussion on missing optimizations