Re: Different execution plans for semantically equivalent queries

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Mikkel Lauritsen <renard(at)tala(dot)dk>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Different execution plans for semantically equivalent queries
Date: 2011-02-07 10:47:26
Message-ID: AANLkTinBGzvvuJpG8hT+hKjKfWOBhE57g40Y0=DD6rYF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Feb 7, 2011 at 00:03, Mikkel Lauritsen <renard(at)tala(dot)dk> wrote:
>>> SELECT * FROM table t1 WHERE 0 = (SELECT COUNT(*) FROM table t2 WHERE
>>>     t2.type = t1.type AND t2.timestamp > t1.timestamp)
>>
>> I suspect that *any* database is going to have trouble optimizing that.

> Just out of curiosity I've been looking a bit at the optimizer code
> in PostgreSQL, and it seems as if it would be at least theoretically
> possible to add support for things like transforming the query at
> hand into the NOT EXISTS form; a bit like how = NULL is converted
> to IS NULL.
>
> Would a change like that be accepted, or would you rather try to
> indirectly educate people into writing better SQL?

There are some reasonable and generic optimizations that could be done
here. Being able to inline subqueries with aggregates into joins would
be a good thing e.g. transform your query into this:

SELECT t1.* FROM table t1 JOIN table t2 ON (t2.type = t1.type)
WHERE t2.timestamp > t1.timestamp
GROUP BY t1.* HAVING COUNT(t2.*)=0

However, this is probably still worse than a NOT EXISTS query.

I am less excited about turning "COUNT(x)=0" query to NOT EXISTS
because that's just a bad way to write a query.

Regards,
Marti

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message felix 2011-02-07 15:05:07 Re: Really really slow select count(*)
Previous Message Marti Raudsepp 2011-02-07 10:30:25 Re: Really really slow select count(*)