Re: Distinct + Limit

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Francois Deliege <fdeliege(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Distinct + Limit
Date: 2012-03-28 14:13:01
Message-ID: 22625.1332943981@sss.pgh.pa.us
Lists: pgsql-performance

Francois Deliege <fdeliege(at)gmail(dot)com> writes:
> I have the following table with millions of rows:

> CREATE TABLE table1
> (
> col1 text,
> col2 text,
> col3 text,
> col4 text,
> col5 text,
> col6 text
> )

> select col1 from table1 group by col1 limit 1;
> select distinct on (col1) col1 from table1 limit 1;

> select col1 from table1 group by col1 limit 2;
> select distinct on (col1) col1 from table1 limit 2;

> Performing any of the following queries results in a full sequential
> scan, followed by a hash aggregate, and then the limit.

Well, if you had an index on the column, you would get a significantly
better plan ...
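
For illustration, a sketch along these lines (the index name is made up,
and the exact plan shape depends on the server version and statistics):

```sql
-- a plain btree index on the grouped column gives the planner an
-- alternative to the seqscan + hash-aggregate plan
CREATE INDEX table1_col1_idx ON table1 (col1);

-- compare the plans before and after creating the index
EXPLAIN select col1 from table1 group by col1 limit 2;
```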

> Similarly, the following query results in a sequential scan:

> select * from table1 where col1 <> col1;

> This query is generated by the Sequel library abstraction layer in Ruby when filtering records based on an empty array of values. We fixed this by handling that case on the client side, but originally thought the server would have rewritten it and sent an empty result set.

It does not, and never will, because that would be an incorrect
optimization. "col1 <> col1" isn't constant false, it's more like
"col1 is not null". I'd suggest "WHERE FALSE", or "WHERE 1 <> 1"
if you must, to generate a provably false constraint.
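
To spell out why (a minimal sketch; table name taken from the quoted
message):

```sql
-- "col1 <> col1" is FALSE for non-null values but NULL when col1 is
-- NULL, so it is not a constant-false expression the planner could
-- legitimately fold away
SELECT 'x' <> 'x'   AS non_null_case,  -- false
       NULL <> NULL AS null_case;      -- null, not false

-- a provably false qual the planner can short-circuit:
SELECT * FROM table1 WHERE false;
```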

regards, tom lane
