Re: Status of DISTINCT-by-hashing work

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Status of DISTINCT-by-hashing work
Date: 2008-08-05 15:50:42
Message-ID: 87bq078nql.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> I've pretty much finished the project I got a bee in my bonnet about
> last week, which is to teach SELECT DISTINCT how to (optionally) use
> hashing for grouping in the same way that GROUP BY has been able to do
> for awhile.
>
> There are still two places in the system that hard-wire the use of
> sorting for duplicate elimination:
>
> * Set operations (UNION/INTERSECT/EXCEPT)

Egads. Are you thinking to reimplement them more in line with the way other
nodes work? Or just have them choose between hashing and sorting themselves?

> * Aggregate functions with DISTINCT
>
> I'm thinking of trying to fix set operations before I leave this topic,
> but I'm not sure it's worth the trouble to change DISTINCT aggregates.
> They'd be a lot more work (since there's no executor infrastructure
> in place that could be used) and the return on investment seems low.
>
> Comments?

I recall being quite mystified by how distinct aggregates work when the sort
didn't appear anywhere in EXPLAIN output. If we could manage to expose that
info in the plan somehow it would be a great improvement even if we didn't
actually improve the plans available.

Any idea what would the needed executor infrastructure look like? Would it
have anything in common with the OLAP window functions infrastructure?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zdenek Kotala 2008-08-05 16:03:25 Re: unnecessary code in_bt_split
Previous Message Asko Oja 2008-08-05 15:35:26 Re: Status of DISTINCT-by-hashing work