Re: Status of DISTINCT-by-hashing work

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Status of DISTINCT-by-hashing work
Date: 2008-08-06 19:16:58
Message-ID: 17158.1218050218@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> ... For INTERSECT/EXCEPT (with or without ALL),
> you really need to maintain counters in each hashtable entry so you know
> how many matching rows you got from each side of the set operation.
> So it'd be necessary to either duplicate a large chunk of nodeAgg.c, or
> make that code handle hashed INTERSECT/EXCEPT along with all its
> existing duties. Neither of which seems particularly appealing :-(.
> I'm going to look at whether nodeAgg can be refactored to avoid this,
> but I'm feeling a bit discouraged about it at the moment.
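
For illustration, the per-entry counters described in the quoted text can be sketched as below. This is a minimal Python sketch under stated assumptions, not the nodeSetOp.c code from the attached patch: the function name hashed_setop, its parameters, and the use of a Python dict as the hash table are all hypothetical, and rows are assumed to be hashable values.

```python
from collections import defaultdict


def hashed_setop(left, right, op, all_rows=False):
    """Hypothetical sketch of hashed INTERSECT/EXCEPT [ALL].

    Each hash table entry holds two counters: how many matching rows
    came from the left input and how many came from the right input.
    """
    # counts[row] = [rows seen on the left side, rows seen on the right side]
    counts = defaultdict(lambda: [0, 0])
    for row in left:
        counts[row][0] += 1
    for row in right:
        counts[row][1] += 1

    result = []
    for row, (nleft, nright) in counts.items():
        if op == "intersect":
            # ALL keeps min(nleft, nright) copies; otherwise one copy
            # if the row appeared on both sides.
            if all_rows:
                ncopies = min(nleft, nright)
            else:
                ncopies = 1 if nleft > 0 and nright > 0 else 0
        elif op == "except":
            # ALL keeps the surplus of left copies over right copies;
            # otherwise one copy if the row never appeared on the right.
            if all_rows:
                ncopies = max(nleft - nright, 0)
            else:
                ncopies = 1 if nleft > 0 and nright == 0 else 0
        else:
            raise ValueError("op must be 'intersect' or 'except'")
        result.extend([row] * ncopies)
    return result
```

All four variants are decided from the same pair of counters, which is why a plain set of seen rows is not enough: the ALL forms need the exact counts from each side, and plain EXCEPT needs to know that the right-hand count is zero.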

Actually, it seems that most of what could be shared has already been
factored out into execGrouping.c. I find that supporting hashing in
nodeSetOp.c will only roughly double its size (from 318 to 650 lines).
Although nodeAgg.c is about 1700 lines, most of its bulk comes from
managing the aggregate transition values and function calls. There
might be some scope to save a few lines by refactoring, but it doesn't
look like it's worth the trouble.

The attached WIP patch compiles, but I've not tested it yet for lack
of planner support. If some of the code looks suspiciously like
nodeAgg.c, it's because I started from nodeAgg and just stripped
everything that wasn't needed ...

If there are no objections, I'll push forward with persuading
the planner to support hashable set operations.

regards, tom lane

Attachment: hashed-setops-1.patch.gz (application/octet-stream, 7.0 KB)
