Re: NULL-handling in aggregate(DISTINCT ...)

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NULL-handling in aggregate(DISTINCT ...)
Date: 2009-11-12 05:22:28
Message-ID: 87k4xwtj49.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> I think you could probably just change it: make DISTINCT work as
Tom> per regular DISTINCT (treat null like a value, keep one copy).
Tom> All the spec-conforming aggregates are strict and would ignore
Tom> the null in the next step anyway.

>> Change it for single-arg DISTINCT too? And the resulting change to the
>> established behaviour of array_agg is acceptable? Just want to be clear
>> here.

Tom> I doubt that very many people are depending on the behavior of
Tom> array_agg(DISTINCT); and anyway it could be argued that the
Tom> present behavior is a bug, since it doesn't work like standard
Tom> DISTINCT. I don't see a problem with changing it, though it
Tom> should be release-noted.
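
To make the proposed semantics concrete, here is a toy model in Python. It is
only a sketch and not PostgreSQL code: None stands in for SQL NULL, the helper
names are hypothetical, and it assumes (as the exchange above implies) that the
present behaviour drops NULLs before de-duplicating. It also keeps first-seen
order, whereas the real implementation sorts.

```python
# Toy model only; not PostgreSQL source code. None stands in for SQL NULL.

def distinct_keep_one_null(values):
    """Proposed: regular DISTINCT, where NULL counts as a value (one copy kept)."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out

def distinct_drop_nulls(values):
    """Assumed present behaviour: NULLs are discarded before de-duplicating."""
    return distinct_keep_one_null(v for v in values if v is not None)

def strict_count(values):
    """A strict aggregate such as count(x) is never called on NULL inputs."""
    return sum(1 for v in values if v is not None)

def array_agg(values):
    """array_agg is not strict, so it keeps any NULL that reaches it."""
    return list(values)

rows = [1, None, 2, None, 1]

# Strict aggregates give the same answer either way.
count_old = strict_count(distinct_drop_nulls(rows))
count_new = strict_count(distinct_keep_one_null(rows))

# A non-strict aggregate is where the change becomes visible.
agg_old = array_agg(distinct_drop_nulls(rows))
agg_new = array_agg(distinct_keep_one_null(rows))
```

In this model only the non-strict aggregate changes its result, which matches
the point that the spec-conforming (strict) aggregates would ignore the
surviving NULL anyway.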

A followup question: currently the code uses the "datum" interface for
tuplesort. Obviously with multiple columns the slot interface has to be
used instead; but is there any performance advantage in staying with the
datum interface for the single-column case?

--
Andrew.
