Re: Consolidate 'unique array values' logic into a reusable function?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Consolidate 'unique array values' logic into a reusable function?
Date: 2019-08-30 03:34:52
Message-ID: CA+hUKGK_kwiS+3VCeMMGAKg=27T1v17ABzt+xDa1qeW7W7wruA@mail.gmail.com
Lists: pgsql-hackers

Hello,

I'm reviving a thread from 2016, because I wanted this thing again today.

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> > Here's a sketch patch that creates a function array_unique which takes
> > the same arguments as qsort or qsort_arg and returns the new length.
>
> Hmm ... I'd be against using this in backend/regex/, because I still
> have hopes of converting that to a standalone library someday (and
> in any case it needs to stay compatible with Tcl's copy of the code).
> But otherwise this seems like a reasonable proposal.
>
> As for the function name, maybe "qunique()" to go with "qsort()"?
> I'm not thrilled with "array_unique" because that sounds like it
> is meant for Postgres' array data types.

OK, here it is renamed to qunique() and qunique_arg(). The name is a bit
odd because it has nothing to do with the quicksort algorithm, but it
makes some sense because it's always used with qsort(). I suppose we could
save a few more lines if there were a qsort_unique() function that
does both, since the arguments are identical. I also moved it into a
new header lib/qunique.h. Any better ideas for where it should live?
I removed the hunk under regex.
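
For readers without the attachment at hand, here is a minimal sketch of
the idea. It is illustrative only and is not the code from the attached
patch: the names qunique_sketch(), qsort_unique_sketch() and
int_cmp_sketch() are hypothetical, and the array is assumed to be sorted
already with the same comparator.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch, not the code from the attached patch.
 * Remove adjacent duplicates from an already-sorted array, in place.
 * Takes the same arguments as qsort() and returns the new length.
 */
static size_t
qunique_sketch(void *array, size_t elements, size_t width,
               int (*compare) (const void *, const void *))
{
    char       *bytes = (char *) array;
    size_t      i,
                j;

    if (elements <= 1)
        return elements;

    /* j indexes the last element kept so far; i scans ahead of it. */
    for (i = 1, j = 0; i < elements; ++i)
    {
        if (compare(bytes + i * width, bytes + j * width) != 0)
        {
            ++j;
            if (j != i)
                memcpy(bytes + j * width, bytes + i * width, width);
        }
    }

    return j + 1;
}

/* Hypothetical combined helper: sort, then drop duplicates. */
static size_t
qsort_unique_sketch(void *array, size_t elements, size_t width,
                    int (*compare) (const void *, const void *))
{
    qsort(array, elements, width, compare);
    return qunique_sketch(array, elements, width, compare);
}

/* Example comparator for plain ints, used only for illustration. */
static int
int_cmp_sketch(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}
```

A caller would typically replace an open-coded "sort, then squeeze out
duplicates" loop with a qsort() call followed by a single call like the
one above, and use the returned value as the new element count.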

One thing I checked is that on my system it is inlined along with the
comparator when that is visible, so no performance should be lost by
throwing away the open-coded versions. This makes me think that e.g.
oid_cmp() should probably be defined in a header; we're also clearly
carrying a few functions that should be consolidated into a single new
int32_cmp() function somewhere. (It might also be interesting to use
the pg_attribute_always_inline trick to instantiate some common
qsort() specialisations for a bit of speed-up, but that's another
topic.)
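
As a rough illustration of the header-defined comparator idea, a
consolidated comparator could look something like the sketch below. The
name int32_cmp_sketch() is hypothetical and this is not taken from the
patch; it only assumes that the function would live in a header as
static inline, so that its body is visible to callers. It compares
rather than subtracts, since subtracting two int32 values can overflow.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical header-style comparator for int32 values, usable with
 * qsort()-like functions.  Returns a negative, zero or positive value.
 */
static inline int
int32_cmp_sketch(const void *a, const void *b)
{
    int32_t     x = *(const int32_t *) a;
    int32_t     y = *(const int32_t *) b;

    /* Avoids the overflow that "return x - y" could hit. */
    return (x > y) - (x < y);
}
```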

Adding to CF.

--
Thomas Munro
https://enterprisedb.com

Attachment: 0001-Consolidate-code-that-makes-a-sorted-array-unique.patch (application/x-patch, 16.5 KB)
