Re: POC: converting Lists into arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: converting Lists into arrays
Date: 2019-07-22 17:46:26
Message-ID: 3890.1563817586@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> So with this patch we end up with:

> list_union (copies list1, appends list2 element not already in list1)
> list_concat_unique (appends list2 elements not already in list)
> list_concat (appends all list2 elements)
> list_concat_copy (copies list1, appends all list2 elements)

> This seems a little random -- for example we end up with "union" being
> the same as "concat_copy" except for the copy; and the analogy between
> those two seems to exactly correspond to that between "concat_unique"
> and "concat".

Yeah, list_concat_unique is kind of weird here. Its header comment
even points out that it's much like list_union:

* This is almost the same functionality as list_union(), but list1 is
* modified in-place rather than being copied. However, callers of this
* function may have strict ordering expectations -- i.e. that the relative
* order of those list2 elements that are not duplicates is preserved.

I think that last sentence is bogus --- does anybody really think
people have been careful not to assume anything about the ordering
of list_union results?

> I would propose to use the name list_union, with flags
> being "unique" (or "uniquify" if that's a word, or even just "all" which
> seems obvious to people with a SQL background), and something that
> suggests "copy_first".

I really dislike using "union" for something that doesn't have the
same semantics as SQL's UNION (ie guaranteed duplicate elimination);
so I've never been that happy with "list_union" and "list_difference".
Propagating that into things that aren't doing any dup-elimination
at all seems very wrong.

Also, a big -1 for replacing these calls with something with
extra parameter(s). That's going to be verbose, and not any
more readable, and probably slower because the called code
will have to figure out what to do.

Perhaps there's an argument for doing something to change the behavior
of list_union and list_difference and friends. Not sure --- it could
be a foot-gun for back-patching. I'm already worried about the risk
of back-patching code that assumes the new semantics of list_concat.
(Which might be a good argument for renaming it to something else?
Just not list_union, please.)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-07-22 18:11:29 Re: using explicit_bzero
Previous Message Andres Freund 2019-07-22 17:40:42 Re: initdb recommendations