Re: POC: converting Lists into arrays

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: converting Lists into arrays
Date: 2019-05-25 03:26:33
Message-ID: CAKJS1f8=5GBvZp-WoWOU8aMvoELZgWMMe=p54oEAXpi_2-zCNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 25 May 2019 at 12:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Now, it turns out that the new formulation of foreach() is really
> strictly equivalent to
>
> for (int pos = 0; pos < list_length(list); pos++)
> {
> whatever-type item = list_nth(list, pos);
> ...
> }
>
> which means that it could cope fine with deletion of the current
> list element if we were to provide some supported way of not
> incrementing the list index counter. That is, instead of
> code that looks more or less like this:
>
> for (int pos = 0; pos < list_length(list); pos++)
> {
> whatever-type item = list_nth(list, pos);
> ...
> if (delete_cur)
> {
> list = list_delete_nth_cell(list, pos);
> pos--; /* keep loop in sync with deletion */
> }
> }
>
> we could write, say:
>
> foreach(lc, list)
> {
> whatever-type item = lfirst(lc);
> ...
> if (delete_cur)
> {
> list = list_delete_cell(list, lc);
> foreach_backup(lc); /* keep loop in sync with deletion */
> }
> }
>
> which is the same thing under the hood. I'm not quite sure if that way
> is better or not. It's more magical than explicitly manipulating a list
> index, but it's also shorter and therefore less subject to typos.

If we're doing an API break for this, wouldn't it be better to come up
with something that didn't have to shuffle list elements around every
time one is deleted?

For example, we could have a foreach_delete() that instead of taking a
pointer to a ListCell, it took a ListDeleteIterator which contained a
ListCell pointer and a Bitmapset, then just have a macro that marks a
list item as deleted (list_delete_current(di)) and have a final
cleanup at the end of the loop.

The cleanup operation can still use memmove, but just only move up
until the next bms_next_member on the deleted set, something like
(handwritten and untested):

void
list_finalize_delete(List *list, ListDeleteIterator *di)
{
int srcpos, curr, tarpos;

/* Zero the source and target list position markers */
srcpos = tarpos = 0;
curr = -1;
while ((curr = bms_next_member(di->deleted, curr) >= 0)
{
int n = curr - srcpos;
if (n > 0)
{
memmove(&list->elements[tarpos], &list->elements[srcpos],
n * sizeof(ListCell));
tarpos += n;
}
srcpos = curr + 1;
}
list->length = tarpos;
}

Or maybe we should worry about having the list in an inconsistent
state during the loop? e.g if the list is getting passed into a
function call to do something.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-05-25 04:37:54 Re: Performance issue in foreign-key-aware join estimation
Previous Message Amit Kapila 2019-05-25 03:13:28 Re: Fix link for v12