Re: Adding doubly linked list type which stores the number of items in the list

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding doubly linked list type which stores the number of items in the list
Date: 2022-10-28 05:31:45
Message-ID: CAApHDvpKB928sDL-akN6aPpPA60gmiOCOAFYdeVk21Vom1zqiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for having a look at this.

On Thu, 27 Oct 2022 at 19:32, Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> Some comments on the patch:
> 1. I think it's better to just return dlist_is_empty(&head->dlist) &&
> (head->count == 0); from dclist_is_empty() and remove the assert for
> better readability and safety against count being zero.

I don't think that's a good change. For 1) it adds unnecessary
overhead due to the redundant checks and 2) it removes the Assert
which is our early warning that the dclist's count is getting out of
sync somewhere.

> 2. Missing dlist_is_memberof() in dclist_delete_from()?

I put that in dlist_delete_from() which is called from dclist_delete_from().

> 3. Just thinking if we need to move dlist_is_memberof() check from
> dclist_* functions to dlist_* functions, because they also need such
> insurance against callers passing spurious nodes.

I think the affected functions there would be; dlist_move_head(),
dlist_move_tail(), dlist_has_next(), dlist_has_prev(),
dlist_next_node() and dlist_prev_node(). I believe if we did that then
it's effectively an API change. The comments only claim that it's
undefined if node is not a member of the list. It does not say 'node'
*must* be part of the list. Now, perhaps doing this would just make
it more likely that we'd find bugs in our code and extension authors
would find bugs in their code, but it does move the bar.
dlist_move_head and dlist_move_tail look like they'd work perfectly
well to remove an item from 1 list and put it on the head or tail of
some completely different list. Should we really be changing that in a
patch that is meant to just add the dclist type?

> 4. More opportunities to use dclist_* in below places, no?
> dlist_push_tail(&src->mappings, &pmap->node);
> src->num_mappings++;
>
> dlist_push_head(&MXactCache, &entry->node);
> if (MXactCacheMembers++ >= MAX_CACHE_ENTRIES)

Thanks for finding those. I've adjusted them both to use dclists.

> 5. dlist_is_memberof() - do we need this at all? We trust the callers
> of dlist_* today that the passed in node belongs to the list, no?

hmm, this seems to contradict your #3?

If you look at something like dlist_move_head(), if someone calls that
and passes a 'node' that does not belong to 'head' then the result of
that is that we delete 'node' from whichever dlist that it's on and
push it onto 'head'. Nothing bad happens there. If we do the same on
a dclist then the count gets out of sync. That's bad as it could lead
to assert failures and bugs.

> 6. If we decide to have dlist_is_memberof() and the asserts around it,
> can we design it on the similar lines as dlist_check() to avoid many
> #ifdef ILIST_DEBUG #endif blocks spreading around the code?

OK, that likely is a better idea. I've done this in the attached by
way of dlist_member_check()

> 7. Do we need Assert(head->count > 0); in more places like dclist_delete_from()?

I guess it does no harm. I've added some additional ones in the attached.

> 8. Don't we need dlist_container(), dlist_head_element(),
> dlist_tail_element() for dclist_*? Even though, we might not use them
> immediately, just for the sake for completeness of dclist data
> structure.

OK, I think I'd left those because dclist_container() would just be
the same as dlist_container(), but that's not the case for the other
two, so I've added all 3.

One additional change is that I also ended up removing the use of
dclist that I had in the previous patch for ReorderBufferTXN.subtxns.
Looking more closely at the code in ReorderBufferAssignChild():

/*
* We already saw this transaction, but initially added it to the
* list of top-level txns. Now that we know it's not top-level,
* remove it from there.
*/
dlist_delete(&subtxn->node);

The problem is that since ReorderBufferTXN is used for both
transactions and sub-transactions that it's not easy to determine if
the ReorderBufferTXN.node is part of the ReorderBuffer.toplevel_by_lsn
dlist or the ReorderBufferTXN.subtxns. It seems safer just to leave
this one alone.

David

Attachment Content-Type Size
v2_add_doubly_linked_count_lists.patch application/x-patch 30.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2022-10-28 05:53:23 Re: [PROPOSAL] : Use of ORDER BY clause in insert.sql
Previous Message Tom Lane 2022-10-28 05:13:27 Re: [PROPOSAL] : Use of ORDER BY clause in insert.sql