Quick Links

Re: Improve eviction algorithm in ReorderBuffer

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Peter Smith <smithpb2250(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Improve eviction algorithm in ReorderBuffer
Date:	2024-03-07 03:16:08
Message-ID:	CAD21AoA0uSave-4A=kniPPDZK_2TmL0WDOvfC-KzNM4D2Yv=ew@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Mar 5, 2024 at 3:28 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi, here are some review comments for v7-0002
>
> ======
> Commit Message
>
> 1.
> This commit adds a hash table to binaryheap in order to track of
> positions of each nodes in the binaryheap. That way, by using newly
> added functions such as binaryheap_update_up() etc., both updating a
> key and removing a node can be done in O(1) on an average and O(log n)
> in worst case. This is known as the indexed binary heap. The caller
> can specify to use the indexed binaryheap by passing indexed = true.
>
> ~
>
> /to track of positions of each nodes/to track the position of each node/
>
> ~~~
>
> 2.
> There is no user of it but it will be used by a upcoming patch.
>
> ~
>
> The current code does not use the new indexing logic, but it will be
> used by an upcoming patch.

Fixed.

>
> ======
> src/common/binaryheap.c
>
> 3.
> +/*
> + * Define parameters for hash table code generation. The interface is *also*"
> + * declared in binaryheaph.h (to generate the types, which are externally
> + * visible).
> + */
>
> Typo: *also*"

Fixed.

>
> ~~~
>
> 4.
> +#define SH_PREFIX bh_nodeidx
> +#define SH_ELEMENT_TYPE bh_nodeidx_entry
> +#define SH_KEY_TYPE bh_node_type
> +#define SH_KEY key
> +#define SH_HASH_KEY(tb, key) \
> + hash_bytes((const unsigned char *) &key, sizeof(bh_node_type))
> +#define SH_EQUAL(tb, a, b) (memcmp(&a, &b, sizeof(bh_node_type)) == 0)
> +#define SH_SCOPE extern
> +#ifdef FRONTEND
> +#define SH_RAW_ALLOCATOR pg_malloc0
> +#endif
> +#define SH_DEFINE
> +#include "lib/simplehash.h"
>
> 4a.
> The comment in simplehash.h says
> * The following parameters are only relevant when SH_DEFINE is defined:
> * - SH_KEY - ...
> * - SH_EQUAL(table, a, b) - ...
> * - SH_HASH_KEY(table, key) - ...
> * - SH_STORE_HASH - ...
> * - SH_GET_HASH(tb, a) - ...
>
> So maybe it is nicer to reorder the #defines in that same order?
>
> SUGGESTION:
> +#define SH_PREFIX bh_nodeidx
> +#define SH_ELEMENT_TYPE bh_nodeidx_entry
> +#define SH_KEY_TYPE bh_node_type
> +#define SH_SCOPE extern
> +#ifdef FRONTEND
> +#define SH_RAW_ALLOCATOR pg_malloc0
> +#endif
>
> +#define SH_DEFINE
> +#define SH_KEY key
> +#define SH_EQUAL(tb, a, b) (memcmp(&a, &b, sizeof(bh_node_type)) == 0)
> +#define SH_HASH_KEY(tb, key) \
> + hash_bytes((const unsigned char *) &key, sizeof(bh_node_type))
> +#include "lib/simplehash.h"

I'm really not sure it helps increase readability. For instance, for
me it's readable if SH_DEFINE and SH_DECLARE come to the last before
#include since it's more obvious whether we want to declare, define or
both. Other simplehash.h users also do so.

>
> ~~
>
> 4b.
> The comment in simplehash.h says that "it's preferable, if possible,
> to store the element's hash in the element's data type", so should
> SH_STORE_HASH and SH_GET_HASH also be defined here?

Good catch. I've used these macros.

>
> ~~~
>
> 5.
> + *
> + * If 'indexed' is true, we create a hash table to track of each node's
> + * index in the heap, enabling to perform some operations such as removing
> + * the node from the heap.
> */
> binaryheap *
> -binaryheap_allocate(int capacity, binaryheap_comparator compare, void *arg)
> +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> + bool indexed, void *arg)
>
> BEFORE
> ... enabling to perform some operations such as removing the node from the heap.
>
> SUGGESTION
> ... to help make operations such as removing nodes more efficient.
>

But these operations literally require the indexed binary heap as we
have an assertion:

void
binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
{
bh_nodeidx_entry *ent;

Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
Assert(heap->bh_indexed);

> ~~~
>
> 6.
> + heap->bh_indexed = indexed;
> + if (heap->bh_indexed)
> + {
> +#ifdef FRONTEND
> + heap->bh_nodeidx = bh_nodeidx_create(capacity, NULL);
> +#else
> + heap->bh_nodeidx = bh_nodeidx_create(CurrentMemoryContext, capacity,
> + NULL);
> +#endif
> + }
> +
>
> The heap allocation just uses palloc instead of palloc0 so it might be
> better to assign "heap->bh_nodeidx = NULL;" up-front, just so you will
> never get a situation where bh_indexed is false but bh_nodeidx has
> some (garbage) value.

Fixed.

>
> ~~~
>
> 7.
> +/*
> + * Set the given node at the 'index', and updates its position accordingly.
> + *
> + * Return true if the node's index is already tracked.
> + */
> +static bool
> +bh_set_node(binaryheap *heap, bh_node_type node, int index)
>
> 7a.
> I felt the 1st sentence should be more like:
>
> SUGGESTION
> Set the given node at the 'index' and track it if required.

Fixed.

>
> ~
>
> 7b.
> IMO the parameters would be better the other way around (e.g. 'index'
> before the 'node') because that's what the assignments look like:
>
>
> heap->bh_nodes[heap->bh_size] = d;
>
> becomes:
> bh_set_node(heap, heap->bh_size, d);
>

I think it assumes heap->bh_nodes is an array. But if we change it in
the future, it will no longer make sense. I think it would make more
sense if we define the parameters in an order like "we set the 'node'
at 'index'". What do you think?

> ~~~
>
> 8.
> +static bool
> +bh_set_node(binaryheap *heap, bh_node_type node, int index)
> +{
> + bh_nodeidx_entry *ent;
> + bool found = false;
> +
> + /* Set the node to the nodes array */
> + heap->bh_nodes[index] = node;
> +
> + if (heap->bh_indexed)
> + {
> + /* Remember its index in the nodes array */
> + ent = bh_nodeidx_insert(heap->bh_nodeidx, node, &found);
> + ent->idx = index;
> + }
> +
> + return found;
> +}
>
> 8a.
> That 'ent' declaration can be moved to the inner block scope, so it is
> closer to where it is needed.
>
> ~
>
> 8b.
> + /* Remember its index in the nodes array */
>
> The comment is worded a bit ambiguously. IMO a simpler comment would
> be: "/* Keep track of the node index. */"
>
> ~~~

Fixed.

>
> 9.
> +static void
> +bh_delete_nodeidx(binaryheap *heap, bh_node_type node)
> +{
> + if (!heap->bh_indexed)
> + return;
> +
> + (void) bh_nodeidx_delete(heap->bh_nodeidx, node);
> +}
>
> Since there is only 1 statement IMO it is simpler to write this
> function like below:
>
> if (heap->bh_indexed)
> (void) bh_nodeidx_delete(heap->bh_nodeidx, node);

Fixed.

>
> ~~~
>
> 10.
> +/*
> + * Replace the node at 'idx' with the given node 'replaced_by'. Also
> + * update their positions accordingly.
> + */
> +static void
> +bh_replace_node(binaryheap *heap, int idx, bh_node_type replaced_by)
>
> 10a.
> Would 'node' or 'new_node' or 'replacement' be a better name than 'replaced_by'?

Fixed.

>
> ~
>
> 10b.
> I noticed that the index param is called 'idx' here but in other
> functions, it is called 'index'. I think either is good (I prefer
> 'idx') but at least everywhere should use the same name for
> consistency.

Fixed.

>
> ~~~
>
> 11.
> +static void
> +bh_replace_node(binaryheap *heap, int idx, bh_node_type replaced_by)
> +{
> + /* Remove overwritten node's index */
> + bh_delete_nodeidx(heap, heap->bh_nodes[idx]);
> +
> + /* Replace it with the given new node */
> + if (idx < heap->bh_size)
> + {
> + bool found PG_USED_FOR_ASSERTS_ONLY;
> +
> + found = bh_set_node(heap, replaced_by, idx);
> +
> + /* The overwritten node's index must already be tracked */
> + Assert(!heap->bh_indexed || found);
> + }
> +}
>
> I did not understand the condition.
> e.g. Can you explain when is idx NOT less than heap->bh_size?
> e.g. If this condition failed then nothing gets replaced (??)

It was for a case like where we call binaryheap_remote_node(heap, 0)
where the heap has only one entry, resulting in setting the root node
again. I updated the bh_replace_node() to return if the node doesn't
not need to be moved.

>
> ~~~
>
> ======
> src/include/lib/binaryheap.h
>
> 12.
> +/*
> + * Struct for A hash table element to store the node's index in the bh_nodes
> + * array.
> + */
> +typedef struct bh_nodeidx_entry
>
> /for A hash table/for a hash table/
>
> ~~~
>
> 13.
> +/* define parameters necessary to generate the hash table interface */
>
> Suggest uppercase "Define" and add a period.

Fixed.

>
> ~~~
>
> 14.
> +
> + /*
> + * If bh_indexed is true, the bh_nodeidx is used to track of each node's
> + * index in bh_nodes. This enables the caller to perform
> + * binaryheap_remove_node_ptr(), binaryheap_update_up/down in O(log n).
> + */
> + bool bh_indexed;
> + bh_nodeidx_hash *bh_nodeidx;
> } binaryheap;
>
> I'm wondering why the separate 'bh_indexed' is necessary at all. Can't
> you just use the bh_nodeidx value? E.g. If bh_nodeidx == NULL then it
> means there is no index tracking, otherwise there is.
>

Good point. I added a macro binaryheap_indexed() to check it for
better readability.

The above comments are incorporated into the latest v8 patch set that
I've just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBYjJmz7q_%3DZ%2BeXJgm0FScyu3_iGFshPAvnq78B2KL3qQ%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Re: Improve eviction algorithm in ReorderBuffer at 2024-03-05 06:28:00 from Peter Smith

Responses

Re: Improve eviction algorithm in ReorderBuffer at 2024-03-08 03:58:07 from Peter Smith

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Masahiko Sawada	2024-03-07 03:19:11	Re: Synchronizing slots from primary to standby
Previous Message	Masahiko Sawada	2024-03-07 03:14:26	Re: Improve eviction algorithm in ReorderBuffer