Re: Inefficiency in parallel pg_restore with many tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Inefficiency in parallel pg_restore with many tables
Date: 2023-09-10 16:35:10
Message-ID: 2393313.1694363710@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
> I spent some more time on this patch and made the relevant adjustments to
> the rest of the set.

Hmm ... I do not like v7 very much at all. It requires rather ugly
changes to all of the existing callers, and what are we actually
buying? If anything, it makes things slower for pass-by-value items
like integers. I'd stick with the Datum convention in the backend.

Instead, I took a closer look through the v6 patch set.
I think that's in pretty good shape and nearly committable,
but I have a few thoughts:

* I'm not sure about defining bh_node_type as a macro:

+#ifdef FRONTEND
+#define bh_node_type void *
+#else
+#define bh_node_type Datum
+#endif

rather than an actual typedef:

+#ifdef FRONTEND
+typedef void *bh_node_type;
+#else
+typedef Datum bh_node_type;
+#endif

My concern here is that bh_node_type is effectively acting as a
typedef, so that pgindent might misbehave if it's not declared as a
typedef. On the other hand, there doesn't seem to be any indentation
problem in the patchset as it stands, and we don't expect any code
outside binaryheap.h/.c to refer to bh_node_type, so maybe it's fine.
(If you do choose to make it a typedef, remember to add it to
typedefs.list.)

* As a matter of style, I'd recommend adding braces in places
like this:

if (heap->bh_size >= heap->bh_space)
+ {
+#ifdef FRONTEND
+ pg_fatal("out of binary heap slots");
+#else
elog(ERROR, "out of binary heap slots");
+#endif
+ }
heap->bh_nodes[heap->bh_size] = d;

It's not wrong as you have it, but I think it's more readable
and less easy to accidentally break with the extra braces.

* In 0002, isn't the comment for binaryheap_remove_node wrong?

+ * Removes the nth node from the heap. The caller must ensure that there are
+ * at least (n - 1) nodes in the heap. O(log n) worst case.

Shouldn't that be "(n + 1)"? Also, I'd specify "n'th (zero based) node"
for clarity.

* I would say that this bit in 0004:

- j = removeHeapElement(pendingHeap, heapLength--);
+ j = (intptr_t) binaryheap_remove_first(pendingHeap);

needs an explicit cast to int:

+ j = (int) (intptr_t) binaryheap_remove_first(pendingHeap);

otherwise some compilers might complain about the result possibly
not fitting in "j".

Other than those nitpicks, I like v6. I'll mark this RfC.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2023-09-10 20:59:01 Re: proposal: psql: show current user in prompt
Previous Message Alexander Lakhin 2023-09-10 10:00:00 Re: Cleaning up array_in()