Re: Switch to multi-inserts for pg_depend

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Switch to multi-inserts for pg_depend
Date: 2020-09-01 09:53:36
Message-ID: 1513936C-3E08-4423-8A04-B1EBF9BD576F@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 14 Aug 2020, at 20:23, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> The logic to keep track number of used slots used is baroque, though -- that
> could use a lot of simplification.

What if slot_init was an integer which increments together with the loop
variable until max_slots is reached? If so, maybe it should be renamed
slot_init_count and slotCount renamed slot_stored_count to make the their use
clearer?

> On 31 Aug 2020, at 09:56, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> It has been a couple of weeks, and I am not really sure what is the
> suggestion here. So I would like to move on with this patch set as
> the changes are straight-forward using the existing routines for
> object addresses by grouping all insert dependencies of the same type.
> Are there any objections?

I'm obviously biased but I think this patchset is a net win. There are more
things we can do in this space, but it's a good start.

> Attached is a rebased set, where I have added in indexing.h a unique
> definition for the hard limit of 64kB for the amount of data that can
> be inserted at once, based on the suggestion from Alvaro and Andres.

+#define MAX_CATALOG_INSERT_BYTES 65535
This name, and inclusion in a headerfile, implies that the definition is
somewhat generic as opposed to its actual use. Using MULTIINSERT rather than
INSERT in the name would clarify I reckon.

A few other comments:

+ /*
+ * Allocate the slots to use, but delay initialization until we know that
+ * they will be used.
+ */
I think this comment warrants a longer explanation on why they wont all be
used, or perhaps none of them (which is the real optimization win here).

+ ObjectAddresses *addrs_auto;
+ ObjectAddresses *addrs_normal;
It's not for this patch, but it seems a logical next step would be to be able
to record the DependencyType as well when collecting deps rather than having to
create multiple buckets.

+ /* Normal dependency from a domain to its collation. */
+ /* We know the default collation is pinned, so don't bother recording it */
It's just moved and not introduced in this patch, but shouldn't these two lines
be joined into a normal block comment?

cheers ./daniel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-09-01 09:57:12 Re: New statistics for tuning WAL buffer size
Previous Message Laurenz Albe 2020-09-01 09:37:38 Re: 回复: Is it possible to set end-of-data marker for COPY statement.