| From: | Andres Freund <andres(at)anarazel(dot)de> | 
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> | 
| Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se> | 
| Subject: | Re: Switch to multi-inserts for pg_depend | 
| Date: | 2020-08-11 00:32:21 | 
| Message-ID: | 20200811003221.iuawregryk6z2wjd@alap3.anarazel.de | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
On 2020-08-07 15:16:19 +0900, Michael Paquier wrote:
> From cd117fa88938c89ac953a5e3c036828337150b07 Mon Sep 17 00:00:00 2001
> From: Michael Paquier <michael(at)paquier(dot)xyz>
> Date: Fri, 7 Aug 2020 10:57:40 +0900
> Subject: [PATCH 1/2] Use multi-inserts for pg_depend
> 
> This is a follow-up of the work done in e3931d01.  This case is a bit
> different than pg_attribute and pg_shdepend: the maximum number of items
> to insert is known in advance, but there is no need to handle pinned
> dependencies.  Hence, the base allocation for slots is done based on the
> number of items and the maximum allowed with a cap at 64kB, and items
> are initialized once used to minimize the overhead of the operation.
> 
> Author: Daniel Gustafsson, Michael Paquier
> Discussion: https://postgr.es/m/XXX
> ---
>  src/backend/catalog/pg_depend.c | 95 ++++++++++++++++++++++++---------
>  1 file changed, 69 insertions(+), 26 deletions(-)
> 
> diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
> index 70baf03178..596f0c5e29 100644
> --- a/src/backend/catalog/pg_depend.c
> +++ b/src/backend/catalog/pg_depend.c
> @@ -47,6 +47,12 @@ recordDependencyOn(const ObjectAddress *depender,
>  	recordMultipleDependencies(depender, referenced, 1, behavior);
>  }
>  
> +/*
> + * Cap the maximum amount of bytes allocated for recordMultipleDependencies()
> + * slots.
> + */
> +#define MAX_PGDEPEND_INSERT_BYTES	65535
> +
Do we really want to end up with several separate defines for different
type of catalog batch inserts? That doesn't seem like a good
thing. Think there should be a single define for all catalog bulk
inserts.
>  /*
>   * Record multiple dependencies (of the same kind) for a single dependent
>   * object.  This has a little less overhead than recording each separately.
> @@ -59,10 +65,10 @@ recordMultipleDependencies(const ObjectAddress *depender,
>  {
>  	Relation	dependDesc;
>  	CatalogIndexState indstate;
> -	HeapTuple	tup;
> -	int			i;
> -	bool		nulls[Natts_pg_depend];
> -	Datum		values[Natts_pg_depend];
> +	int			slotCount, i;
> +	TupleTableSlot **slot;
> +	int			nslots, max_slots;
> +	bool		slot_init = true;
>  
>  	if (nreferenced <= 0)
>  		return;					/* nothing to do */
> @@ -76,11 +82,18 @@ recordMultipleDependencies(const ObjectAddress *depender,
>  
>  	dependDesc = table_open(DependRelationId, RowExclusiveLock);
>  
> +	/*
> +	 * Allocate the slots to use, but delay initialization until we know that
> +	 * they will be used.
> +	 */
> +	max_slots = Min(nreferenced,
> +					MAX_PGDEPEND_INSERT_BYTES / sizeof(FormData_pg_depend));
> +	slot = palloc(sizeof(TupleTableSlot *) * max_slots);
> +
>  	/* Don't open indexes unless we need to make an update */
>  	indstate = NULL;
>  
> -	memset(nulls, false, sizeof(nulls));
> -
> +	slotCount = 0;
>  	for (i = 0; i < nreferenced; i++, referenced++)
>  	{
>  		/*
> @@ -88,38 +101,68 @@ recordMultipleDependencies(const ObjectAddress *depender,
>  		 * need to record dependencies on it.  This saves lots of space in
>  		 * pg_depend, so it's worth the time taken to check.
>  		 */
> -		if (!isObjectPinned(referenced, dependDesc))
> +		if (isObjectPinned(referenced, dependDesc))
> +			continue;
> +
Hm, would it be better to first iterate over the dependencies, compute
the number of dependencies to be inserted, and then go ahead and create
the right number of slots?
> From fcc0a11e9fc94d2fedc71dd10ba2a23713225963 Mon Sep 17 00:00:00 2001
> From: Michael Paquier <michael(at)paquier(dot)xyz>
> Date: Fri, 7 Aug 2020 15:14:51 +0900
> Subject: [PATCH 2/2] Switch to multi-insert dependencies for many code paths
> 
> This makes use of the new APIs to insert dependencies in groups, instead
> of doing the operation one-by-one.
Seems several places have been modified to new APIs despite only
covering a single dependency. Perhaps worth mentioning?
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2020-08-11 00:33:22 | Re: walsender waiting_for_ping spuriously set | 
| Previous Message | Andres Freund | 2020-08-11 00:21:34 | Re: Hybrid Hash/Nested Loop joins and caching results from subplans |