Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
Date: 2020-07-29 21:34:07
Message-ID: AE62A5AA-500F-405B-BBE6-FBA4A99CB17A@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 29 Jul 2020, at 10:32, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jul 01, 2020 at 06:24:18PM +0900, Michael Paquier wrote:
>> I am not sure either, still it looks worth thinking about it.
>> Attached is a rebased version of the last patch. What this currently
>> holds is just the switch to heap_multi_insert() for the three catalogs
>> pg_attribute, pg_depend and pg_shdepend. One point that looks worth
>> debating about is to how much to cap the data inserted at once. This
>> uses 64kB for all three, with a number of slots chosen based on the
>> size of each record, similarly to what we do for COPY.
>
> I got an extra round of review done for this patch.

Thanks!

> While on it, I have done some measurements to see the difference in
> WAL produced and get an idea of the gain.

> On HEAD, with a table that has 1300 attributes, this leads to 563kB of
> WAL produced. With the patch, we get down to 505kB. That's an
> extreme case of course, but that's nice a nice gain.
>
> A second test, after creating a database from a template that has
> roughly 10k entries in pg_shdepend (10k empty tables actually), showed
> a reduction from 2158kB to 1762kB in WAL.

Extreme cases for sure, but more importantly, there should be no cases when
this would cause an increase wrt the status quo.

> Finally comes the third catalog, pg_depend, and there is one thing
> that makes me itching about this part. We do a lot of useless work
> for the allocation and destruction of the slots when there are pinned
> dependencies, and there can be a lot of them. Just by running the
> main regression tests, it is easy to see that in 0003 we still do a
> lot of calls of recordMultipleDependencies() for one single
> dependency, and that most of these are actually pinned. So we finish
> by doing a lot of slot manipulation to insert nothing at the end,
> contrary to the counterparts with pg_shdepend and pg_attribute.

Maybe it'd be worth pre-computing by a first pass which tracks pinned objects
in a bitmap; with a second pass which then knows how many and which to insert
into slots?

> In
> short, I think that for now it would be fine to commit a patch that
> does the multi-INSERT optimization for pg_attribute and pg_shdepend,
> but that we need more careful work for pg_depend.

Fair enough, let's break out pg_depend and I'll have another go at that.

cheers ./daniel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-07-29 22:29:52 Re: recovering from "found xmin ... from before relfrozenxid ..."
Previous Message Daniel Gustafsson 2020-07-29 20:27:31 Re: PATCH: Add uri percent-encoding for binary data