Re: Improvements in pg_dump/pg_restore toc format and performances

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Improvements in pg_dump/pg_restore toc format and performances
Date: 2023-10-03 09:47:57
Message-ID: CALDaNm0eaZ8Ao28QHdFTm+wRK3V5oGuti20u89+cmmkX_g1pBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 19 Sept 2023 at 15:46, Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info> wrote:
>
> On Monday, September 18, 2023 11:54:42 PM CEST Nathan Bossart wrote:
> > On Mon, Sep 18, 2023 at 02:52:47PM -0700, Nathan Bossart wrote:
> > > On Thu, Jul 27, 2023 at 10:51:11AM +0200, Pierre Ducroquet wrote:
> > >> I ended up writing several patches that shaved some time for pg_restore
> > >> -l,
> > >> and reduced the toc.dat size.
> > >
> > > I've only just started taking a look at these patches, and I intend to do
> > > a
> > > more thorough review in the hopefully-not-too-distant future.
> >
> > Since cfbot is failing on some pg_upgrade and pg_dump tests, I've set this
> > to waiting-on-author.
>
> Attached updated patches fix this regression, I'm sorry I missed that.

Few comments:
1) These printf statements are not required:
+ /* write the list of am */
+ printf("%d tableams to save\n", tableam_count);
+ WriteInt(AH, tableam_count);
+ for (i = 0 ; i < tableam_count ; i++)
+ {
+ printf("%d is %s\n", i, tableams[i]);
+ WriteStr(AH, tableams[i]);
+ }
+
+ /* write the list of namespaces */
+ printf("%d namespaces to save\n", namespace_count);
+ WriteInt(AH, namespace_count);
+ for (i = 0 ; i < namespace_count ; i++)
+ {
+ printf("%d is %s\n", i, namespaces[i]);
+ WriteStr(AH, namespaces[i]);
+ }
+
+ /* write the list of owners */
+ printf("%d owners to save\n", owner_count);

2) We generally use pg_malloc in client tools, we should change palloc
to pg_malloc:
+ /* prepare dynamic arrays */
+ tableams = palloc(sizeof(char*) * 1);
+ tableams[0] = NULL;
+ tableam_count = 0;
+ namespaces = palloc(sizeof(char*) * 1);
+ namespaces[0] = NULL;
+ namespace_count = 0;
+ owners = palloc(sizeof(char*) * 1);
+ owners[0] = NULL;
+ owner_count = 0;
+ tablespaces = palloc(sizeof(char*) * 1);

3) This similar code is repeated few times, will it be possible to do
it in a common function:
+ if (namespace_count < 128)
+ {
+ te->namespace_idx = AH->ReadBytePtr(AH);
+ invalid_entry = 255;
+ }
+ else
+ {
+ te->namespace_idx = ReadInt(AH);
+ invalid_entry = -1;
+ }
+ if (te->namespace_idx == invalid_entry)
+ te->namespace = "";
+ else
+ te->namespace = namespaces[te->namespace_idx];

4) Can the initialization of tableam_count, namespace_count,
owner_count and tablespace_count be done at declaration and these
initialization code can be removed:
+ /* prepare dynamic arrays */
+ tableams = palloc(sizeof(char*) * 1);
+ tableams[0] = NULL;
+ tableam_count = 0;
+ namespaces = palloc(sizeof(char*) * 1);
+ namespaces[0] = NULL;
+ namespace_count = 0;
+ owners = palloc(sizeof(char*) * 1);
+ owners[0] = NULL;
+ owner_count = 0;
+ tablespaces = palloc(sizeof(char*) * 1);
+ tablespaces[0] = NULL;
+ tablespace_count = 0;

4) There are some whitespace issues in the patch:
Applying: move static strings to arrays at beginning
.git/rebase-apply/patch:24: trailing whitespace.
.git/rebase-apply/patch:128: trailing whitespace.
.git/rebase-apply/patch:226: trailing whitespace.
.git/rebase-apply/patch:232: trailing whitespace.
warning: 4 lines add whitespace errors.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-10-03 10:09:35 Re: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Peter Smith 2023-10-03 09:40:52 Re: [PGDOCS] Inconsistent linkends to "monitoring" views.