Re: pg_dump multi VALUES INSERT

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Surafel Temesgen <surafel3000(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump multi VALUES INSERT
Date: 2019-01-23 15:44:58
Message-ID: alpine.DEB.2.21.1901231633430.16643@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello David,

> I thought about this and looked into it, but I decided it didn't look
> like a smart thing to do. The reason is that if --inserts sets
> dump_inserts to 1 then --rows-per-insert sets it to something else
> that's likely fine, but if that happens in the opposite order then the
> --rows-per-insert gets overwritten with 1.

You can test before doing that!

case X:
if (opt.dump_inserts == 0)
opt.dump_inserts = 1;
// otherwise option is already set

> The bad news is the order that happens is defined by the order of the
> command line args.

> It might be possible to make it work by having --inserts set some other
> variable,

ISTM that it is enough to test whether the variable is zero.

> then set dump_inserts to 1 if it's set to 0 and the other variable is
> set to >= 1... but that requires another variable, which is what you
> want to avoid...

I still do not understand the need for another variable.

int ninserts = 0; // default is to use copy
while (getopt...)
{
switch (...) {
case "--inserts":
if (ninserts == 0) ninserts = 1;
break;
case "--rows-per-insert":
ninserts = arg_value;
checks...
break;
...

> I think it's best to have a variable per argument.

I disagree, because it adds complexity where none is needed: here the new
option is an extension of a previous one, thus the previous one just
becomes a particular case, so it seems simpler to manage it as the
particular case it is rather than a special case, creating the need for
checking the consistency and so if two variables are used.

> I could get rid of the got_rows_per_insert variable, but it would
> require setting the default value for rows_per_insert in the main()
> function rather than in InitDumpOptions(). I thought
> InitDumpOptions() looked like just the place to do this, so went with
> that option. To make it work without got_rows_per_insert,
> rows_per_insert would have to be 0 by default and we'd know we saw a
> --rows-per-insert command line arg by the fact that rows_per_insert
> was non-zero. Would you rather have it that way?

Yep, esp as rows_per_insert & dump_inserts could be the same.

>> The feature is not tested anywhere. I still think that there should be a
>> test on empty/small/larger-than-rows-per-insert tables, possibly added to
>> existing TAP-tests.
>
> I was hoping to get away with not having to do that... mainly because
> I've no idea how.

Hmmm. That is another question! Maybe someone will help.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2019-01-23 15:48:28 Re: WIP: Avoid creation of the free space map for small tables
Previous Message Dmitry Dolgov 2019-01-23 15:12:15 Re: ArchiveEntry optional arguments refactoring