Re: pgbench -f and vacuum

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tomáš Vondra <tv(at)fuzzy(dot)cz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench -f and vacuum
Date: 2015-02-11 19:00:46
Message-ID: CAMkU=1ySmOoUsmmLziE6nQPEYo6uKgvhYxwwwhBN+VBuS+Hc4g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 23, 2014 at 7:42 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Robert Haas wrote:
> > On Mon, Dec 22, 2014 at 6:55 PM, Alvaro Herrera
> > <alvherre(at)2ndquadrant(dot)com> wrote:
> > > Here's a completely different idea. How about we add an option that
> > > means "vacuum this table before running the test" (can be given several
> > > times); by default the set of vacuumed tables is the current pgbench_*
> > > list, but if -f is specified then the default set is cleared. So if you
> > > have a -f script and want to vacuum the default tables, you're forced to
> > > give a few --vacuum-table=foo options. But this gives the option to
> > > vacuum some other table before the test, not just the pgbench default
> > > ones.
> >
> > Well, really, you might want arbitrary initialization steps, not just
> > vacuums. We could have --init-script=FILENAME.
>
> "Init" (pgbench -i) is the step that creates the tables and populates
> them, so I think this would need a different name, maybe "startup," but
> otherwise yeah.
>
> > Although that might be taking this thread rather far off-topic.
>
> Not really sure about that, because the only outstanding objection to
> this discussion is what happens in the startup stage if you specify -f.
> Right now vacuum is attempted on the standard tables, which is probably
> not the right thing in the vast majority of cases. But if we turn that
> off, how do we reinstate it for the rare cases that want it? Personally
> I would just leave it turned off and be done with it, but if we want to
> provide some way to re-enable it, this --startup-script=FILE gadget
> sounds like a pretty decent idea.
>

There are two (or more?) possible meanings of a startup script. One would
be run a single time at the start of a pgbench run, and one would be run at
the start of each connection, in the case of -C or -c. Vacuums would
presumably go in the first category, while something like tweaking a
work_mem or enable_* setting would use the second. I'd find more use for
the second way.
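
To make the difference between the two meanings concrete, here is a rough sketch of how often each kind of startup script would execute. It is only an illustration: the function name and parameters are made up for this example, and it assumes a run with a fixed number of transactions per client (-t) rather than a time limit.

```python
def startup_script_runs(clients, transactions_per_client, reconnect_per_transaction):
    """Return (per_run_executions, per_connection_executions).

    Hypothetical helper, not part of pgbench. It only counts how many
    times each flavor of startup script would be executed.
    """
    # Meaning 1: executed a single time at the start of the pgbench run.
    per_run = 1

    # Meaning 2: executed once for every connection that gets opened.
    if reconnect_per_transaction:
        # With -C every transaction opens a fresh connection.
        per_connection = clients * transactions_per_client
    else:
        # Without -C each client (-c) keeps one connection for the whole run.
        per_connection = clients

    return per_run, per_connection
```

So a setting like work_mem would be applied once per client normally, but once per transaction under -C, whereas a vacuum would be issued once either way.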

I had a patch to do this on a per connection basis a while ago, but it took
the command as a string to --startup. Robert suggested it be a filename
rather than a string, and I agreed but never followed up with a different
patch, as I couldn't figure out how to refactor the code that parses -f
files so that it could be used for this without undue duplication of the
code.

See <http://www.postgresql.org/message-id/CAMkU=1xV3tYKoHD8U2mQzfC5Kbn_bdcVf8br-EnUvy-6Z=B47w@mail.gmail.com>

I was wondering if we couldn't invent three new backslash commands.

One would precede an SQL command to be run during -i and ignored at any
other time (and during -i, any unbackslashed commands would be ignored).

One would precede an SQL command to be run upon starting up a pgbench run.

One would precede an SQL command to be run upon starting up a benchmarking
connection.

That way you could have a single file that would record its own
initialization requirements.

One problem is I don't know how you would merge together multiple -f
arguments. Another is I would want to be able to override the
per-connection command without having to use sed or something to
edit-in-place the SQL file.
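
As a rough sketch of the three-command idea, the script could be sorted into phases while it is read. The command names \init, \startup and \connect are invented here purely for illustration; nothing like them exists in pgbench, and this is Python rather than the C that pgbench's script parser is written in.

```python
# Hypothetical backslash commands mapped to the phase they would belong to.
PHASE_COMMANDS = {
    r"\init": "init",        # run only during pgbench -i
    r"\startup": "startup",  # run once when a benchmark run starts
    r"\connect": "connect",  # run once per benchmarking connection
}


def split_script(text):
    """Sort the lines of one script file into phases.

    Lines that do not start with one of the hypothetical commands above
    (including existing backslash commands such as \\set) are treated as
    part of the normal transaction script.
    """
    phases = {"init": [], "startup": [], "connect": [], "transaction": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("--"):
            continue  # skip blank lines and SQL comments
        parts = line.split(None, 1)
        phase = PHASE_COMMANDS.get(parts[0])
        if phase is None:
            phases["transaction"].append(line)
        else:
            phases[phase].append(parts[1] if len(parts) > 1 else "")
    return phases


def commands_for_mode(phases, initializing):
    """During -i only the init commands apply; otherwise everything else."""
    if initializing:
        return {"init": phases["init"]}
    return {name: cmds for name, cmds in phases.items() if name != "init"}
```

This handles a single file only. It does not answer how the init or per-connection commands from several -f files should be merged, nor how to override the per-connection command from the command line.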

But as far as what has been discussed on the central topic of this thread,
I think that doing the vacuum and making the failure for non-existent
tables be non-fatal when -f is provided would be an improvement. Or maybe
just making it non-fatal at all times--if the table is needed and not
present, the session will fail quite soon anyway. I don't see the other
changes as being improvements. I would rather just learn to add the -n
when I use -f and don't have the default tables in place, than have to
learn new methods for saying "no really, I left -n off on purpose" when I
have a custom file which does use the default tables and I want them
vacuumed.
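
The behavior I have in mind could be sketched like this. It is not pgbench's actual code: `execute` stands for any hypothetical callable that runs one SQL statement and raises `StatementFailed` on error, and the table list is whatever the caller passes in.

```python
class StatementFailed(Exception):
    """Hypothetical error raised by execute() when a statement fails."""


def vacuum_before_run(execute, tables, no_vacuum=False, custom_script=False):
    """Vacuum the given tables before a run and return the ones skipped.

    no_vacuum     -- corresponds to -n: do no vacuuming at all
    custom_script -- corresponds to -f being given: a failed vacuum is
                     then reported back instead of aborting the run
    """
    if no_vacuum:
        return []

    skipped = []
    for table in tables:
        try:
            execute(f"vacuum {table}")
        except StatementFailed:
            if not custom_script:
                raise  # without -f, a failed vacuum stays fatal
            skipped.append(table)  # with -f, note it and carry on
    return skipped
```

Making the failure non-fatal at all times would amount to dropping the `custom_script` check and always appending to `skipped`.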

Cheers,

Jeff
