Re: pgbench and timestamps (bounced)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jaime Soler <jaime(dot)soler(at)gmail(dot)com>
Subject: Re: pgbench and timestamps (bounced)
Date: 2020-09-08 00:31:11
Message-ID: 1720048.1599525071@sss.pgh.pa.us
Lists: pgsql-hackers

Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> [Resent on hackers for CF registration, sorry for the noise]

For the record, the original thread is at

https://www.postgresql.org/message-id/flat/CAKVUGgQaZVAUi1Ex41H4wrru%3DFU%2BMfwgjG0aM1br6st7sz31Vw%40mail.gmail.com

(I tried but failed to attach that thread to the CF entry, so we'll
have to settle for leaving a breadcrumb in this thread.)

> It requires a mutex around the commands, I tried to do some windows
> implementation which may or may not work.

Ugh, I'd really rather not do that. Even disregarding the effects
of a mutex, though, my initial idea for fixing this has a big problem:
if we postpone PREPARE of the query until first execution, then it's
happening during timed execution of the benchmark scenario and thus
distorting the timing figures. (Maybe if we'd always done it like
that, it'd be okay, but I'm quite against changing the behavior now
that it's stood for a long time.)

However, perhaps there's more than one way to fix this. Once we've
scanned all of the script and seen all the \set commands, we know
(in principle) the set of all variable names that are in use.
So maybe we could fix this by

(1) During the initial scan of the script, make variable-table
entries for every \set argument, with the values shown as undefined
for the moment. Do not try to parse SQL commands in this scan,
just collect them.

(2) Make another scan in which we identify variable references
in the SQL commands and issue PREPAREs (if enabled).

(3) Perform the timed run.

This avoids any impact of this bug fix on the semantics or timing
of the benchmark proper. I'm not sure offhand whether this
approach makes any difference for the concerns you had about
identifying/suppressing variable references inside quotes.
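
As a rough illustration of steps (1) and (2), here is a minimal Python sketch. It is not pgbench's code (pgbench is written in C), and every name in it (`scan_script`, `prepare_commands`, `UNDEFINED`, the two regular expressions) is hypothetical. It assumes variable names consist of ASCII letters, digits, and underscores, and it makes no attempt to recognize quoting. It relies only on the fact that a `:name` reference is rewritten only if `name` was seen in a \set during the first scan.

```python
import re

# Hypothetical sketch of the two-scan idea; not pgbench's actual code.
# Assumption: variable names are ASCII letters, digits, and underscores.
SET_RE = re.compile(r"^\\set\s+([A-Za-z0-9_]+)")
# A single ':' followed by a name; the lookbehind skips '::' type casts.
VAR_RE = re.compile(r"(?<!:):([A-Za-z0-9_]+)")

UNDEFINED = object()  # placeholder value for "declared but not yet set"


def scan_script(lines):
    """Scan 1: record every \\set target as undefined; collect SQL unparsed."""
    variables = {}
    sql_commands = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        match = SET_RE.match(stripped)
        if match:
            variables.setdefault(match.group(1), UNDEFINED)
        elif not stripped.startswith("\\"):
            # Anything that is not a backslash command is kept as raw SQL.
            sql_commands.append(stripped)
    return variables, sql_commands


def prepare_commands(variables, sql_commands):
    """Scan 2: turn references to known variables into $n placeholders."""
    prepared = []
    for sql in sql_commands:
        params = []

        def substitute(match):
            name = match.group(1)
            if name not in variables:
                return match.group(0)  # unknown name: leave the text alone
            if name not in params:
                params.append(name)
            return "$%d" % (params.index(name) + 1)

        prepared.append((VAR_RE.sub(substitute, sql), params))
    return prepared


script = [
    "\\set aid random(1, 100000)",
    "SELECT abalance, '12:30:00'::time FROM pgbench_accounts WHERE aid = :aid;",
]
variables, sql_commands = scan_script(script)
for text, params in prepare_commands(variables, sql_commands):
    print(text, params)
```

In this sketch the `:30` and `:00` inside the time literal are left untouched only because no \set declared variables with those names. A script that did declare such a name would still see the literal rewritten, so the quoting question remains open here as well. Step (3), the timed run, would start only after both scans have finished.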

regards, tom lane
