Re: pgbench logging broken by time logic changes

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, david(dot)christensen(at)crunchydata(dot)com
Subject: Re: pgbench logging broken by time logic changes
Date: 2021-06-17 20:30:31
Message-ID: alpine.DEB.2.22.394.2106172044120.2941201@pseudo
Lists: pgsql-hackers


Hello Greg,

> I think the only thing you and I disagree on is that you see a "first
> issue in a corner case" where I see a process failure that is absolutely
> vital for me to improve.

Hmmm. I agree that improvements are needed, but for me there are simply a
few missing (removed) TAP tests which should/could have caught these
issues, which are AFAICS limited to the untested area.

Given the speed of the process and the energy and patience needed to move
things forward, reverting means that the patch is probably dead for at
least a year, possibly an eon, and that is too bad because IMHO it was an
improvement (my eyes are watering when I see INSTR_TIME macros), so I'd
prefer a fix rather than a revert if it is possible, which in this case I
think it could be.

> Since the reality is that I might be the best positioned person

Good for you:-)

> to actually move said process forward in a meaningful long-term way, I
> have every intention of applying pressure to the area you're frustrated
> at. Crunchy has a whole parallel review team to the community one now
> focused on what our corporate and government customers need for software
> process control and procedure compliance. The primary business problem
> I'm working on now is how to include performance review in that mix.

This idea has been around for some time now. It is quite a task, and a
working and possibly extended pgbench is just one part of the overall
software, infrastructure and procedure needed to have that.

> I already know I need to re-engage with you over how I need min/max numbers
> in the aggregate logging output to accomplish some valuable goals.

I do try to review every patch submitted about pgbench. Feel free to fire!

> When I get around to that this summer, I'd really enjoy talking with you
> a bit, video call or something, about really any community topic you're
> frustrated with.

"Frustrated" may be too strong a word. I'm somewhat annoyed, and unlikely to
ever submit many test improvements in the future.

>> There is no problem with proposing tests, the problem is getting them
>> accepted, or, if they are accepted, that they are not removed at the
>> first small issue but rather fixed, or their limitations accepted, because
>> testing time-sensitive features is not as simple as testing functional
>> features.
>
> For 2020 Crunchy gave me a sort of sabbatical year to research community
> oriented benchmarking topics. Having a self contained project in my home
> turned out to be the perfect way to spend *that* wreck of a year.

Yep.

> I made significant progress toward the idea of having a performance farm
> for PostgreSQL. On my laptop today is a 14GB database with 1s resolution
> latency traces for 663 days of pgbench time running 4 workloads across a
> small bare metal farm of various operating systems and hardware classes.

Wow.

> I can answer questions like "how long does a typical SSD take to execute
> an INSERT commit?" across my farm with SQL.

So, what is the answer? :-)

> It's at the "works for me!" stage of development, and I thought this was
> the right time in the development cycle to start sharing improvement
> ideas from my work; thus the other submissions in progress I alluded to.
>
> The logging feature is in an intermediate spot where validating it requires
> light custom tooling that compares its output against known variables like
> the system time.

Sure.
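
As a minimal sketch of what such light tooling could look like, the check
below compares the timestamp of a per-transaction log line against the wall
clock window of the run. It assumes the documented per-transaction log layout
(client_id transaction_no time script_no time_epoch time_us [schedule_lag]);
the function names and the one second slack are hypothetical choices, not
anything taken from pgbench or its test suite.

```python
def epoch_from_log_line(line):
    """Return the timestamp of one per-transaction log line, in seconds.

    Assumed layout:
    client_id transaction_no time script_no time_epoch time_us [schedule_lag]
    """
    fields = line.split()
    return int(fields[4]) + int(fields[5]) / 1_000_000


def timestamp_is_plausible(line, run_start, run_end, slack=1.0):
    """True if the logged timestamp falls inside the run's wall clock window.

    run_start and run_end are Unix times (e.g. from time.time()) recorded by
    the caller just before and just after the benchmark run.
    """
    ts = epoch_from_log_line(line)
    return run_start - slack <= ts <= run_end + slack
```

A timestamp that is relative to something other than the Unix epoch falls far
outside the window and fails the check.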

> It doesn't quite have a performance component to it.

Hmmm, if you log all transactions it can become the performance
bottleneck quite quickly:-)

> Since this time logic detail is a well known portability minefield, I
> thought demanding that particular test was a pretty easy sell.

The test I recalled was removed in commit ad51c6f. Ok, it would not have
caught the timestamp issue (although it could have been improved to do so),
but it would have caught the trivial one about the catch-up loop for the
aggregate interval generating too many lines because of a forgotten
conversion to µs.
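
To illustrate the kind of unit mistake meant here, a simplified sketch (not
the pgbench code; the function and parameter names are hypothetical): a
catch-up loop emits one aggregate line per elapsed interval, with timestamps
kept in µs and the interval given in seconds. Without the conversion, the
loop advances by 1 µs per step instead of 1 s.

```python
US_PER_SEC = 1_000_000


def count_catchup_lines(start_us, now_us, interval_s, convert=True):
    """Count the aggregate lines emitted while catching up to now_us.

    start_us and now_us are in microseconds, interval_s is in seconds.
    convert=False models the forgotten seconds-to-microseconds conversion.
    """
    step = interval_s * US_PER_SEC if convert else interval_s
    lines = 0
    while start_us + step <= now_us:
        lines += 1          # one (possibly empty) aggregate line per interval
        start_us += step
    return lines


elapsed_us = 2 * US_PER_SEC                                   # 2 s of run time
print(count_catchup_lines(0, elapsed_us, 1))                  # 2
print(count_catchup_lines(0, elapsed_us, 1, convert=False))   # 2000000
```

A test that only counts the lines in the aggregate log against the run
duration would already notice the difference.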

--
Fabien.
