Re: PATCH: pgbench - merging transaction logs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: pgbench - merging transaction logs
Date: 2015-03-15 18:28:00
Message-ID: 5505CF30.7030308@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.3.2015 11:22, Fabien COELHO wrote:
>
> I've looked at the patch. Although I think that such a feature is
> somehow desirable... I have two issues with it: ISTM that
>
> (1) it does not belong to pgbench as such
>
> (2) even if, the implementation is not right
>
> About (1):
>
> I think that this functionnality as implemented does not belong to
> pgbench, and does really belong to an external script, which happen
> not to be readily available, which is a shame. PostgreSQL should
> probably provide such a script, or make it easy to find.

I disagree, for two main reasons.

Firstly, the fact that pgbench produces one file per thread is awkward.
It was implemented like this most likely because it was the simplest
solution after adding "-j" in 9.0, but I don't remember if I ever needed
a separate log from a single thread.

Secondly, a separate tool (even if provided with pgbench) would require
passing detailed info about the log format - whether it's aggregated or
not, throttling or not, ....

Those are the two main reasons why I think this should be implemented as
another option in pgbench.

> It could belong to pgbench if pgbench was *generating* the merged
> log file directly. However I'm not sure this can be done cleanly for
> the multi-process "thread emulation" (IN PASSING: which I think this
> should really get rid of because I do not know what system does not
> provide threads nowadays and would require to have a state of the art
> pgbench nevertheless, and this emulation significantly complexify the
> code by making things uselessly difficult and/or needed to be
> implemented twice or not be provided in some cases).

I considered this approach, i.e. adding another 'collector' thread
receiving results from all the other threads, but decided not to do
that. That would require a fair amount of inter-process communication,
locking etc. and might affect the measurements (especially for the
workloads with extremely short transactions, like read-only pgbench).

The implemented approach, i.e. merging results collected by each thread
independently, after the benchmarking actually completed, is a better
solution. And the code is actually simpler.

> Another issue raised by your patch is that the log format may be
> improved, say by providing a one-field timestamp at the beginning of
> the line.

I don't see how this is relevant to this patch?

> About (2):
>
> In the implementation you reimplement a partial merge sort by
> parsing each line of each file and merging it with the current result
> over and over. ISTM that an implementation should read all files in
> parallel and merge them in one pass. The current IO complexity is in
> p²n where it should be simply pn... do not use it with a significant
> number of threads and many lines... Ok, the generated files are
> likely to be kept in cache, but nevertheless.

Maybe. Implementing a 2-way merge sort was the simplest solution at the
moment, and I don't think this really matters because the I/O generated
by the benchmark itself is usually much higher than this.

The only case when this might make a difference is probably merging
large transaction logs (e.g. 100% sample from read-only test on small
dataset).

> Also there are plenty of copy paste when scanning for two files, and
> then reprinting in all the different formats. The same logic is
> implemented twice, once for simple and once for aggregated. This
> means that updating or extending the log format later on would
> require to modify these scans and prints in many places.

That is true, and I'll address that somehow (either moving the code to a
macro or a separate utility function).

> For aggregates, some data in the output may be special values
> "NaN/-/...", I am not sure how the implementation would behave in
> such cases. As lines that do not match are silently ignored, the
> result merge would just be non significant.... should it rather be an
> error? Try it with a low rate for instance.

You mean the aggregated log? I can't think of a way to get there such
values - can you provide a plausible scenario how that could happen?

>
> It seems that system function calls are not tested for errors.

That's true, but that's how the rest of pgbench code is written.

>
>> The other disadvantage of the external scripts is that you have to
>> pass all the info about the logs (whether the logs are aggregated,
>> whther there's throttling, etc.).
>
> I think that is another argument to make a better format, with the a
> timestamp ahead? Also, ISTM that it only needs to know whether it is
> merging aggregate or simple logs, no more, the other information can be
> infered by the number of fields on the line.

No, it's not. Even if you change the format like this, you still have no
idea whether the log is a per-transaction log (possibly with some
additional options), aggregated log.

There might be some auto-detection based on number of fields, for
example, but considering how many options influence that I consider that
really unreliable.

>
>> So integrating this into pgbench directly seems like a better approach,
>> and the attached patch implements that.
>
> You guessed that I disagree. Note that this is only my own opinion.
>
> In summary, I think that:
>
> (a) providing a clean script would be nice,
>
> (b) if we could get rid of the "thread emulation", pgbench could
> generate the merged logs directly and simply, and the option could
> be provided then.

That however is not the goal of this patch. The thread emulation is
there for a reason, and I certainly am not going to work on eliminating
it (not sure that's even possible).

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2015-03-15 19:07:07 Re: Sequence Access Method WIP
Previous Message Andres Freund 2015-03-15 18:12:30 Re: Cygwin vs "win32" in configure