Re: PATCH: pgbench - merging transaction logs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: pgbench - merging transaction logs
Date: 2015-03-15 22:55:52
Message-ID: 55060DF8.9050600@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.3.2015 20:35, Fabien COELHO wrote:
>
>> Firstly, the fact that pgbench produces one file per thread is awkward.
>
> I agree, but I think it is due to the multi process thread emulation: if
> you have real threads, you can do a simple fprintf, possibly with some
> mutex, and you're done. There is really nothing to do to implement this
> feature.

My fear was that this either requires explicit locking, or some implicit
locking - for example while fprintf() in glibc is thread-safe, I really
am not sure about Win32 for example. This might influence the results
for very short transactions - I haven't tried that, though, so this
might be a false assumption.

The way I implemented the merge is immute to this.

>> Secondly, a separate tool (even if provided with pgbench) would
>> require passing detailed info about the log format - whether it's
>> aggregated or not, throttling or not, ....
>
> Hmmm.
>
>>> It could belong to pgbench if pgbench was *generating* the merged
>>> log file directly. [...]
>>
>> I considered this approach, i.e. adding another 'collector' thread
>> receiving results from all the other threads, but decided not to
>> do that. That would require a fair amount of inter-process
>> communication, locking etc. and might affect the measurements
>
> I agree that inter-process stuff should be avoided. This is not what
> I had in mind. I was thinking of "fprintf" on the same file handler
> by different threads.

That still involves some sort of 'implicit' locking, no? And as I
mentioned, I'm not sure fprintf() is thread-safe on all the platforms we
support.

>>> Another issue raised by your patch is that the log format may be
>>> improved, say by providing a one-field timestamp at the beginning
>>> of the line.
>>
>> I don't see how this is relevant to this patch?
>
> For the simple log format, all the parsing needed would be for the
> timestamp, and the remainder would just be text to pass along, no
> need to %d %f... whatever.

Oh, ok. Well, that's true, but I don't think that significantly changes
the overall complexity.

>>> The current IO complexity is in p²n where it should be simply pn...
>>
>> Maybe. Implementing a 2-way merge sort was the simplest solution at
>> the moment, and I don't think this really matters because the I/O
>> generated by the benchmark itself is usually much higher than this.
>
> If you do not do a n-way merge, you could do a 2-way merge on a
> binary tree so that the IO complexity would be p.log(p).n (I think),
> and not p²n.

Yes, I could do that.

I still think this probably an overengineering, but not a big deal. I'll
leave this for later, though (this patch is in 2015-06 CF anyway).

>>> For aggregates, some data in the output may be special values
>>> "NaN/-/...", [...]
>> You mean the aggregated log? I can't think of a way to get there such
>> values - can you provide a plausible scenario how that could happen?
>
> Possibly I'm wrong. Ok, I tried it:
>
> sh> ./pgbench --rate=0.5 --aggregate-interval=1 --log
> sh> cat pgbench_log.12671
> 1426445236 1 5034 25341156 5034 5034 687 471969 687 687
> 1426445237 0 0 0 0 0 0 0 0 0
> 1426445238 0 0 0 0 0 0 0 0 0
> 1426445239 1 8708 75829264 8708 8708 2063 4255969 2063 2063
> ...
>
> Good news, I could not generate strange values. Just when there are
> 0 transactions possibly some care should be taken when combining
> values.

Yeah, I wasn't able to come up with such scenario, but wasn't sure.

>
>>> It seems that system function calls are not tested for errors.
>>
>> That's true, but that's how the rest of pgbench code is written.
>
> Hmmm.... This is not entirely true, there are *some* checks if you look
> carefully:-)
>
> if ((fd = fopen(filename, "r")) == NULL) ...
> if ((fp = popen(command, "r")) == NULL) ...
> nsocks = select(...)
> if (nsocks < 0) ...

OK, there are a few checks ;-) But none of the fprintf calls IIRC.

Anyway, I plan to refactor this part of the patch to get rid of the
copy'n'paste pieces, so I'll take care of this too.

>>> (b) if we could get rid of the "thread emulation", pgbench could
>>> generate the merged logs directly and simply, and the option could
>>> be provided then.
>>
>> That however is not the goal of this patch.
>
> Sure. My point is that the amount of code you write to implement this
> merge stuff is due to this feature. Without it, the patch would
> probably need 10 lines of code. Moreover, the way it is implement
> requires scanning and reprinting, which means more work in many
> places to update the format later.

Possibly. But it's not written like that :-(

>> The thread emulation is there for a reason,
>
> My opinion is that it *was* there for a reason. Whether it makes
> sense today to still have it, maintain it, and have to write such
> heavy code for a trivial feature just because of it is another
> matter.

Possibly. I can't really decide that.

>> and I certainly am not going to work on eliminating it
>> (not sure that's even possible).
>
> I wish it will be:-)
>
> I would suggest this that I would support: implement this feature the
> simple way (aka fprintf, maybe a mutex) when compiled with threads, and
> generate an error "feature not available with process-based thread
> emulation" when compiled with processes. This way we avoid a, lot of
> heavy code to maintain in the future, and you still get the feature
> within pgbench. There are already some things which are not the same
> with thread emulation because it would have been tiring to implement for
> it for very little benefit.

I really dislike the features supported only in some cases (although
most deployments probably support proper threading these days).

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2015-03-15 23:37:08 Re: Using 128-bit integers for sum, avg and statistics aggregates
Previous Message Andres Freund 2015-03-15 22:38:23 Re: recovery_target_action = pause & hot_standby = off