Skip site navigation (1) Skip section navigation (2)

Ragged latency log data in multi-threaded pgbench

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Ragged latency log data in multi-threaded pgbench
Date: 2010-03-15 15:41:16
Message-ID: 4B9E551C.5010900@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Just noticed a problem popping up sometimes with the new multi-threaded 
pgbench.  This is on a Linux RPM build (the alpha4 set) compiled with 
'--disable-thread-safety'.  Still trying to nail down whether that's a 
requirement for this problem to appear or not.  I did most of my review 
of this feature with it turned on, and haven't been seeing this problem 
on other systems that are thread safe.  Not sure yet if that's cause and 
effect or coincidence yet.

Here's a sample invocation that produces ragged output for me on my one 
system:

pgbench -S -T 5 -c 4 -j 4 -l pgbench

The log file produced by this (pgbench_log.<pid>) is supposed to consist 
of a series of lines in the following format:

client,trans,latency,filenum,sec,usec

It looks like the switch between clients running on separate workers can 
lead to a mix of their respective lines showing up though.  Here's a 
couple of typical samples, with the bad line in the middle of each set:

1 138 178 0 1268665788 607559
1 139 182 0 1268665788 607751
1 1402 0 2491 0 1268665788 586135
2 1 264 0 1268665788 586463
2 2 192 0 1268665788 586665

1 274 160 0 1268665788 632966
1 275 178 0 1268665788 633154
1 276 184 0 126866578 178 0 1268665788 614015
2 141 190 0 1268665788 614252
2 142 169 0 1268665788 614430

2 274 178 0 1268665788 639218
2 275 175 0 1268665788 639402
2 276 169 0 126866578 171 0 1268665788 626933
0 141 185 0 1268665788 627165
0 142 202 0 1268665788 627377

Looks like sometimes a client is only getting part of its line written 
out before getting stomped on by the next one.  I think one of the 
assumptions being made about how to safely write to this log file may be 
broken by the multi-process implementation, which is what you get when 
thread-safety is not available.

Since there should only be 6 fields here, I think you can find whether a 
given log file has this problem or not like this:

cat pgbench_log.xxxxx | cut -d " " -f 7 | sort | uniq

If anything comes out of that, the latency log file has at least one bad 
line in it.

Similarly, this:

cat pgbench_log.xxxxx | cut -d " " -f 1 | sort | uniq

Should only show the client numbers; here there's some first columns 
with much bigger numbers too.

-- 
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com   www.2ndQuadrant.us


Responses

pgsql-hackers by date

Next:From: Joe ConwayDate: 2010-03-15 15:49:50
Subject: Re: walreceiver is uninterruptible on win32
Previous:From: Tom LaneDate: 2010-03-15 15:37:55
Subject: Re: Dyamic updates of NEW with pl/pgsql

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group