Re: Add min and max execute statement time in pg_stat_statement

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add min and max execute statement time in pg_stat_statement
Date: 2014-01-30 04:48:10
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2014/01/29 16:58), Peter Geoghegan wrote:
> On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>> By the way, latest pg_stat_statement might affect performance in Windows system.
>>> Because it uses fflush() system call every creating new entry in
>>> pg_stat_statements, and it calls many fread() to warm file cache.
>> This statement doesn't seem to have much to do with the patch as
>> committed.
> You could make a strong case for the patch improving performance in
> practice for many users, by allowing us to increase the
> pg_stat_statements.max default value to 5,000, 5 times the previous
> value. The real expense of creating a new entry is the exclusive
> locking of the hash table, which blocks *everything* in
> pg_stat_statements. That isn't expanded at all, since queries are only
> written with a shared lock, which only blocks the creation of new
> entries which was already relatively expensive. In particular, it does
> not block the maintenance of costs for all already observed entries in
> the hash table. It's obvious that simply reducing the pressure on the
> cache will improve matters a lot, which for many users the external
> texts patch does.
> Since Mitsumasa has compared that patch and at least one of his
> proposed pg_stat_statements patches on several occasions, I will
> re-iterate the difference: any increased overhead from the external
> texts patch is paid only when a query is first entered into the hash
> table. Then, there is obviously and self-evidently no additional
> overhead from contention for any future execution of the same query,
> no matter what, indefinitely (the exclusive locking section of
> creating a new entry does not do I/O, except in fantastically unlikely
> circumstances). So for many of the busy production systems I work
> with, that's the difference between a cost paid perhaps tens of
> thousands of times a second, and a cost paid every few days or weeks.
> Even if he is right about a write() taking an unreasonably large
> amount of time on Windows, the consequences felt as pg_stat_statements
> LWLock contention would be very limited.
> I am not opposed in principle to adding new things to the counters
> struct in pg_stat_statements. I just think that the fact that the
> overhead of installing the module on a busy production system is
> currently so low is of *major* value, and therefore any person that
> proposes to expand that struct should be required to very conclusively
> demonstrate that there is no appreciably increase in overhead. Having
> a standard deviation column would be nice, but it's still not that
> important. Maybe when we have portable atomic addition we can afford
> to be much more inclusive of that kind of thing.

I'd like to know the truth and the fact in your patch, rather than your argument:-)

So I create detail pg_stat_statements benchmark tool using pgbench.
This tool can create 10000 pattern unique sqls in a file, and it is only
for measuring pg_stat_statements performance. Because it only updates
pg_stat_statements data and doesn't write to disk at all. It's fair benchmark.

perl >file.sql
psql -h -h mitsu-ko -c 'SELECT pg_stat_statements_reset()'
pgbench -h mitsu-ko -c64 -j32 -n -T 180 -f file.sql

[part of sample sqls file, they are executed in pgbench]

I test some methods that include old pgss, old pgss with my patch, and new pgss.
Test server and postgresql.conf are same in I tested last day in this ML-thread.
And test methods and test results are here,

method 1: with old pgss, pg_stat_statements.max=1000(default)
method 2: with old pgss with my patch, pg_stat_statements.max=1000(default)
peter 1 : with new pgss(Peter's patch), pg_stat_statements.max=5000(default)
peter 2 : with new pgss(Peter's patch), pg_stat_statements.max=1000

[for reference]
method 5:with old pgss, pg_stat_statements.max=5000
method 6:with old pgss with my patch, pg_stat_statements.max=5000

method | try1 | try2 | try3 | degrade performance ratio
method 1 | 6.546 | 6.558 | 6.638 | 0% (reference score)
method 2 | 6.527 | 6.556 | 6.574 | 1%
peter 1 | 5.204 | 5.203 | 5.216 |20%
peter 2 | 4.241 | 4.236 | 4.262 |35%

method 5 | 5.931 | 5.848 | 5.872 |11%
method 6 | 5.794 | 5.792 | 5.776 |12%

New pgss is extremely degrade performance than old pgss, and I cannot see
performance scaling.
I understand that his argument was "My patch reduces time of LWLock in
it is good for performance. However, Kondo's patch will be degrade performance in
pg_stat_statements. so I give it -1.". But unless seeing this test results, his
claim seems
to contradict it even in Linux OS.

I think that his patch's strong point is only storing long query, at the sacrifice of
significant performance. Does it hope in community? At least, I don't hope it.

Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size text/plain 404 bytes

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message KONDO Mitsumasa 2014-01-30 04:48:47 Re: Add min and max execute statement time in pg_stat_statement
Previous Message Bruce Momjian 2014-01-30 04:21:06 Re: updated emacs configuration