Re: Add min and max execute statement time in pg_stat_statement

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add min and max execute statement time in pg_stat_statement
Date: 2014-01-30 04:48:10
Message-ID: 52E9D98A.4000007@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2014/01/29 16:58), Peter Geoghegan wrote:
> On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>> By the way, latest pg_stat_statement might affect performance in Windows system.
>>> Because it uses fflush() system call every creating new entry in
>>> pg_stat_statements, and it calls many fread() to warm file cache.
>>
>> This statement doesn't seem to have much to do with the patch as
>> committed.
>
> You could make a strong case for the patch improving performance in
> practice for many users, by allowing us to increase the
> pg_stat_statements.max default value to 5,000, 5 times the previous
> value. The real expense of creating a new entry is the exclusive
> locking of the hash table, which blocks *everything* in
> pg_stat_statements. That isn't expanded at all, since queries are only
> written with a shared lock, which only blocks the creation of new
> entries which was already relatively expensive. In particular, it does
> not block the maintenance of costs for all already observed entries in
> the hash table. It's obvious that simply reducing the pressure on the
> cache will improve matters a lot, which for many users the external
> texts patch does.
>
> Since Mitsumasa has compared that patch and at least one of his
> proposed pg_stat_statements patches on several occasions, I will
> re-iterate the difference: any increased overhead from the external
> texts patch is paid only when a query is first entered into the hash
> table. Then, there is obviously and self-evidently no additional
> overhead from contention for any future execution of the same query,
> no matter what, indefinitely (the exclusive locking section of
> creating a new entry does not do I/O, except in fantastically unlikely
> circumstances). So for many of the busy production systems I work
> with, that's the difference between a cost paid perhaps tens of
> thousands of times a second, and a cost paid every few days or weeks.
> Even if he is right about a write() taking an unreasonably large
> amount of time on Windows, the consequences felt as pg_stat_statements
> LWLock contention would be very limited.
>
> I am not opposed in principle to adding new things to the counters
> struct in pg_stat_statements. I just think that the fact that the
> overhead of installing the module on a busy production system is
> currently so low is of *major* value, and therefore any person that
> proposes to expand that struct should be required to very conclusively
> demonstrate that there is no appreciably increase in overhead. Having
> a standard deviation column would be nice, but it's still not that
> important. Maybe when we have portable atomic addition we can afford
> to be much more inclusive of that kind of thing.

I'd like to know the truth and the fact in your patch, rather than your argument:-)

So I create detail pg_stat_statements benchmark tool using pgbench.
This tool can create 10000 pattern unique sqls in a file, and it is only
for measuring pg_stat_statements performance. Because it only updates
pg_stat_statements data and doesn't write to disk at all. It's fair benchmark.

[usage]
perl create_sql.pl >file.sql
psql -h -h xxx.xxx.xxx.xxx mitsu-ko -c 'SELECT pg_stat_statements_reset()'
pgbench -h xxx.xxx.xxx.xxx mitsu-ko -c64 -j32 -n -T 180 -f file.sql

[part of sample sqls file, they are executed in pgbench]
SELECT
1-1/9+5*8*6+5+9-6-3-4/9%7-2%7/5-9/7+2+9/7-1%5%9/3-4/4-9/1+5+5/6/5%2+1*2*3-8/8+5-3-8-4/8+5%2*2-2-5%6+4
SELECT
1%9%7-8/5%6-1%2*2-7+9*6-2*6-9+1-2*9+6+7*8-4-2*1%3/7-1%4%9+4+7/5+4/2-3-5%8/3/7*6-1%8*6*1/7/2%5*6/2-3-9
SELECT
1%3*2/8%6%5-3%1+1/6*6*5/9-2*5+6/6*5-1/2-6%4%8/7%2*7%5%9%5/9%1%1-3-9/2*1+1*6%8/2*4/1+6*7*1+5%6+9*1-9*6
...

I test some methods that include old pgss, old pgss with my patch, and new pgss.
Test server and postgresql.conf are same in I tested last day in this ML-thread.
And test methods and test results are here,

[methods]
method 1: with old pgss, pg_stat_statements.max=1000(default)
method 2: with old pgss with my patch, pg_stat_statements.max=1000(default)
peter 1 : with new pgss(Peter's patch), pg_stat_statements.max=5000(default)
peter 2 : with new pgss(Peter's patch), pg_stat_statements.max=1000

[for reference]
method 5:with old pgss, pg_stat_statements.max=5000
method 6:with old pgss with my patch, pg_stat_statements.max=5000

[results]
method | try1 | try2 | try3 | degrade performance ratio
-----------------------------------------------------------------
method 1 | 6.546 | 6.558 | 6.638 | 0% (reference score)
method 2 | 6.527 | 6.556 | 6.574 | 1%
peter 1 | 5.204 | 5.203 | 5.216 |20%
peter 2 | 4.241 | 4.236 | 4.262 |35%

method 5 | 5.931 | 5.848 | 5.872 |11%
method 6 | 5.794 | 5.792 | 5.776 |12%

New pgss is extremely degrade performance than old pgss, and I cannot see
performance scaling.
I understand that his argument was "My patch reduces time of LWLock in
pg_stat_statements,
it is good for performance. However, Kondo's patch will be degrade performance in
pg_stat_statements. so I give it -1.". But unless seeing this test results, his
claim seems
to contradict it even in Linux OS.

I think that his patch's strong point is only storing long query, at the sacrifice of
significant performance. Does it hope in community? At least, I don't hope it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
create_sql.pl text/plain 404 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KONDO Mitsumasa 2014-01-30 04:48:47 Re: Add min and max execute statement time in pg_stat_statement
Previous Message Bruce Momjian 2014-01-30 04:21:06 Re: updated emacs configuration