Re: idea: log_statement_sample_rate - bottom limit for sampling

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Adrien Nayrat <adrien(dot)nayrat(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: idea: log_statement_sample_rate - bottom limit for sampling
Date: 2019-07-27 22:19:48
Message-ID: 20190727221948.irg6sfqh57dynoc7@development
Lists: pgsql-hackers

Hi,

I've started reviewing this patch, thinking that maybe I could get it
committed as it's marked as RFC. In general I agree with having this
feature, but I think we need to rethink the GUC because the current
approach is just confusing.

The way the current patch works is that we have three GUCs:

log_min_duration_statement
log_statement_sample_limit
log_statement_sample_rate

and it essentially works like this:

- If the duration exceeds log_min_duration_statement, we start sampling
the commands with log_statement_sample_rate.

- If the duration exceeds log_statement_sample_limit, we just log the
command every time (i.e. we disable sampling, using sample rate 1.0).
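To make the interaction of the three GUCs concrete, the decision in the v3 patch amounts to something like the following (a minimal sketch with illustrative variable and function names, not the actual patch code; the random-sampling call is simplified):

```c
#include <stdbool.h>
#include <stdlib.h>

/* GUCs (durations in milliseconds) -- example values, not defaults */
static int    log_min_duration_statement = 1000;
static int    log_statement_sample_limit = 5000;
static double log_statement_sample_rate  = 0.01;

static bool
should_log_duration(int duration_ms)
{
    /* below the logging threshold: never log */
    if (log_min_duration_statement < 0 ||
        duration_ms < log_min_duration_statement)
        return false;

    /* above the sample limit: sampling disabled, always log */
    if (duration_ms >= log_statement_sample_limit)
        return true;

    /* between the two thresholds: log a random sample */
    return ((double) random() / RAND_MAX) < log_statement_sample_rate;
}
```

So log_min_duration_statement acts as a lower bound and log_statement_sample_limit as an upper bound on the range where sampling applies.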

IMO that's bound to be confusing for users, because one threshold
behaves as a minimum while the other behaves as a maximum.

What I think we should do instead is to use two minimum thresholds.

1) log_min_duration_sample - enables sampling of commands, using the
existing GUC log_statement_sample_rate

2) log_min_duration_statement - logs all commands exceeding this

I think this is going to be much easier for users to understand.
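The two-minimum-thresholds scheme could be sketched like this (again illustrative code, with log_min_duration_sample being the new GUC name proposed above):

```c
#include <stdbool.h>
#include <stdlib.h>

/* GUCs (durations in milliseconds) -- example values, not defaults */
static int    log_min_duration_sample    = 250;
static int    log_min_duration_statement = 1000;
static double log_statement_sample_rate  = 0.01;

static bool
should_log_duration(int duration_ms)
{
    /* over the higher threshold: always log */
    if (log_min_duration_statement >= 0 &&
        duration_ms >= log_min_duration_statement)
        return true;

    /* over the lower threshold: log a random sample */
    if (log_min_duration_sample >= 0 &&
        duration_ms >= log_min_duration_sample)
        return ((double) random() / RAND_MAX) < log_statement_sample_rate;

    /* below both thresholds: never log */
    return false;
}
```

Both GUCs then behave the same way: each is a minimum duration above which some logging kicks in.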

The one difference between those approaches is in how they interact with
existing settings. That is, let's say you have

log_min_duration_statement = 1000
log_statement_sample_rate = 0.01

then no queries below 1000ms will be logged, and 1% of longer queries
will be sampled. And with the original config (as proposed in v3 of the
patch), this would still work the same way.

With the new approach (two min thresholds) it'd behave differently,
because we'd log *all* queries longer than 1000ms (not just 1%). And
whether we'd sample any queries (using log_statement_sample_rate) would
depend on how we'd pick the default value for the other threshold.
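For illustration, assuming log_min_duration_sample defaulted to some lower value (250ms here is purely hypothetical), the two approaches would compare like this:

```c
/* Hypothetical settings:
 *   log_min_duration_sample    = 250   (assumed default, illustrative)
 *   log_min_duration_statement = 1000
 *   log_statement_sample_rate  = 0.01
 *
 * duration   v3 patch behavior        two-threshold behavior
 * --------   ----------------------   ----------------------
 *  100 ms    not logged               not logged
 *  500 ms    not logged               logged with p = 0.01
 * 1500 ms    logged with p = 0.01     always logged
 */
```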

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
