Re: New GUC to sample log queries

From: Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New GUC to sample log queries
Date: 2018-05-31 12:37:07
Message-ID: 8a608ccf-f78b-dfad-4c5f-8e6fd068c59c@anayrat.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05/31/2018 03:34 AM, David Rowley wrote:
> On 31 May 2018 at 06:44, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info> wrote:
>> Here is a naive SELECT only bench with a dataset which fit in ram (scale factor
>> = 100) and PGDATA and log on a ramdisk:
>> shared_buffers = 4GB
>> seq_page_cost = random_page_cost = 1.0
>> logging_collector = on (no rotation)
>
> It would be better to just: SELECT 1; to try to get the true overhead
> of the additional logging code.

Right, I wonder why I didn't have the idea :)

>
>> I don't know the cost of random() call?
>
> It's probably best to test in Postgres to see if there's an overhead
> to the new code. It may be worth special casing the 0 and 1 case so
> random() is not called.

Thanks, I based on auto_explain's code. I will send a patch for auto_explain
with same changes.

>
> + (random() < log_sample_rate * MAX_RANDOM_VALUE);
>
> this should be <=, or you'll randomly miss logging a query when
> log_sample_rate is 1.0 every 4 billion or so queries.

Right, it is the same in auto_explain.

>
> Of course, it would be better if we had a proper profiler, but I can
> see your need for this. Enabling logging of all queries in production
> is currently reserved for people with low traffic servers and the
> insane.

Yes, here are results with "SELECT 1" and changes to avoid random() call with 0
and 1.

master :
log_min_duration_statement = 0
TPS: 41294
log size: 2477589 lines (199MB)

log_min_duration_statement = -1
TPS: 49478
log size: 0 lines

patched:
log_min_duration_statement = 0
log_sample_rate = 1
TPS: 41635
log size: 2497994 lines (201MB)

log_min_duration_statement = 0
log_sample_rate = 0.1
TPS: 46915
log size: 282303 lines (23MB)

log_min_duration_statement = 0
log_sample_rate = 0.01
TPS: 48867
log size: 29193 lines (2.4MB)

log_min_duration_statement = 0
log_sample_rate = 0
TPS: 48912
log size: 0 lines

log_min_duration_statement = -1
log_sample_rate = 1
TPS: 49061
log size: 0 lines

Difference between l_m_d_s = 0 and -1 is about 16% and logger process eat around
30% CPU on my laptop.

Revised patch attached.

Thanks,

--
Adrien NAYRAT

Attachment Content-Type Size
sample_rate-2.patch text/x-patch 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-05-31 12:38:13 Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Previous Message REIX, Tony 2018-05-31 12:28:37 RE:PostgreSQL 11 beta1 on AIX 7.2 : 2 failures in 32bit mode