Re: Log a sample of transactions

From: Adrien NAYRAT <adrien(dot)nayrat(at)anayrat(dot)info>
To: "Kuroda, Hayato" <kuroda(dot)hayato(at)jp(dot)fujitsu(dot)com>, 'David Steele' <david(at)pgmasters(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Log a sample of transactions
Date: 2019-03-26 10:02:10
Message-ID: 02704f8f-319c-6d52-a7ba-81659a74bfe6@anayrat.info
Lists: pgsql-hackers

On 3/26/19 1:54 AM, Kuroda, Hayato wrote:
> Dear David,
>
> I am willing to review and have already read the patch, but I thought it was not my turn.
> Sorry.
>
>

Hello,

> Adrien,
>
>> I did not find any test for log_min_duration that could help me. LCOV indicate
>> there is no tests on this part (look check_log_duration()):
>> https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html
>
> I understand that some test cases are unnecessary. It's OK.
>
> Finally, what do you think about the deviation of randomness?
> If this parameter is set very low, nothing may ever be logged because of that deviation.
> We could avoid this by incrementing an internal counter for each transaction and logging once the counter reaches 1.

I don't understand what you want to do, and I am concerned about the cost
it could add.
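
If you mean something along the lines of the rough counter sketch below, that
is one more piece of per-transaction state to maintain. This is only my guess
at the idea; the function and variable names here are made up, not from the
patch.

#include <stdbool.h>

/*
 * Hypothetical counter-based sampling, as I read the suggestion: instead of
 * drawing random() per transaction, accumulate the sample rate and log
 * whenever the accumulator reaches 1 (so a rate of 0.001 logs exactly every
 * 1000th transaction).
 */
static double sample_accumulator = 0.0;

static bool
transaction_should_be_sampled(double sample_rate)
{
    sample_accumulator += sample_rate;
    if (sample_accumulator >= 1.0)
    {
        sample_accumulator -= 1.0;
        return true;            /* log this transaction */
    }
    return false;               /* skip this one */
}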

There was a long discussion about random(); maybe that thread answers
some of your questions:
https://www.postgresql.org/message-id/3859.1545849900%40sss.pgh.pa.us

Furthermore, I wonder whether it is really necessary: the goal of this patch
is to log a sample of transactions, not to perform precise statistical
computation. FYI, we do not perform this kind of check for
log_statement_sample_rate.

>
> After considering this case and rebasing, I think this patch is good enough.
> Do I need to measure the change in throughput?
>

I don't think it is necessary, unless I am missing something. I already
ran performance tests for log_statement_sample_rate:
https://www.postgresql.org/message-id/8a608ccf-f78b-dfad-4c5f-8e6fd068c59c%40anayrat.info

It should be less expensive, as we call random() only once per new
transaction, whereas log_statement_sample_rate calls it for each statement.
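
To illustrate what I mean (only a sketch in the spirit of the patch, not the
exact code), the decision is made once when the transaction starts and the
resulting flag is reused for every statement inside it:

#include <stdbool.h>
#include <stdlib.h>

#define RANDOM_MAX 0x7FFFFFFF          /* largest value random() can return */

static bool xact_is_sampled = false;

/*
 * Run once at transaction start; xact_sample_rate stands for the
 * log_transaction_sample_rate setting.
 */
static void
decide_transaction_sampling(double xact_sample_rate)
{
    xact_is_sampled = xact_sample_rate != 0 &&
        (xact_sample_rate == 1 ||
         random() <= xact_sample_rate * RANDOM_MAX);
}

log_statement_sample_rate makes an equivalent comparison, but once per
statement, in check_log_duration().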

I don't know whether there are cases where calling random() could be slow.

There is still a point raised by Andres:

>> As far as I can tell xact_is_sampled is not properly reset in case of
>> errors?
>>
>
> I am not sure whether I should disable logging in case of errors. Currently we have:
>
> postgres=# set log_transaction_sample_rate to 1;
> SET
> postgres=# set client_min_messages to 'LOG';
> LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG';
> SET
> postgres=# begin;
> LOG: duration: 0.345 ms statement: begin;
> BEGIN
> postgres=# select 1;
> LOG: duration: 0.479 ms statement: select 1;
> ?column?
> ----------
> 1
> (1 row)
>
> postgres=# select * from missingtable;
> ERROR: relation "missingtable" does not exist
> LINE 1: select * from missingtable;
>                       ^
> postgres=# select 1;
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> postgres=# rollback;
> LOG: duration: 11390.295 ms statement: rollback;
> ROLLBACK
>
> If I reset xact_is_sampled (after the first error, inside AbortTransaction if I am right), the "rollback" statement will not be logged. I wonder whether this "statement" should be logged?
>
> If the answer is no, I will reset xact_is_sampled in AbortTransaction.
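
For what it is worth, if the answer is no, the reset itself should be roughly
a one-liner in the abort path (sketch only; the real change would go into
AbortTransaction() in xact.c, next to the other per-transaction resets):

#include <stdbool.h>

static bool xact_is_sampled = false;    /* set at transaction start, as above */

static void
abort_transaction_sketch(void)
{
    /* ... other per-transaction cleanup ... */

    /* forget that this transaction was chosen for sampling */
    xact_is_sampled = false;
}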

Thanks!
