Re: Log a sample of transactions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Adrien NAYRAT <adrien(dot)nayrat(at)anayrat(dot)info>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Kuroda, Hayato" <kuroda(dot)hayato(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Log a sample of transactions
Date: 2019-03-28 02:18:48
Message-ID: CAD21AoASTErA0LShc=ozwWrfuTVU6=9_QZDYCK_Smti+ZU4N1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for updating the patch!

On Wed, Mar 27, 2019 at 10:28 PM Adrien NAYRAT
<adrien(dot)nayrat(at)anayrat(dot)info> wrote:
>
> On 3/27/19 10:22 AM, Adrien NAYRAT wrote:
> > Hello,
> >
> > On 3/27/19 7:05 AM, Masahiko Sawada wrote:
> >>>>> Why are both of these checked? ISTM once xact_is_sampled is set, we
> >>>>> ought not to respect a different value of log_xact_sample_rate for the
> >>>>> rest of that transaction.
> >>>> I added theses checks to allow to disable logging during a sampled
> >>>> transaction, per suggestion of Masahiko Sawada:
> >>>> https://www.postgresql.org/message-id/CAD21AoD4t%2BhTV6XfK5Yz%3DEocB8oMyJSYFfjAryCDYtqfib2GrA%40mail.gmail.com
> >>>>
> >>> I added a comment to explain why transaction logging is rechecked.
> >> Umm, I'm inclined to think that we should not disable logging within
> >> the transaction. If we allow that, since users can disable the logging
> >> within the transaction by setting it to 0 they may think that we can
> >> change the rate during the transaction, which is wrong. If we want
> >> this behavior we should document it but we can make user not being
> >> able to change the rate during the transaction to avoid confusion. And
> >> the latter looks more understandable and straightforward. This is
> >> different comment what I did before, I'm sorry for confusing you.
> >
> > Don't worry, I will change that.
> >
> >
> >>
> >>>>> As far as I can tell xact_is_sampled is not properly reset in case of
> >>>>> errors?
> >>>>>
> >>> I am not sure if I should disable logging in case of errors. Actually we
> >>> have:
> >>>
> >>> postgres=# set log_transaction_sample_rate to 1;
> >>> SET
> >>> postgres=# set client_min_messages to 'LOG';
> >>> LOG: duration: 0.392 ms statement: set client_min_messages to 'LOG';
> >>> SET
> >>> postgres=# begin;
> >>> LOG: duration: 0.345 ms statement: begin;
> >>> BEGIN
> >>> postgres=# select 1;
> >>> LOG: duration: 0.479 ms statement: select 1;
> >>> ?column?
> >>> ----------
> >>> 1
> >>> (1 row)
> >>>
> >>> postgres=# select * from missingtable;
> >>> ERROR: relation "missingtable" does not exist
> >>> LINE 1: select * from missingtable;
> >>> ^
> >>> postgres=# select 1;
> >>> ERROR: current transaction is aborted, commands ignored until end of
> >>> transaction block
> >>> postgres=# rollback;
> >>> LOG: duration: 11390.295 ms statement: rollback;
> >>> ROLLBACK
> >>>
> >>> If I reset xact_is_sampled (after the first error inside
> >>> AbortTransaction if I am right), "rollback" statement will not be
> >>> logged. I wonder if this "statement" should be logged?
> >>>
> >>> If the answer is no, I will reset xact_is_sampled in AbortTransaction.
> >>>
> >> FWIW, I'd prefer to log "rollback" as well so as user can recognize
> >> the end of transaction.
> >
> > Ok.
> >
> >>
> >> + /* Determine if statements are logged in this transaction */
> >> + xact_is_sampled = random() <= log_xact_sample_rate *
> >> MAX_RANDOM_VALUE;
> >>
> >> If log_xact_sample_rate is 1 we always log all statement in the
> >> transaction regardless of value of random(). Similarly if it's 0, we
> >> never log. I think we can avoid unnecessary random() call in both case
> >> like log_statement_sample_rate does.
> >
> > I agree, I will change that.
> >
> >>
> >> {"log_statement_sample_rate", PGC_SUSET, LOGGING_WHEN,
> >> gettext_noop("Fraction of statements over
> >> log_min_duration_statement to log."),
> >> gettext_noop("If you only want a sample, use a
> >> value between 0 (never "
> >> - "log) and 1.0 (always
> >> log).")
> >> + "log) and 1 (always
> >> log).")
> >> },
> >> &log_statement_sample_rate,
> >> 1.0, 0.0, 1.0,
> >> NULL, NULL, NULL
> >> },
> >>
> >> This change is not relevant with log_transaction_sample_rate feature
> >> but why do we need this change? In postgresql.conf.sample the
> >> description of both log_statement_sample_rate and
> >> log_transaction_sample_rate use 1.0 instead of 1, like "1.0 logs all
> >> statements from all transactions, 0 never logs". If we need this
> >> change I think it should be a separate patch.
> >
> >
> > Sorry, I changed that, someone suggest using either "0" and "1", or
> > "0.0" and "1.0" but not mixing both.

Agreed with using "0.0" and "1.0".

> > I will remove this change.
> >

--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
+#log_transaction_sample_rate = 0 # Fraction of transactions
whose statements
+ # are logged regardless of
their duration. 1.0 logs all
+ # statements from all
transactions, 0 never logs.

--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
+ {"log_transaction_sample_rate", PGC_SUSET, LOGGING_WHEN,
+ gettext_noop("Set the fraction of transactions
to log for new transactions."),
+ gettext_noop("Logs all statements from a
fraction of transactions. "
+ "Use a value between
0 (never log) and 1 (log all "
+ "statements for all
transactions).")

I think it would be better to use the same number in both guc.c and
postgresql.conf.sample. How about unifying them to "0.0" and "1.0" for
consistency?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-03-28 02:40:56 Re: Ordered Partitioned Table Scans
Previous Message Jeff Janes 2019-03-28 01:43:44 Re: pg_upgrade: Pass -j down to vacuumdb