Re: Transaction timeout

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Japin Li <japinli(at)hotmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, 邱宇航 <iamqyh(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers mailing list <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Transaction timeout
Date: 2024-02-15 23:08:56
Message-ID: 20240215230856.pc6k57tqxt7fhldm@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-02-13 23:42:35 +0200, Alexander Korotkov wrote:
> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index 464858117e0..a124ba59330 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -2139,6 +2139,10 @@ StartTransaction(void)
> */
> s->state = TRANS_INPROGRESS;
>
> + /* Schedule transaction timeout */
> + if (TransactionTimeout > 0)
> + enable_timeout_after(TRANSACTION_TIMEOUT, TransactionTimeout);
> +
> ShowTransactionState("StartTransaction");
> }

Isn't it a problem that all uses of StartTransaction() trigger a timeout, but
transaction commit/abort don't? What if e.g. logical replication apply starts
a transaction, commits it, and then goes idle? The timer would still be
active, afaict?

I don't think it works well to enable timeouts in xact.c and to disable them
in PostgresMain().

> @@ -4491,12 +4511,18 @@ PostgresMain(const char *dbname, const char *username)
> pgstat_report_activity(STATE_IDLEINTRANSACTION_ABORTED, NULL);
>
> /* Start the idle-in-transaction timer */
> - if (IdleInTransactionSessionTimeout > 0)
> + if (IdleInTransactionSessionTimeout > 0
> + && (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
> {
> idle_in_transaction_timeout_enabled = true;
> enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
> IdleInTransactionSessionTimeout);
> }
> +
> + /* Schedule or reschedule transaction timeout */
> + if (TransactionTimeout > 0 && !get_timeout_active(TRANSACTION_TIMEOUT))
> + enable_timeout_after(TRANSACTION_TIMEOUT,
> + TransactionTimeout);
> }
> else if (IsTransactionOrTransactionBlock())
> {
> @@ -4504,12 +4530,18 @@ PostgresMain(const char *dbname, const char *username)
> pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
>
> /* Start the idle-in-transaction timer */
> - if (IdleInTransactionSessionTimeout > 0)
> + if (IdleInTransactionSessionTimeout > 0
> + && (IdleInTransactionSessionTimeout < TransactionTimeout || TransactionTimeout == 0))
> {
> idle_in_transaction_timeout_enabled = true;
> enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
> IdleInTransactionSessionTimeout);
> }
> +
> + /* Schedule or reschedule transaction timeout */
> + if (TransactionTimeout > 0 && !get_timeout_active(TRANSACTION_TIMEOUT))
> + enable_timeout_after(TRANSACTION_TIMEOUT,
> + TransactionTimeout);
> }
> else
> {

Why do we need to do anything in these cases if the timer is started in
StartTransaction()?

> new file mode 100644
> index 00000000000..ce2c9a43011
> --- /dev/null
> +++ b/src/test/isolation/specs/timeouts-long.spec
> @@ -0,0 +1,35 @@
> +# Tests for transaction timeout that require long wait times
> +
> +session s7
> +step s7_begin
> +{
> + BEGIN ISOLATION LEVEL READ COMMITTED;
> + SET transaction_timeout = '1s';
> +}
> +step s7_commit_and_chain { COMMIT AND CHAIN; }
> +step s7_sleep { SELECT pg_sleep(0.6); }
> +step s7_abort { ABORT; }
> +
> +session s8
> +step s8_begin
> +{
> + BEGIN ISOLATION LEVEL READ COMMITTED;
> + SET transaction_timeout = '900ms';
> +}
> +# to test that quick query does not restart transaction_timeout
> +step s8_select_1 { SELECT 1; }
> +step s8_sleep { SELECT pg_sleep(0.6); }
> +
> +session checker
> +step checker_sleep { SELECT pg_sleep(0.3); }

Isn't this test going to be very fragile on busy / slow machines? What if the
pg_sleep() takes one second, because there were other tasks to schedule? I'd
be surprised if this didn't fail under valgrind, for example.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-02-15 23:32:19 Re: glibc qsort() vulnerability
Previous Message Peter Geoghegan 2024-02-15 23:01:50 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan