Re: Skip collecting decoded changes of already-aborted transactions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skip collecting decoded changes of already-aborted transactions
Date: 2023-06-13 08:35:45
Message-ID: CAD21AoBQQtkgeLPMG+JPKHzOi5aksp4_cVnbXEHzQQjb4cOaAw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 11, 2023 at 5:31 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2023-06-09 14:16:44 +0900, Masahiko Sawada wrote:
> > In logical decoding, we don't need to collect decoded changes of
> > aborted transactions. While streaming changes, we can detect
> > concurrent abort of the (sub)transaction but there is no mechanism to
> > skip decoding changes of transactions that are known to already be
> > aborted. With the attached WIP patch, we check CLOG when decoding the
> > transaction for the first time. If it's already known to be aborted,
> > we skip collecting decoded changes of such transactions. That way,
> > when the logical replication is behind or restarts, we don't need to
> > decode large transactions that already aborted, which helps improve
> > the decoding performance.
>

Thank you for the comment.

> It's very easy to get uses of TransactionIdDidAbort() wrong. For one, it won't
> return true when a transaction was implicitly aborted due to a crash /
> restart. You're also supposed to use it only after a preceding
> TransactionIdIsInProgress() call.
>
> I'm not sure there are issues with not checking TransactionIdIsInProgress()
> first in this case, but I'm also not sure there aren't.

Yeah, it seems better to use !TransactionIdDidCommit() with a
preceding TransactionIdIsInProgress() check.
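
Something along these lines is what I have in mind (just a sketch, not
the actual patch; the helper name is made up):

    #include "access/transam.h"
    #include "storage/procarray.h"

    /*
     * Sketch of a helper for the check discussed above.  Given the caveats
     * about TransactionIdDidAbort(), treat a transaction as aborted only if
     * it is no longer in progress and did not commit; that also covers
     * transactions implicitly aborted by a crash/restart.
     */
    static bool
    TransactionIdIsKnownAborted(TransactionId xid)
    {
        if (TransactionIdIsInProgress(xid))
            return false;

        return !TransactionIdDidCommit(xid);
    }

The caller would perform this check only once per transaction, when
collecting its changes for the first time.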

>
> A separate issue is that TransactionIdDidAbort() can end up being very slow if
> a lot of transactions are in progress concurrently. As soon as the clog
> buffers are extended all time is spent copying pages from the kernel
> pagecache. I'd not at all be surprised if this changed causes a substantial
> slowdown in workloads with lots of small transactions, where most transactions
> commit.
>

Indeed. So we should check the transaction status less frequently;
skipping the collection of decoded changes doesn't buy us much for
small transactions anyway. Another idea is to check the status of only
large transactions. That is, once the size of a transaction's decoded
changes exceeds logical_decoding_work_mem, we check its status, and if
it has already aborted we mark it as such, free the changes decoded so
far, and skip further collection.
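
Roughly, the check could live where we already react to crossing
logical_decoding_work_mem, e.g. in ReorderBufferCheckMemoryLimit()
before serializing or streaming the largest transaction (just a sketch;
RBTXN_IS_ABORTED is a placeholder flag, not something that exists
today):

    /*
     * txn is the large transaction we are about to serialize or stream.
     * If it has already aborted, discard its changes instead and remember
     * that fact so we don't collect further changes for it.
     */
    if (!TransactionIdIsInProgress(txn->xid) &&
        !TransactionIdDidCommit(txn->xid))
    {
        txn->txn_flags |= RBTXN_IS_ABORTED;     /* placeholder flag */
        ReorderBufferTruncateTXN(rb, txn, rbtxn_prepared(txn));
        return;
    }

That way the CLOG lookup happens at most once per large transaction
rather than for every transaction we decode.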

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
