Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-06-18 15:32:06
Message-ID: CAFiTN-t7WZZjFrAjSYj4fu=FZ2JKENN8ZHCUZaw-srnrHMWMrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 9, 2020 at 3:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Jun 8, 2020 at 11:53 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > I think one of the usages we still need is in ReorderBufferForget
> > because it can be called when we skip processing the txn. See the
> > comments in DecodeCommit where we call this function. If I am
> > correct, we need to probably collect all invalidations in
> > ReorderBufferTxn as we are collecting tuplecids and use them here. We
> > can do the same during processing of XLOG_XACT_INVALIDATIONS.
> >
>
> One more point related to this is that after this patch series, we
> need to consider executing all invalidation during transaction abort.
> Because it is possible that due to memory overflow, we have processed
> some of the messages which also contain a few XACT_INVALIDATION
> messages, so to avoid cache pollution, we need to execute all of them
> in abort. We also do the similar thing in Rollback/Rollback To
> Savepoint, see AtEOXact_Inval and AtEOSubXact_Inval.

Yes, we need to do that, So now we are collecting all the
invalidation under txn->invalidation so they are getting executed on
abort.

>
> Few other comments on
> 0002-Issue-individual-invalidations-with-wal_level-lo.patch
> ---------------------------------------------------------------------------------------------------------------
> 1.
> + if (transInvalInfo->CurrentCmdInvalidMsgs.cclist)
> + {
> + ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
> + MakeSharedInvalidMessagesArray);
> + invalMessages = SharedInvalidMessagesArray;
> + nmsgs = numSharedInvalidMessagesArray;
> + SharedInvalidMessagesArray = NULL;
> + numSharedInvalidMessagesArray = 0;
>
> a. Immediately after ProcessInvalidationMessagesMulti, isn't it better
> to have an Assertion like Assert(!(numSharedInvalidMessagesArray > 0
> && SharedInvalidMessagesArray == NULL));?

Done

> b. Why check "if (transInvalInfo->CurrentCmdInvalidMsgs.cclist)" is
> required? If you see xactGetCommittedInvalidationMessages where we do
> something similar, we only check for valid value of transInvalInfo and
> here we check the same in the caller of LogLogicalInvalidations, isn't
> that sufficient? If that is sufficient, we can either have the same
> check here or have an Assert for the same.

I have put the same check here.

>
> 2.
> @@ -1092,6 +1101,9 @@ CommandEndInvalidationMessages(void)
> if (transInvalInfo == NULL)
> return;
>
> + if (XLogLogicalInfoActive())
> + LogLogicalInvalidations();
> +
> ProcessInvalidationMessages(&transInvalInfo->CurrentCmdInvalidMsgs,
> LocalExecuteInvalidationMessage);
> Generally, we WAL log the action after performing it but here you are
> writing WAL first. Is there any specific reason? If so, can we write
> a comment about the same?

Yeah, there is no reason for the same so moved it down.

>
> 3.
> + * When wal_level=logical, write invalidations into WAL at each command end to
> + * support the decoding of the in-progress transaction. As of now it was
> + * enough to log invalidation only at commit because we are only decoding the
> + * transaction at the commit time. We only need to log the catalog cache and
> + * relcache invalidation. There can not be any active MVCC scan in logical
> + * decoding so we don't need to log the snapshot invalidation.
>
> I think this comment doesn't hold good after we have changed the patch
> to LOG invalidations at the time of CCI.

Right, modified.

>
> 4.
> +
> +/*
> + * Emit WAL for invalidations.
> + */
> +static void
> +LogLogicalInvalidations()
>
> Add the function name atop of this function in comments to match the
> style with other nearby functions. How about modifying it to
> something like: "Emit WAL for invalidations. This is currently only
> used for logging invalidations at the command end."

Done

>
> 5.
> + *
> + * XXX Do we need to care about relcacheInitFileInval and
> + * the other fields added to ReorderBufferChange, or just
> + * about the message itself?
> + */
>
> I don't think we need to do anything about relcacheInitFileInval.
> This is used to remove the stale files (RELCACHE_INIT_FILENAME) that
> have obsolete information about relcache. The walsender process that
> is doing decoding doesn't require us to do anything about this. Also,
> if you see before this patch, we don't do anything about relcache
> files during decoding of invalidation messages. In short, I think we
> can remove this comment unless you see some use of it.

Now, we have removed the Invalidation change itself so this comment is gone.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-06-18 15:42:16 Re: global barrier & atomics in signal handlers (Re: Atomic operations within spinlocks)
Previous Message Dilip Kumar 2020-06-18 15:31:50 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions