Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-06-09 09:34:33
Message-ID: CAA4eK1+Mpb8GYDVmc6mcwL46w9RWR5SAG_GCHHBwxLQxQ6tYBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 8, 2020 at 11:53 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> I think one of the usages we still need is in ReorderBufferForget
> because it can be called when we skip processing the txn. See the
> comments in DecodeCommit where we call this function. If I am
> correct, we need to probably collect all invalidations in
> ReorderBufferTxn as we are collecting tuplecids and use them here. We
> can do the same during processing of XLOG_XACT_INVALIDATIONS.
>

One more point related to this is that after this patch series, we
need to consider executing all invalidation during transaction abort.
Because it is possible that due to memory overflow, we have processed
some of the messages which also contain a few XACT_INVALIDATION
messages, so to avoid cache pollution, we need to execute all of them
in abort. We also do the similar thing in Rollback/Rollback To
Savepoint, see AtEOXact_Inval and AtEOSubXact_Inval.

Few other comments on
0002-Issue-individual-invalidations-with-wal_level-lo.patch
---------------------------------------------------------------------------------------------------------------
1.
+ if (transInvalInfo->CurrentCmdInvalidMsgs.cclist)
+ {
+ ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
+ MakeSharedInvalidMessagesArray);
+ invalMessages = SharedInvalidMessagesArray;
+ nmsgs = numSharedInvalidMessagesArray;
+ SharedInvalidMessagesArray = NULL;
+ numSharedInvalidMessagesArray = 0;

a. Immediately after ProcessInvalidationMessagesMulti, isn't it better
to have an Assertion like Assert(!(numSharedInvalidMessagesArray > 0
&& SharedInvalidMessagesArray == NULL));?
b. Why check "if (transInvalInfo->CurrentCmdInvalidMsgs.cclist)" is
required? If you see xactGetCommittedInvalidationMessages where we do
something similar, we only check for valid value of transInvalInfo and
here we check the same in the caller of LogLogicalInvalidations, isn't
that sufficient? If that is sufficient, we can either have the same
check here or have an Assert for the same.

2.
@@ -1092,6 +1101,9 @@ CommandEndInvalidationMessages(void)
if (transInvalInfo == NULL)
return;

+ if (XLogLogicalInfoActive())
+ LogLogicalInvalidations();
+
ProcessInvalidationMessages(&transInvalInfo->CurrentCmdInvalidMsgs,
LocalExecuteInvalidationMessage);
Generally, we WAL log the action after performing it but here you are
writing WAL first. Is there any specific reason? If so, can we write
a comment about the same?

3.
+ * When wal_level=logical, write invalidations into WAL at each command end to
+ * support the decoding of the in-progress transaction. As of now it was
+ * enough to log invalidation only at commit because we are only decoding the
+ * transaction at the commit time. We only need to log the catalog cache and
+ * relcache invalidation. There can not be any active MVCC scan in logical
+ * decoding so we don't need to log the snapshot invalidation.

I think this comment doesn't hold good after we have changed the patch
to LOG invalidations at the time of CCI.

4.
+
+/*
+ * Emit WAL for invalidations.
+ */
+static void
+LogLogicalInvalidations()

Add the function name atop of this function in comments to match the
style with other nearby functions. How about modifying it to
something like: "Emit WAL for invalidations. This is currently only
used for logging invalidations at the command end."

5.
+ *
+ * XXX Do we need to care about relcacheInitFileInval and
+ * the other fields added to ReorderBufferChange, or just
+ * about the message itself?
+ */

I don't think we need to do anything about relcacheInitFileInval.
This is used to remove the stale files (RELCACHE_INIT_FILENAME) that
have obsolete information about relcache. The walsender process that
is doing decoding doesn't require us to do anything about this. Also,
if you see before this patch, we don't do anything about relcache
files during decoding of invalidation messages. In short, I think we
can remove this comment unless you see some use of it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-06-09 09:46:24 Re: proposal: possibility to read dumped table's name from file
Previous Message Andrey Lepikhov 2020-06-09 09:20:42 Re: Asynchronous Append on postgres_fdw nodes.