Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-10-18 12:01:58
Message-ID: CAA4eK1LXen1B2dXZeimMvb+St9gm0+NQ_jFDsE=oVtCTwZY19A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 14, 2019 at 3:09 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Oct 3, 2019 at 4:03 AM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >
> >
> > Sure, I wasn't really proposing to adding all stats from that patch,
> > including those related to streaming. We need to extract just those
> > related to spilling. And yes, it needs to be moved right after 0001.
> >
> I have extracted the spilling related code to a separate patch on top
> of 0001. I have also fixed some bugs and review comments and attached
> as a separate patch. Later I can merge it to the main patch if you
> agree with the changes.
>

Few comments
-------------------------
0001-Add-logical_decoding_work_mem-to-limit-ReorderBuffer
1.
+ {
+ {"logical_decoding_work_mem", PGC_USERSET, RESOURCES_MEM,
+ gettext_noop("Sets the maximum memory to be used for logical decoding."),
+ gettext_noop("This much memory can be used by each internal "
+ "reorder buffer before spilling to disk or streaming."),
+ GUC_UNIT_KB
+ },

I think we can remove 'or streaming' from above sentence for now. We
can add it later with later patch where streaming will be allowed.

2.
@@ -206,6 +206,18 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>work_mem</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ Limits the amount of memory used to decode changes on the
+ publisher. If not specified, the publisher will use the default
+ specified by <varname>logical_decoding_work_mem</varname>. When
+ needed, additional data are spilled to disk.
+ </para>
+ </listitem>
+ </varlistentry>

It is not clear why we need this parameter at least with this patch?
I have raised this multiple times [1][2].

bugs_and_review_comments_fix
1.
},
&logical_decoding_work_mem,
- -1, -1, MAX_KILOBYTES,
- check_logical_decoding_work_mem, NULL, NULL
+ 65536, 64, MAX_KILOBYTES,
+ NULL, NULL, NULL

I think the default value should be 1MB similar to
maintenance_work_mem. The same was true before this change.

2. -#logical_decoding_work_mem = 64MB # min 1MB, or -1 to use
maintenance_work_mem
+i#logical_decoding_work_mem = 64MB # min 64kB

It seems the 'i' is a leftover character in the above change. Also,
change the default value considering the previous point.

3.
@@ -2479,7 +2480,7 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn)

/* update the statistics */
rb->spillCount += 1;
- rb->spillTxns += txn->serialized ? 1 : 0;
+ rb->spillTxns += txn->serialized ? 0 : 1;
rb->spillBytes += size;

Why is this change required? Shouldn't we increase the spillTxns
count only when the txn is serialized?

0002-Track-statistics-for-spilling
1.
+ <row>
+ <entry><structfield>spill_txns</structfield></entry>
+ <entry><type>integer</type></entry>
+ <entry>Number of transactions spilled to disk after the memory used by
+ logical decoding exceeds <literal>logical_work_mem</literal>. The
+ counter gets incremented both for toplevel transactions and
+ subtransactions.
+ </entry>
+ </row>

The parameter name is wrong here. /logical_work_mem/logical_decoding_work_mem

2.
+ <row>
+ <entry><structfield>spill_txns</structfield></entry>
+ <entry><type>integer</type></entry>
+ <entry>Number of transactions spilled to disk after the memory used by
+ logical decoding exceeds <literal>logical_work_mem</literal>. The
+ counter gets incremented both for toplevel transactions and
+ subtransactions.
+ </entry>
+ </row>
+ <row>
+ <entry><structfield>spill_count</structfield></entry>
+ <entry><type>integer</type></entry>
+ <entry>Number of times transactions were spilled to disk. Transactions
+ may get spilled repeatedly, and this counter gets incremented on every
+ such invocation.
+ </entry>
+ </row>
+ <row>
+ <entry><structfield>spill_bytes</structfield></entry>
+ <entry><type>integer</type></entry>
+ <entry>Amount of decoded transaction data spilled to disk.
+ </entry>
+ </row>

In all the above cases, the explanation text starts immediately after
<entry> tag, but the general coding practice is to start from the next
line, see the explanation of nearby parameters.

It seems these parameters are added in pg-stat-wal-receiver-view in
the docs, but in code, it is present as part of pg_stat_replication.
It seems doc needs to be updated. Am, I missing something?

3.
ReorderBufferSerializeTXN()
{
..
/* update the statistics */
rb->spillCount += 1;
rb->spillTxns += txn->serialized ? 0 : 1;
rb->spillBytes += size;

Assert(spilled == txn->nentries_mem);
Assert(dlist_is_empty(&txn->changes));
txn->nentries_mem = 0;
txn->serialized = true;
..
}

I am not able to understand the above code. We are setting the
serialized parameter a few lines after we check it and increment the
spillTxns count. Can you please explain it?

Also, isn't spillTxns count bit confusing, because in some cases it
will include subtransactions and other cases (where the largest picked
transaction is a subtransaction) it won't include it?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-10-18 12:07:58 Re: Clean up MinGW def file generation
Previous Message Lars Kanis 2019-10-18 12:01:23 Re: libpq: Fix wrong connection status on invalid "connect_timeout"