Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-09-28 19:09:17
Message-ID: 20190928190917.hrpknmq76v3ts3lj@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 28, 2019 at 01:36:46PM +0530, Amit Kapila wrote:
>On Fri, Sep 27, 2019 at 4:55 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> On Fri, Sep 27, 2019 at 02:33:32PM +0530, Amit Kapila wrote:
>> >On Tue, Jan 9, 2018 at 7:55 AM Peter Eisentraut
>> ><peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>> >>
>> >> On 1/3/18 14:53, Tomas Vondra wrote:
>> >> >> I don't see the need to tie this setting to maintenance_work_mem.
>> >> >> maintenance_work_mem is often set to very large values, which could
>> >> >> then have undesirable side effects on this use.
>> >> >
>> >> > Well, we need to pick some default value, and we can either use a fixed
>> >> > value (not sure what would be a good default) or tie it to an existing
>> >> > GUC. We only really have work_mem and maintenance_work_mem, and the
>> >> > walsender process will never use more than one such buffer. Which seems
>> >> > to be closer to maintenance_work_mem.
>> >> >
>> >> > Pretty much any default value can have undesirable side effects.
>> >>
>> >> Let's just make it an independent setting unless we know any better. We
>> >> don't have a lot of settings that depend on other settings, and the ones
>> >> we do have a very specific relationship.
>> >>
>> >> >> Moreover, the name logical_work_mem makes it sound like it's a logical
>> >> >> version of work_mem. Maybe we could think of another name.
>> >> >
>> >> > I won't object to a better name, of course. Any proposals?
>> >>
>> >> logical_decoding_[work_]mem?
>> >>
>> >
>> >Having a separate variable for this can give more flexibility, but
>> >OTOH it will add one more knob which user might not have a good idea
>> >to set. What are the problems we see if directly use work_mem for
>> >this case?
>> >
>>
>> IMHO it's similar to autovacuum_work_mem - we have an independent
>> setting, but most people use it as -1 so we use maintenance_work_mem as
>> a default value. I think it makes sense to do the same thing here.
>>
>> It does ad an extra knob anyway (I don't think we should just use
>> maintenance_work_mem directly, the user should have an option to
>> override it when needed). But most users will not notice.
>>
>> FWIW I don't think we should use work_mem, maintenace_work_mem seems
>> somewhat more appropriate here (not related to queries, etc.).
>>
>
>I have the same concern for using maintenace_work_mem as Peter E.
>which is that the value of maintenace_work_mem will generally be
>higher which is suitable for its current purpose, but not for the
>purpose this patch is using. AFAIU, at this stage we want a better
>memory accounting system for logical decoding and we are not sure what
>is a good value for this variable. So, I think using work_mem or
>maintenace_work_mem should serve the purpose. Later, if we have
>requirements from people to have better control over the memory
>required for this purpose then we can introduce a new variable.
>
>I understand that currently work_mem is primarily tied with memory
>used for query workspaces, but it might be okay to extend it for this
>purpose. Another point is that the default for that sound to be more
>appealing for this case. I can see the argument against it which is
>having a separate variable will make the things look clean and give
>better control. So, if we can't convince ourselves for using
>work_mem, we can introduce a new guc variable and keep the default as
>4MB or work_mem.
>
>I feel it is always tempting to introduce a new guc for the different
>tasks unless there is an exact match, but OTOH, having lesser guc's
>has its own advantage which is that people don't have to bother about
>a new setting which they need to tune and especially for which they
>can't decide with ease. I am not telling that we should not introduce
>new guc when it is required, but just to give more thought before
>doing so.
>

I do think having a separate GUC is a must, irrespectedly of what other
GUC (if any) is used as a default. You're right the maintenance_work_mem
value might be too high (e.g. in cases with many subscriptions), but the
same issue applies to work_mem - there's no guarantee work_mem is lower
than maintenance_work_mem, and in analytics databases it may be set very
high. So work_mem does not really solve the issue

IMHO we can't really do without a new GUC. It's not difficult to create
examples that would benefit from small/large memory limit, depending on
the number of subscriptions etc.

I do however agree the GUC does not have to be tied to any existing one,
it was just an attempt to use a more sensible default value. I do think
m_w_m would be fine, but I can live with using an explicit value.

So that's what I did in the attached patch - I've renamed the GUC to
logical_decoding_work_mem, detached it from m_w_m and set the default to
64MB (i.e. the same default as m_w_m). It should also fix all the issues
from the recent reviews (at least I believe so).

I've realized that one of the subsequent patches allows overriding the
limit for individual subscriptions (in the CREATE SUBSCRIPTION command).
I think it'd be good to move this bit forward, but I think it can be
done in a separate patch.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Add-logical_decoding_work_mem-to-limit-ReorderBuffer.patch.gz application/gzip 10.2 KB
0002-Immediately-WAL-log-assignments.patch.gz application/gzip 6.6 KB
0003-Issue-individual-invalidations-with-wal_level-logica.patch.gz application/gzip 4.9 KB
0004-Extend-the-output-plugin-API-with-stream-methods.patch.gz application/gzip 6.1 KB
0005-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.patch.gz application/gzip 2.5 KB
0006-Support-decoding-of-two-phase-transactions-at-PREPAR.patch.gz application/gzip 9.0 KB
0007-Gracefully-handle-concurrent-aborts-of-uncommitted.patch.gz application/gzip 4.4 KB
0008-Teach-test_decoding-plugin-to-work-with-2PC.patch.gz application/gzip 5.8 KB
0009-Implement-streaming-mode-in-ReorderBuffer.patch.gz application/gzip 12.6 KB
0010-Add-support-for-streaming-to-built-in-replication.patch.gz application/gzip 19.8 KB
0011-Track-statistics-for-streaming-spilling.patch.gz application/gzip 4.2 KB
0012-Enable-streaming-for-all-subscription-TAP-tests.patch.gz application/gzip 2.3 KB
0013-BUGFIX-set-final_lsn-for-subxacts-before-cleanup.patch.gz application/gzip 628 bytes
0014-Add-TAP-test-for-streaming-vs.-DDL.patch.gz application/gzip 1.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-09-28 19:59:00 Re: Usage of the system truststore for SSL certificate validation
Previous Message David Steele 2019-09-28 17:49:29 Re: Standby accepts recovery_target_timeline setting?