Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-09-02 10:24:26
Message-ID: CAFiTN-tMcVm+yVKgzwNF5y-Xcx3HegLpZq+p7Y=MQpi3i_ObHw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 1, 2020 at 8:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Sep 1, 2020 at 9:28 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Aug 31, 2020 at 7:28 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Aug 31, 2020 at 1:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > In functions cleanup_rel_sync_cache and
> > get_schema_sent_in_streamed_txn, let's cast the result of lfirst_int
> > to uint32 as suggested by Tom [1]. Also, let's keep the way we
> > compare xids consistent in both functions, i.e., if (xid ==
> > lfirst_int(lc)).
> >
>
> Fixed this in the attached patch.
>
> > The behavior tested by the test case added for this is not clear,
> > primarily because of the comments.
> >
> > +++ b/src/test/subscription/t/021_stream_schema.pl
> > @@ -0,0 +1,80 @@
> > +# Test behavior with streaming transaction exceeding logical_decoding_work_mem
> > ...
> > +# large (streamed) transaction with DDL, DML and ROLLBACKs
> > +$node_publisher->safe_psql('postgres', q{
> > +BEGIN;
> > +ALTER TABLE test_tab ADD COLUMN c INT;
> > +INSERT INTO test_tab SELECT i, md5(i::text), i FROM
> > generate_series(3,3000) s(i);
> > +ALTER TABLE test_tab ADD COLUMN d INT;
> > +COMMIT;
> > +});
> > +
> > +# large (streamed) transaction with DDL, DML and ROLLBACKs
> > +$node_publisher->safe_psql('postgres', q{
> > +BEGIN;
> > +INSERT INTO test_tab SELECT i, md5(i::text), i, i FROM
> > generate_series(3001,3005) s(i);
> > +COMMIT;
> > +});
> > +wait_for_caught_up($node_publisher, $appname);
> >
> > I understand how this test exercises the schema_sent functionality,
> > but neither the comments atop the file nor those atop the test case
> > explain it clearly.
> >
>
> Added comments for this test.
>
> > > > Few more comments:
> >
> > >
> > > > 2.
> > > > 009_stream_simple.pl
> > > > +# Insert, update and delete enough rows to exceed the 64kB limit.
> > > > +$node_publisher->safe_psql('postgres', q{
> > > > +BEGIN;
> > > > +INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i);
> > > > +UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
> > > > +DELETE FROM test_tab WHERE mod(a,3) = 0;
> > > > +COMMIT;
> > > > +});
> > > >
> > > > How much above the 64kB limit is this data? I just wanted to make
> > > > sure it is not borderline, such that due to some alignment issues
> > > > the streaming doesn't happen on some machines. Also, how does such
> > > > a test ensure that the streaming has happened? The way we are
> > > > checking results, won't they be the same for the non-streaming
> > > > case as well?
> > >
> > > Only for this case, or you mean for all the tests?
> > >
> >
>
> I have not done this yet.
Most of the test cases generate more than 100kB of data and a few are
around 72kB. Please find the per-test data sizes below:

015 - 200kB
016 - 150kB
017 - 72kB
018 - 72kB before the first ROLLBACK TO SAVEPOINT, ~100kB in total
019 - 76kB before the first ROLLBACK TO SAVEPOINT, ~100kB in total
020 - 150kB
021 - 100kB
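
To double-check such numbers, one rough recipe is to diff WAL
positions around the test transaction in psql, using the insert from
009_stream_simple.pl as an example. This is only a sketch (WAL bytes
merely approximate the decoder's memory accounting), but it confirms
we are comfortably above the 64kB logical_decoding_work_mem the tests
configure:

SELECT pg_current_wal_lsn() AS before_lsn \gset
BEGIN;
INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i);
COMMIT;
-- WAL generated by the transaction above
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn'));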

> > It is better to do it for all tests and I have clarified this in my
> > next email sent yesterday [2] where I have raised a few more comments
> > as well. I hope you have not missed that email.

I saw it; I think I replied to this before seeing that email.

> > > > 3.
> > > > +# Change the local values of the extra columns on the subscriber,
> > > > +# update publisher, and check that subscriber retains the expected
> > > > +# values
> > > > +$node_subscriber->safe_psql('postgres', "UPDATE test_tab SET c =
> > > > 'epoch'::timestamptz + 987654321 * interval '1s'");
> > > > +$node_publisher->safe_psql('postgres', "UPDATE test_tab SET b = md5(a::text)");
> > > > +
> > > > +wait_for_caught_up($node_publisher, $appname);
> > > > +
> > > > +$result =
> > > > + $node_subscriber->safe_psql('postgres', "SELECT count(*),
> > > > count(extract(epoch from c) = 987654321), count(d = 999) FROM
> > > > test_tab");
> > > > +is($result, qq(3334|3334|3334), 'check extra columns contain locally
> > > > changed data');
> > > >
> > > > Again, how is this test relevant to streaming mode?
> > >
> > > I agree, it is not specific to the streaming.
> > >
>
> I think we can leave this for now. After the stats patches by
> Sawada-San and Ajin are committed, we might be able to improve this test.

Makes sense to me.
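
Once those stats patches are in, the test could also assert that
streaming actually happened instead of only comparing the final table
contents, e.g. something along these lines (the view and column names
here follow the proposed patch and may well change):

SELECT slot_name, stream_txns, stream_count, stream_bytes
FROM pg_stat_replication_slots;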

> > +sub wait_for_caught_up
> > +{
> > + my ($node, $appname) = @_;
> > +
> > + $node->poll_query_until('postgres',
> > +"SELECT pg_current_wal_lsn() <= replay_lsn FROM pg_stat_replication
> > WHERE application_name = '$appname';"
> > + ) or die "Timed ou
> >
> > The patch has added this in all the test files. If it is used in so
> > many tests then we need to add it in some generic place
> > (PostgresNode.pm), but actually, I am not sure we need it at all. Why
> > can't the existing wait_for_catchup in PostgresNode.pm serve the same
> > purpose?
> >
>
> Changed as per this suggestion.

Okay.

> > 2.
> > In system_views.sql,
> >
> > -- All columns of pg_subscription except subconninfo are readable.
> > REVOKE ALL ON pg_subscription FROM public;
> > GRANT SELECT (subdbid, subname, subowner, subenabled, subbinary,
> > subslotname, subpublications)
> > ON pg_subscription TO public;
> >
> > Here, we need to update for substream column as well.
> >
>
> Fixed.

LGTM
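
That is, the grant in system_views.sql should now read roughly as
follows (a sketch; the exact formatting is whatever the patch has):

REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (subdbid, subname, subowner, subenabled, subbinary,
              substream, subslotname, subpublications)
ON pg_subscription TO public;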

> > 3. Update describeSubscriptions() to show the 'substream' value in \dRs.
> >
> > 4. Also, let's add a few tests in subscription.sql as we did for the
> > 'binary' option in commit 9de77b5453.
> >
>
> Fixed both the above comments.

Ok
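
Right, tests along the lines of the ones added for the 'binary'
option in 9de77b5453 should be enough, e.g. (a sketch only; the
statements in the patch may differ):

CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist'
    PUBLICATION testpub WITH (connect = false, streaming = true);
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
\dRs+
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
DROP SUBSCRIPTION regress_testsub;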

> > 5. I think we can merge the pg_dump related changes (the last version
> > posted in the mail thread is v53-0005-Add-streaming-option-in-pg_dump)
> > into the main patch. One minor comment on the pg_dump changes:
> > @@ -4358,6 +4369,8 @@ dumpSubscription(Archive *fout, SubscriptionInfo *subinfo)
> > if (strcmp(subinfo->subbinary, "t") == 0)
> > appendPQExpBuffer(query, ", binary = true");
> >
> > + if (strcmp(subinfo->substream, "f") != 0)
> > + appendPQExpBuffer(query, ", streaming = on");
> > if (strcmp(subinfo->subsynccommit, "off") != 0)
> > appendPQExpBuffer(query, ", synchronous_commit = %s",
> > fmtId(subinfo->subsynccommit));
> >
> > Keep one blank line between the substream and subsynccommit option
> > code to keep it consistent with the nearby code.
> >
>
> Changed as per this suggestion.

Ok

> I have fixed all the comments except the ones below.
> 1. Verify the data size of the various tests to ensure that it is
> above logical_decoding_work_mem.
> 2. I have checked that in one of the previous patches, we have a test
> v53-0004-Add-TAP-test-for-streaming-vs.-DDL which contains a test case
> quite similar to what we have in
> v55-0002-Add-support-for-streaming-to-built-in-logical-re/013_stream_subxact_ddl_abort.
> If there is any difference that covers more scenarios, can we consider
> merging them into one test?
> 3. +# Change the local values of the extra columns on the subscriber,
> +# update publisher, and check that subscriber retains the expected
> +# values
> +$node_subscriber->safe_psql('postgres', "UPDATE test_tab SET c =
> 'epoch'::timestamptz + 987654321 * interval '1s'");
> +$node_publisher->safe_psql('postgres', "UPDATE test_tab SET b = md5(a::text)");
> +
> +wait_for_caught_up($node_publisher, $appname);
> +
> +$result =
> + $node_subscriber->safe_psql('postgres', "SELECT count(*),
> count(extract(epoch from c) = 987654321), count(d = 999) FROM
> test_tab");
> +is($result, qq(3334|3334|3334), 'check extra columns contain locally
> changed data');
>
> Again, how is this test relevant to streaming mode?
> 4. Apart from the above, I think we should try to minimize the test
> cases committed with the base patch. We can add more tests later.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
