Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-08-31 07:54:30
Message-ID: CAA4eK1+do8Fdb2q8jDCdJPZCuEzkvsv2STc9LFNCBoFOKUd5HQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 31, 2020 at 10:49 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, Aug 30, 2020 at 2:43 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
>
> Another comment:
>
> +cleanup_rel_sync_cache(TransactionId xid, bool is_commit)
> +{
> + HASH_SEQ_STATUS hash_seq;
> + RelationSyncEntry *entry;
> +
> + Assert(RelationSyncCache != NULL);
> +
> + hash_seq_init(&hash_seq, RelationSyncCache);
> + while ((entry = hash_seq_search(&hash_seq)) != NULL)
> + {
> + if (is_commit)
> + entry->schema_sent = true;
>
> How is it correct to set 'entry->schema_sent' for all the entries in
> RelationSyncCache? Consider a case where due to invalidation in an
> unrelated transaction we have set the flag schema_sent for a
> particular relation 'r1' as 'false' and that transaction is executed
> before the current streamed transaction for which we are performing
> commit and called this function. It will set the flag for unrelated
> entry in this case 'r1' which doesn't seem correct to me. Or, if this
> is correct, it would be a good idea to write some comments about it.
>

Few more comments:
1.
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql('postgres',
+"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr
application_name=$appname' PUBLICATION tap_pub"
+);

In most of the tests, we are using the above statement to create a
subscription. Don't we need (streaming = 'on') parameter while
creating a subscription? Is there a reason for not doing so in this
patch itself?

2.
009_stream_simple.pl
+# Insert, update and delete enough rows to exceed the 64kB limit.
+$node_publisher->safe_psql('postgres', q{
+BEGIN;
+INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, 5000) s(i);
+UPDATE test_tab SET b = md5(b) WHERE mod(a,2) = 0;
+DELETE FROM test_tab WHERE mod(a,3) = 0;
+COMMIT;
+});

How much above this data is 64kB limit? I just wanted to see that it
should not be on borderline and then due to some alignment issues the
streaming doesn't happen on some machines? Also, how such a test
ensures that the streaming has happened because the way we are
checking results, won't it be the same for the non-streaming case as
well?

3.
+# Change the local values of the extra columns on the subscriber,
+# update publisher, and check that subscriber retains the expected
+# values
+$node_subscriber->safe_psql('postgres', "UPDATE test_tab SET c =
'epoch'::timestamptz + 987654321 * interval '1s'");
+$node_publisher->safe_psql('postgres', "UPDATE test_tab SET b = md5(a::text)");
+
+wait_for_caught_up($node_publisher, $appname);
+
+$result =
+ $node_subscriber->safe_psql('postgres', "SELECT count(*),
count(extract(epoch from c) = 987654321), count(d = 999) FROM
test_tab");
+is($result, qq(3334|3334|3334), 'check extra columns contain locally
changed data');

Again, how this test is relevant to streaming mode?

4. I have checked that in one of the previous patches, we have a test
v53-0004-Add-TAP-test-for-streaming-vs.-DDL which contains a test case
quite similar to what we have in
v55-0002-Add-support-for-streaming-to-built-in-logical-re/013_stream_subxact_ddl_abort.
If there is any difference that can cover more scenarios then can we
consider merging them into one test?

Apart from the above, I have made a few changes in the attached patch
which are mainly to simplify the code at one place, added/edited few
comments, some other cosmetic changes, and renamed the test case files
as the initials of their name were matching other tests in the similar
directory.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v56-0001-Fix-the-SharedFileSetUnregister-API.patch application/octet-stream 1.9 KB
v56-0002-Add-support-for-streaming-to-built-in-logical-re.patch application/octet-stream 97.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-08-31 07:56:06 Re: Switch to multi-inserts for pg_depend
Previous Message gkokolatos 2020-08-31 07:47:35 Re: v13: show extended stats target in \d