Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Önder Kalacı <onderkalaci(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-13 11:48:13
Message-ID: CAA4eK1KBKtsi8zv-OhZd4XR3qcqHXvkyBwW_gZUSNRqwZTghfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 13, 2023 at 2:44 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Sat, Mar 11, 2023 at 6:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Mar 10, 2023 at 7:58 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
> > >
> > >
> > > I think one option could be to drop some cases altogether, but not sure we'd want that.
> > >
> > > As a semi-related question: Are you aware of any setting that'd make pg_stat_all_indexes
> > > reflect the changes sooner? It is hard to debug what is the bottleneck in the tests, but
> > > I have a suspicion that there might be several poll_query_until() calls on
> > > pg_stat_all_indexes, which might be the reason?
> > >
> >
> > Yeah, I also think poll_query_until() calls on pg_stat_all_indexes is
> > the main reason for these tests taking more time. When I commented
> > those polls, it drastically reduces the test time. On looking at
> > pgstat_report_stat(), it seems we don't report stats sooner than 1s
> > and as most of this patch's test relies on stats, it leads to taking
> > more time. I don't have a better idea to verify this patch without
> > checking whether the index scan is really used by referring to
> > pg_stat_all_indexes. I think trying to reduce the poll calls may help
> > in reducing the test timings further. Some ideas on those lines are as
> > follows:
>
> If the reason for the stats polling was only to know if some index is
> chosen or not, I was wondering if you can just convey the same
> information to the TAP test via some conveniently placed (DEBUG?)
> logging.
>

I had thought about it but didn't convince myself that it would be a
better approach because it would LOG a lot of messages for bulk
updates/deletes. Note for each row update on the publisher a new
index/sequence scan will happen. So, instead, I tried to further
change the test cases to remove unnecessary parts. I have changed
below tests:

1.
+# subscriber gets the missing table information
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_rep_full REFRESH PUBLICATION");

This and the follow-on test was not required after we have removed
Dropped columns test.

2. Reduce the number of updates/deletes in the first test to two rows.

3. Removed the cases for dropping the index. This ensures that after
dropping the index on the table we switch to either an index scan (if
a new index is created) or to a sequence scan. It doesn't seem like a
very interesting case to me.

Apart from the above, I have removed the explicit setting of
'wal_retrieve_retry_interval = 1ms' as the same is not done for any
other subscription tests. I know setting wal_retrieve_retry_interval
avoids the launcher sometimes taking more time to launch apply worker
but it is better to be consistent. See the changes in
changes_amit_1.patch, if you agree with the same then please include
them in the next version.

After doing the above, the test time on my machine is closer to what
other tests take which is ~5s.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v45_0001_use_index_on_subs_when_pub_rep_ident_full.patch application/octet-stream 47.5 KB
changes_amit_1.patch application/octet-stream 8.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 'Sandro Santilli' 2023-03-13 11:59:16 Re: Ability to reference other extensions by schema in extension scripts
Previous Message shiy.fnst@fujitsu.com 2023-03-13 10:53:03 RE: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL