Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2022-08-13 05:10:50
Message-ID: CAA4eK1J3fFvQUGoOPCsyUxwpXMBEA007CnEJd3qHw8jJngKq8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 8, 2022 at 9:29 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
> Hi,
>
> Thanks for the feedback, see my reply below.
>
>> >
>> > It turns out that we already invalidate the relevant entries in LogicalRepRelMap/LogicalRepPartMap when "ANALYZE" (or VACUUM) updates any of the statistics in pg_class.
>> >
>> > The call-stack for analyze is roughly:
>> > do_analyze_rel()
>> > -> vac_update_relstats()
>> > -> heap_inplace_update()
>> > -> if needs to apply any statistical change
>> > -> CacheInvalidateHeapTuple()
>> >
>>
>> Yeah, it appears that this will work but I see that we don't update
>> here for inherited stats, how does it work for such cases?
>
>
> There, the expansion of the relation list to partitions happens one level above on the call stack. So, the call stack looks like the following:
>
> autovacuum_do_vac_analyze() (or ExecVacuum)
> -> vacuum()
> -> expand_vacuum_rel()
> -> rel_list=parent+children partitions
> -> for rel in rel_list
> ->analyze_rel()
> ->do_analyze_rel
> ... (and the same call stack as above)
>
> I also added one variation of a similar test for partitioned tables, which I earlier added for non-partitioned tables as well:
>
>> Added a test which triggers this behavior. The test is as follows:
>> - Create two indexes on the target, on column_a and column_b
>> - Initially load data such that the column_a has a high cardinality
>> - Show that we use the index on column_a on a child table
>> - Load more data such that the column_b has higher cardinality
>> - ANALYZE on the parent table
>> - Show that we use the index on column_b afterwards on the child table
>
>
> My answer for the above assumes that your question is regarding what happens if you ANALYZE on a partitioned table. If your question is something different, please let me know.
>

I was talking about inheritance cases, something like:
create table tbl1 (a int);
create table tbl1_part1 (b int) inherits (tbl1);
create table tbl1_part2 (c int) inherits (tbl1);

What we do in such cases is documented as: "if the table being
analyzed has inheritance children, ANALYZE gathers two sets of
statistics: one on the rows of the parent table only, and a second
including rows of both the parent table and all of its children. This
second set of statistics is needed when planning queries that process
the inheritance tree as a whole. The child tables themselves are not
individually analyzed in this case."

Now, the point I was worried about was what if the changes in child
tables (*_part1, *_part2) are much more than in tbl1? In such cases,
we may not invalidate child rel entries, so how will logical
replication behave for updates/deletes on child tables? There may not
be any problem here but it is better to do some analysis of such cases
to see how it behaves.

>>
>> >> BTW, do we want to consider partial indexes for the scan in this
>> >> context? I mean it may not have data of all rows so how that would be
>> >> usable?
>> >>
>> >
>
>> Few other comments:
>> ==================
>> 1. Why is it a good idea to choose the index selected even for the
>> bitmap path (T_BitmapHeapScan or T_BitmapIndexScan)? We use index scan
>> during update/delete, so not sure how we can conclude to use index for
>> bitmap paths.
>
>
> In our case, during update/delete we are searching for a single tuple on the target. And, it seems like using an index is probably going to be cheaper for finding the single tuple. In general, I thought we should use an index if the planner ever decides to use it with the given restrictions.
>

What about the case where the index has a lot of duplicate values? We
may need to retrieve multiple tuples in such cases.

> Also, for the majority of the use-cases, I think we'd probably expect an index on a column with high cardinality -- hence use index scan. So, bitmap index scans are probably not going to be that much common.
>

You are probably right here but I don't think we can make such
assumptions. I think the safest way to avoid any regression here is to
choose an index when the planner selects an index scan. We can always
extend it later to bitmap scans if required. We can add a comment
indicating the same.

> Still, I don't see a problem with using such indexes. Of course, it is possible that I might be missing something. Do you have any specific concerns in this area?
>
>>
>>
>> 2. The index info is built even on insert, so workload, where there
>> are no updates/deletes or those are not published then this index
>> selection work will go waste. Will it be better to do it at first
>> update/delete? One can say that it is not worth the hassle as anyway
>> it will be built the first time we perform an operation on the
>> relation or after the relation gets invalidated.
>
>
> With the current approach, the index (re)-calculation is coupled with (in)validation of the relevant cache entries. So, I'd argue for the simplicity of the code, we could afford to waste this small overhead? According to my local measurements, especially for large tables, the index oid calculation is mostly insignificant compared to the rest of the steps. Does that sound OK to you?
>
>
>>
>> If we think so, then
>> probably adding a comment could be useful.
>
>
> Yes, that is useful if you are OK with the above, added.
>

*
+ /*
+ * For insert-only workloads, calculating the index is not necessary.
+ * As the calculation is not expensive, we are fine to do here (instead
+ * of during first update/delete processing).
+ */

I think here instead of talking about cost, we should mention that it
is quite an infrequent operation i.e performed only when we first time
performs an operation on the relation or after invalidation. This is
because I think the cost is relative.

*
+
+ /*
+ * Although currently it is not possible for planner to pick a
+ * partial index or indexes only on expressions,

It may be better to expand this comment by describing a bit why it is
not possible in our case. You might want to give the function
reference where it is decided.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2022-08-13 08:39:06 Re: build remaining Flex files standalone
Previous Message Junwang Zhao 2022-08-13 01:03:29 Re: fix stale help message