Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2022-09-06 11:13:14
Message-ID: CAA4eK1KQMH1U=OpAKSx3GE=3jXCtE=p2JMXTp=46QbU9nMV8ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Aug 20, 2022 at 4:32 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
> I'm a little late to catch up with your comments, but here are my replies:
>
>> > My answer for the above assumes that your question is regarding what happens if you ANALYZE on a partitioned table. If your question is something different, please let me know.
>> >
>>
>> I was talking about inheritance cases, something like:
>> create table tbl1 (a int);
>> create table tbl1_part1 (b int) inherits (tbl1);
>> create table tbl1_part2 (c int) inherits (tbl1);
>>
>> What we do in such cases is documented as: "if the table being
>> analyzed has inheritance children, ANALYZE gathers two sets of
>> statistics: one on the rows of the parent table only, and a second
>> including rows of both the parent table and all of its children. This
>> second set of statistics is needed when planning queries that process
>> the inheritance tree as a whole. The child tables themselves are not
>> individually analyzed in this case."
>
>
> Oh, I haven't considered inherited tables. That seems right, the statistics of the children are not updated when the parent is analyzed.
>
>>
>> Now, the point I was worried about was what if the changes in child
>> tables (*_part1, *_part2) are much more than in tbl1? In such cases,
>> we may not invalidate child rel entries, so how will logical
>> replication behave for updates/deletes on child tables? There may not
>> be any problem here but it is better to do some analysis of such cases
>> to see how it behaves.
>
>
> I also haven't observed any specific issues. In the end, when the user (or autovacuum) does ANALYZE on the child, it is when the statistics are updated for the child.
>

Right, I also think that should be the behavior but I have not
verified it. However, I think it should be easy to verify if
autovacuum updates the stats for child tables when we operate on only
one of such tables and whether that will invalidate the cache for our
case.

> Although I do not have much experience with inherited tables, this sounds like the expected behavior?
>
> I also pushed a test covering inherited tables. First, a basic test on the parent. Then, show that updates on the parent can also use indexes of the children. Also, after an ANALYZE on the child, we can re-calculate the index and use the index with a higher cardinality column.
>
>>
>> > Also, for the majority of the use-cases, I think we'd probably expect an index on a column with high cardinality -- hence use index scan. So, bitmap index scans are probably not going to be that much common.
>> >
>>
>> You are probably right here but I don't think we can make such
>> assumptions. I think the safest way to avoid any regression here is to
>> choose an index when the planner selects an index scan. We can always
>> extend it later to bitmap scans if required. We can add a comment
>> indicating the same.
>>
>
> Alright, I got rid of the bitmap scans.
>
> Though, it caused few of the new tests to fail. I think because of the data size/distribution, the planner picks bitmap scans. To make the tests consistent and small, I added `enable_bitmapscan to off` for this new test file. Does that sound ok to you? Or, should we change the tests to make sure they genuinely use index scans?
>

That sounds okay to me.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2022-09-06 11:25:22 Re: Fix possible bogus array out of bonds (src/backend/access/brin/brin_minmax_multi.c)
Previous Message Anton A. Melnikov 2022-09-06 11:02:53 May be BUG. Periodic burst growth of the checkpoint_req counter on replica.