| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Proposal: Conflict log history table for Logical Replication |
| Date: | 2025-12-01 09:41:56 |
| Message-ID: | CAA4eK1+tW8_LiTt1ZCGpH06fq4SpyUaduqtapAT1PUHVKBGrxg@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > Since there is a concern that multiple rows for
> > > multiple_unique_conflicts can cause data-bloat, it made me rethink
> > > that this is actually more prone to causing data-bloat if it is not
> > > resolved on time, as it seems a far more frequent scenario. So shall
> > > we keep inserting the record or insert it once and avoid inserting it
> > > again based on lsn? Thoughts?
> >
> > I agree, this is the real bloat problem, so maybe if the same tuple
> > already exists we can avoid inserting it again, although I haven't
> > thought about how to distinguish between a new conflict on the same
> > row and the same conflict being inserted multiple times due to a
> > worker restart.
> >
>
> If there is consensus on this approach, IMO, it appears safe to rely
> on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for
> the given 'conflict_type' before we insert a new record.
>
What happens if, as part of multiple_unique_conflicts, only some of the
rows conflict in the next apply round (say, because the user has removed
a few conflicting rows in the meantime)? I think the ideal way for users
to avoid such multiple occurrences is to configure the subscription with
disable_on_error. I think we should LOG errors again on retry, and it is
better to keep this consistent with what we print in the LOG, because we
may want to give users an option in the future to choose where to log
conflicts (in conflict_history_table, LOG, or both).
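
To illustrate the retry question, here is a rough sketch in Python. It is
purely illustrative and not taken from any patch: the record shape, the
local_keys field, the helper names, and the origin/LSN values are all
made up. It only assumes the comparison key proposed upthread
(conflict_type, remote_origin, remote_commit_lsn).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    """Hypothetical shape of one conflict record (not the real table layout)."""
    conflict_type: str
    remote_origin: str
    remote_commit_lsn: str
    local_keys: frozenset  # assumed: keys of the local rows that conflicted


def dedup_key(c):
    # Comparison key proposed upthread; it ignores which local rows conflicted.
    return (c.conflict_type, c.remote_origin, c.remote_commit_lsn)


def log_with_dedup(history, seen, c):
    """Append c unless a record with the same key was already logged."""
    key = dedup_key(c)
    if key in seen:
        return False
    seen.add(key)
    history.append(c)
    return True


# First apply attempt: the remote change conflicts with three local rows.
first = Conflict("multiple_unique_conflicts", "pg_16400", "0/1A2B3C8",
                 frozenset({1, 2, 3}))
# Retry after the user removed two of those rows: only one row conflicts now.
retry = Conflict("multiple_unique_conflicts", "pg_16400", "0/1A2B3C8",
                 frozenset({1}))

history, seen = [], set()
logged_first = log_with_dedup(history, seen, first)
logged_retry = log_with_dedup(history, seen, retry)
```

With this key the retry is skipped, so the only stored record still
lists all three rows even though just one of them conflicts now.
Logging again on every retry avoids that stale record, at the cost of
the extra rows discussed above.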
--
With Regards,
Amit Kapila.