| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Proposal: Conflict log history table for Logical Replication |
| Date: | 2025-12-03 04:19:10 |
| Message-ID: | CAJpy0uDe724nY59j-8hMapZ_Fru1Wo-NucF4Ea1B3Jrw=+J+UQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Dec 2, 2025 at 8:40 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > Okay, I will try to make multiple local rows as JSON Array in the next version.
> > > > >
> > > > Just to clarify so that we are on the same page, along with the local
> > > > tuple the other local fields like local_xid, local_commit_ts,
> > > > local_origin will also be converted into the array. Hope that makes
> > > > sense?
> > > >
> > >
> > > Yes, what about key_tuple or RI?
> > >
> > > > So we will change the table like this, not sure if this makes sense to
> > > > keep all local array fields nearby in the table, or let it be near the
> > > > respective remote field, like we are doing now remote_xid and local
> > > > xid together etc.
> > > >
> > >
> > > It is better to keep the array fields together at the end. I think it
> > > would be better to read via CLI. Also, it may take more space due to
> > > padding/alignment if we store fixed-width and variable-width columns
> > > interleaved and similarly the access will also be slower for
> > > interleaved cases.
> > >
> > > Having said that, can we consider an alternative way to store all
> > > local_conflict_info together as a JSONB column (that can be used to
> > > store an array of objects). For example, the multiple conflicting
> > > tuple information can be stored as:
> > >
> > > [
> > > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
> > > "node_A", "tuple": { "id": 1, "email": "a(at)b(dot)com" } },
> > > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
> > > "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
> > > ]
> > >
> > > To access JSON array columns, I think one needs to use the unnest
> > > function, whereas JSONB could be accessed with something like: "SELECT
> > > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]".
> >
> > Yeah we can do that as well, maybe that's a better idea compared to
> > creating separate array fields for each local element.
>
> So I tried the POC idea with this approach and tested with one of the
> test cases given by Shveta, and now the conflict log table entry looks
> like this. So we can see the local conflicts field which is an array
> of JSON and each entry of the array is formed using (xid, commit_ts,
> origin, json tuple). I will send the updated patch by tomorrow after
> doing some more cleanup and testing.
>
> relid | 16391
> schemaname | public
> relname | conf_tab
> conflict_type | multiple_unique_conflicts
> remote_xid | 761
> remote_commit_lsn | 0/01761400
> remote_commit_ts | 2025-12-02 15:02:07.045935+00
> remote_origin | pg_16406
> key_tuple |
> remote_tuple | {"a":2,"b":3,"c":4}
> local_conflicts |
> {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
>
Thanks, it looks good. For the benefit of others, could you include a
brief note, perhaps in the commit message for now, describing how to
access or read this array column? We can remove it later.
thanks
Shveta
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Pavel Stehule | 2025-12-03 04:27:04 | Re: proposal: schema variables |
| Previous Message | Michael Paquier | 2025-12-03 04:16:38 | Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update) |