From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: long-standing data loss bug in initial sync of logical replication |
Date: | 2025-02-24 09:49:49 |
Message-ID: | CAA4eK1KDPryBr-ewsW_Rkh5x0oiqpcOHAF3GAocDUrK2CV+atA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Dec 11, 2024 at 12:37 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> I confirmed that the proposed patch fixes these issues. I have one
> question about the patch:
>
> In the main loop in SnapBuildDistributeSnapshotAndInval(), we have the
> following code:
>
> /*
> * If we don't have a base snapshot yet, there are no changes in this
> * transaction which in turn implies we don't yet need a snapshot at
> * all. We'll add a snapshot when the first change gets queued.
> *
> * NB: This works correctly even for subtransactions because
> * ReorderBufferAssignChild() takes care to transfer the base snapshot
> * to the top-level transaction, and while iterating the changequeue
> * we'll get the change from the subtxn.
> */
> if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, txn->xid))
> continue;
>
> Is there any case where we need to distribute inval messages to
> transactions that don't have the base snapshot yet but eventually need
> the inval messages?
>
Good point. It is mentioned that for snapshots: "We'll add a snapshot
when the first change gets queued.". I think we achieve this via
builder->committed.xip array such that when we set a base snapshot for
a transaction, we use that array to form a snapshot. However, I don't
see any such consideration for invalidations. Now, we could either
always add invalidations to xacts that don't have base_snapshot yet or
have a mechanism similar committed.xid array. But it is better to
first reproduce the problem.
> Overall, with this idea, we distribute invalidation messages to all
> concurrent decoded transactions. It could introduce performance
> regressions by several causes. For example, we could end up
> invalidating RelationSyncCache entries in more cases. While this is
> addressed by your selectively cache invalidation patch, there is still
> 5% regression. We might need to accept a certain amount of regressions
> for making it correct but it would be better to figure out where these
> regressions come from. Other than that, I think the performance
> regression could happen due to the costs of distributing invalidation
> messages. You've already observed there is 1~3% performance regression
> in cases where we distribute a large amount of invalidation messages
> to one concurrently decoded transaction[1]. I guess that the
> selectively cache invalidation idea would not help this case. Also, I
> think we might want to test other cases like where we distribute a
> small amount of invalidation messages to many concurrently decoded
> transactions.
>
+1.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Jelte Fennema-Nio | 2025-02-24 09:56:23 | Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup |
Previous Message | Andres Freund | 2025-02-24 09:41:36 | Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing |