| From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
| Subject: | PGConf.dev CSN unconference session: notes and follow-up discussion takeaways |
| Date: | 2026-05-27 08:56:57 |
| Message-ID: | CAEze2WiCsHGTjrBFAvTJ4QfZKZtdb95tXTcT2_y4_t_kQ+q96Q@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,
First, I want to inform you that I've added the notes I took in the CSN
unconference session to the PGConf.dev 2026 unconference wiki page [0].
They're in rough shape, as I was unable to both take notes and
participate at the same time, so some parts of the conversation are
missing. I invite anyone with more notes (or a better memory than mine)
to add any missing parts.
Second, I'd like to share a few takeaways from the CSN session and
subsequent hallway track discussions, as a possible starting point for
further discussion:
1. It is generally agreed that the primary source of complications in
CSN (and snapshotting in general) is the mismatch between *visibility
semantics* and *durability semantics*, seen mainly through the
synchronous_commit (s_c) setting.
2. Visibility of s_c=off commits ("async commits") is immediate, but
some users with s_c=on ("sync commits") or s_c=remote_{apply|write}
("remote commits") may not want to see not-yet-durable async commits.
2a. It was suggested to allow sync-commit sessions to wait for such
async commits' commit LSN to become sufficiently durable if they need
to read those async commits' data.
2b. It was also suggested to make async commits wait for durability
of sync commits' CSN [^1]. A counterpoint to this would be that it'd be
a heavy penalty for async commits that need to read data that has
recently been modified by non-async commits.
3. There was no clearly articulated consensus that it is necessary for
the CSN work to fix our Long Fork [2] issue [3] (different visibility
order between primary and replica). See also points 5 and 6.
4. The main consensus from the session seems to be that the commit-record
LSN would work as a natural CSN on replicas, and that this would not
change current replica visibility semantics.
5. Not everyone agreed that the LSN of commit records as such is
sufficient as CSN for primaries:
5a. Visibility order of sync commits vs async commits is the primary
issue here; a session with only async commits is able to handle any
number of transactions whilst another session with s_c=remote_apply
("remote commit") may take forever to get confirmed and become visible.
5b. A suggested solution to visibility ordering issues was to log a
'commit visible' record for transactions whose COMMIT record has reached
its durability requirement, and use that record as CSN. This record
could be shared by multiple commits, in a way that's similar to how
commit_delay/commit_siblings combine WAL fsyncs, to limit the net new
WAL generation per commit. The record would be optional (or rather,
implied) when the primary is lost before the visibility record is
logged. This new 'commit visible' record would be comparable to 2PC,
with the main differences being that it would not allow rollbacks, and
that every committed but not-yet-visible transaction would automatically
become visible once recovery ends or a standby promotes.
6. It was noted that it is not even strictly necessary to use LSNs as
CSN on the primary:
6a. A local in-memory counter could be used to generate the (unlogged)
CSNs only when visibility is achieved. This would allow us to implement
visibility semantics on the primary that behave equivalently to its
current behaviour. Whilst this wouldn't solve the Long Fork issue, it
would enable the benefits of CSN snapshots on the primary.
6b. It was mentioned that this approach could take more effort than
just using LSN-based CSNs.
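
To make 6a a bit more concrete, here is a minimal Python sketch of an
in-memory CSN counter that is only advanced when a commit becomes
visible. It is an illustration under my own assumptions, not PostgreSQL
code: the class and method names (CsnAllocator, make_visible,
take_snapshot, is_visible) and the xids are hypothetical, and it ignores
in-progress transactions, subtransactions, locking, and crash recovery.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CsnAllocator:
    """Hypothetical in-memory, unlogged CSN source for the primary."""

    next_csn: int = 1
    # xid -> CSN. A missing entry means the transaction has not (yet)
    # become visible, e.g. it still waits for its durability requirement.
    csn_by_xid: Dict[int, int] = field(default_factory=dict)

    def make_visible(self, xid: int) -> int:
        """Assign a CSN at the moment the commit becomes visible."""
        csn = self.next_csn
        self.next_csn += 1
        self.csn_by_xid[xid] = csn
        return csn

    def take_snapshot(self) -> int:
        """A snapshot is simply the highest CSN handed out so far."""
        return self.next_csn - 1

    def is_visible(self, xid: int, snapshot_csn: int) -> bool:
        """A commit is visible if it got a CSN at or before the snapshot."""
        csn = self.csn_by_xid.get(xid)
        return csn is not None and csn <= snapshot_csn


if __name__ == "__main__":
    alloc = CsnAllocator()

    # Assumed scenario: xid 100 (sync commit) wrote its commit record
    # first but still waits for durability; xid 101 (async commit) wrote
    # its commit record later and becomes visible immediately.
    alloc.make_visible(101)
    snap_early = alloc.take_snapshot()
    assert alloc.is_visible(101, snap_early)
    assert not alloc.is_visible(100, snap_early)

    # Later, xid 100 reaches its durability requirement.
    alloc.make_visible(100)
    assert not alloc.is_visible(100, snap_early)  # old snapshot unchanged
    snap_late = alloc.take_snapshot()
    assert alloc.is_visible(100, snap_late)
    assert alloc.is_visible(101, snap_late)
```

In this scenario the CSN order (101 before 100) differs from the order
of the commit records in WAL (100 before 101), which is the reason this
approach on its own would not address the Long Fork issue if replicas
use commit-record LSN order.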
I hope this has been informative and can help move discussions about
this feature forward.
Kind regards,
Matthias van de Meent
Databricks (https://www.databricks.com)
[0]: https://wiki.postgresql.org/wiki/PGConf.dev_2026_Developer_Unconference#Commit_Sequence_Numbers
[^1]: In a world where the current WAL insert pointer is used to
construct a snapshot, and every commit only needs to log a single record
to become visible.
[2]: https://jepsen.io/consistency/phenomena/long-fork
[3]: https://jepsen.io/analyses/amazon-rds-for-postgresql-17.4