Re: Proposal for CSN based snapshots

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Ants Aasma <ants(at)cybertec(dot)at>, Bruce Momjian <bruce(at)momjian(dot)us>, obartunov <obartunov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Proposal for CSN based snapshots
Date: 2016-08-09 12:16:55
Message-ID: 69041c5a-ec60-3c54-49d0-5df071b27f52@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(Reviving an old thread)

I spent some time dusting off this old patch, to implement CSN
snapshots. Attached is a new patch, rebased over current master, and
with tons of comments etc. cleaned up. There's more work to be done
here, I'm posting this to let people know I'm working on this again. And
to have a backup on the 'net :-).

I switched to using a separate counter for CSNs. CSN is no longer the
same as the commit WAL record's LSN. While I liked the conceptual
simplicity of CSN == LSN a lot, and the fact that the standby would see
the same commit order as master, I couldn't figure out how to make async
commits to work.

Next steps:

* Hot standby feedback is broken, now that CSN != LSN again. Will have
to switch this back to using an "oldest XID", rather than a CSN.

* I plan to replace pg_subtrans with a special range of CSNs in the
csnlog. Something like, start the CSN counter at 2^32 + 1, and use CSNs
< 2^32 to mean "this is a subtransaction, parent is XXX". One less SLRU
to maintain.

* Put per-proc xmin back into procarray. I removed it, because it's not
necessary for snapshots or GetOldestSnapshot() (which replaces
GetOldestXmin()) anymore. But on second thoughts, we still need it for
deciding when it's safe to truncate the csnlog.

* In this patch, HeapTupleSatisfiesVacuum() is rewritten to use an
"oldest CSN", instead of "oldest xmin", but that's not strictly
necessary. To limit the size of the patch, I might revert those changes
for now.

* Rewrite the way RecentGlobalXmin is updated. As Alvaro pointed out in
his review comments two years ago, that was quite complicated. And I'm
worried that the lazy scheme I had might not allow pruning fast enough.
I plan to make it more aggressive, so that whenever the currently oldest
transaction finishes, it's responsible for advancing the "global xmin"
in shared memory. And the way it does that, is by scanning the csnlog,
starting from the current "global xmin", until the next still
in-progress XID. That could be a lot, if you have a very long-running
transaction that ends, but we'll see how it performs.

* Performance testing. Clearly this should have a performance benefit,
at least under some workloads, to be worthwhile. And not regress.

- Heikki

Attachment Content-Type Size
csn-3.patch.gz application/gzip 119.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shay Rojansky 2016-08-09 12:59:25 Re: Slowness of extended protocol
Previous Message Bruce Momjian 2016-08-09 11:45:11 Re: Wait events monitoring future development