Re: Proposal for CSN based snapshots

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Ants Aasma <ants(at)cybertec(dot)at>, Bruce Momjian <bruce(at)momjian(dot)us>, obartunov <obartunov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>
Subject: Re: Proposal for CSN based snapshots
Date: 2016-08-22 16:35:19
Message-ID: 389b45d1-f4f8-a333-03ef-efb94c8d6087@iki.fi
Lists: pgsql-hackers

And here's a new patch version. Still lots of work to do, especially in
performance testing, and minimizing the worst-case performance hit.

On 08/09/2016 03:16 PM, Heikki Linnakangas wrote:
> Next steps:
>
> * Hot standby feedback is broken, now that CSN != LSN again. Will have
> to switch this back to using an "oldest XID", rather than a CSN.
>
> * I plan to replace pg_subtrans with a special range of CSNs in the
> csnlog. Something like, start the CSN counter at 2^32 + 1, and use CSNs
> < 2^32 to mean "this is a subtransaction, parent is XXX". One less SLRU
> to maintain.
>
> * Put per-proc xmin back into procarray. I removed it, because it's not
> necessary for snapshots or GetOldestSnapshot() (which replaces
> GetOldestXmin()) anymore. But on second thoughts, we still need it for
> deciding when it's safe to truncate the csnlog.
>
> * In this patch, HeapTupleSatisfiesVacuum() is rewritten to use an
> "oldest CSN", instead of "oldest xmin", but that's not strictly
> necessary. To limit the size of the patch, I might revert those changes
> for now.

I did all of the above. This patch is now much smaller, as I didn't
change all the places that used to deal with global xmins, like I did
earlier. The oldest-xmin is now computed pretty much like it always has
been.
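
For illustration, the csnlog encoding described in the quoted plan (values
below 2^32 name a subtransaction's parent, real CSNs start at 2^32 + 1)
could be sketched roughly as follows. This is not code from the patch: the
type name CsnLogValue, the constant FIRST_REAL_CSN, the helper names, and
the choice to reserve 0 for "no status yet" are all assumptions made for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;   /* XIDs are 32 bits wide */
typedef uint64_t CsnLogValue;     /* hypothetical name for one csnlog slot */

/* Hypothetical constant: the CSN counter starts at 2^32 + 1. */
#define FIRST_REAL_CSN ((CsnLogValue) UINT32_MAX + 2)

/* Assumption: 0 is reserved to mean "no status recorded yet". */
static bool
csnlog_is_subtrans(CsnLogValue v)
{
    /* Any nonzero value below 2^32 is read as "parent is XID v". */
    return v != 0 && v < ((CsnLogValue) 1 << 32);
}

/* Store a parent XID in a slot that would otherwise hold a CSN. */
static CsnLogValue
csnlog_encode_parent(TransactionId parent)
{
    assert(parent != 0);
    return (CsnLogValue) parent;
}

/* Recover the parent XID from a slot known to hold one. */
static TransactionId
csnlog_get_parent(CsnLogValue v)
{
    assert(csnlog_is_subtrans(v));
    return (TransactionId) v;
}

/* A slot holds a real commit sequence number only at or above 2^32 + 1. */
static bool
csnlog_is_committed(CsnLogValue v)
{
    return v >= FIRST_REAL_CSN;
}
```

With a layout like this, one 64-bit slot per XID can answer both "who is
my parent?" and "when did I commit?", which is what makes a separate
pg_subtrans SLRU unnecessary.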

> * Rewrite the way RecentGlobalXmin is updated. As Alvaro pointed out in
> his review comments two years ago, that was quite complicated. And I'm
> worried that the lazy scheme I had might not allow pruning fast enough.
> I plan to make it more aggressive, so that whenever the currently oldest
> transaction finishes, it's responsible for advancing the "global xmin"
> in shared memory. And the way it does that, is by scanning the csnlog,
> starting from the current "global xmin", until the next still
> in-progress XID. That could be a lot, if you have a very long-running
> transaction that ends, but we'll see how it performs.

I ripped out all that, and created a GetRecentGlobalXmin() function that
computes a global-xmin value when needed, like GetOldestXmin() does.
Seems most straightforward. Since we no longer get a RecentGlobalXmin
value essentially for free in GetSnapshotData(), as we no longer scan
the proc array, it's better to compute the value only when needed.
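
A minimal sketch of what "compute it only when needed" might look like,
assuming each backend advertises an xmin in a shared array. ProcSlot,
xid_precedes() and compute_global_xmin() are hypothetical names for this
example, not the patch's GetRecentGlobalXmin(); the comparison is modeled
on the wraparound-aware logic of TransactionIdPrecedes() and ignores
special XIDs and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical stand-in for the per-backend xmin kept in the proc array. */
typedef struct ProcSlot
{
    TransactionId xmin;     /* InvalidTransactionId if no snapshot is held */
} ProcSlot;

/* Wraparound-aware "a is older than b", valid for normal XIDs only. */
static int
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Scan all slots and return the oldest advertised xmin. If no backend
 * holds a snapshot, fall back to next_xid, the next XID to be assigned.
 */
static TransactionId
compute_global_xmin(const ProcSlot *procs, size_t nprocs,
                    TransactionId next_xid)
{
    TransactionId result = next_xid;

    for (size_t i = 0; i < nprocs; i++)
    {
        TransactionId xmin = procs[i].xmin;

        if (xmin != InvalidTransactionId && xid_precedes(xmin, result))
            result = xmin;
    }
    return result;
}
```

The scan costs O(number of backends) per call, so it only pays off if
callers ask for the value rarely, for example once per vacuum or pruning
pass rather than once per snapshot.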

> * Performance testing. Clearly this should have a performance benefit,
> at least under some workloads, to be worthwhile. And not regress.

I wrote a little C module to create a "worst-case" table. Every row in
the table has a different xmin, and the xmin values are shuffled across
the table, to defeat any caching.
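
The module itself is not shown here. As a rough illustration of the idea
only, the sketch below assigns one distinct XID per row and then applies a
Fisher-Yates shuffle, so that neighbouring rows refer to unrelated XIDs.
The function names, the xorshift generator and the seed handling are
assumptions for the example, not details of the actual module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small xorshift64 generator so the sketch is deterministic and portable. */
static uint64_t
xorshift64(uint64_t *state)
{
    uint64_t x = *state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/*
 * Fill xids[0..n-1] with n consecutive XIDs starting at first_xid, then
 * shuffle them in place (Fisher-Yates). The seed must be nonzero. The
 * modulo step has a slight bias, which does not matter for a benchmark.
 */
static void
make_shuffled_xids(uint32_t *xids, size_t n, uint32_t first_xid,
                   uint64_t seed)
{
    uint64_t state = seed;

    assert(seed != 0);

    for (size_t i = 0; i < n; i++)
        xids[i] = first_xid + (uint32_t) i;

    for (size_t i = n; i > 1; i--)
    {
        size_t   j = (size_t) (xorshift64(&state) % i);
        uint32_t tmp = xids[i - 1];

        xids[i - 1] = xids[j];
        xids[j] = tmp;
    }
}
```

Because consecutive rows then look up XIDs scattered over the whole range,
a scan touches csnlog pages in effectively random order, which is the
access pattern least friendly to a small page cache.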

A sequential scan of a table like that with 10 million rows took about
700 ms on my laptop, when the hint bits are set, without this patch.
With this patch, if there's a snapshot holding back the xmin horizon, so
that we need to check the CSN log for every XID, it took about 30000 ms,
roughly 40 times slower. So we have some optimization work to do :-).
I'm not overly worried about that right now, as I think there's a lot of
room for improvement in the SLRU code. But that's the next thing I'm
going to work on.

- Heikki
