Re: Proposal for CSN based snapshots

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Ants Aasma <ants(at)cybertec(dot)at>, Bruce Momjian <bruce(at)momjian(dot)us>, obartunov <obartunov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Proposal for CSN based snapshots
Date: 2015-07-27 12:04:38
Message-ID: CAPpHfdv7BMwGv=OfUg3S-jGVFKqHi79pR_ZK1Wsk-13oZ+cy5g@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 25, 2015 at 11:39 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On 24 July 2015 at 19:21, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On Fri, Jul 24, 2015 at 1:00 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
>> wrote:
>> > It depends on the exact design we use to get that. Certainly we do not
>> > want them if they cause a significant performance regression.
>>
>> Yeah. I think the performance worries expressed so far are:
>>
>> - Currently, if you see an XID that is between the XMIN and XMAX of
>> the snapshot, you hit CLOG only on first access. After that, the
>> tuple is hinted. With this approach, the hint bit doesn't avoid
>> needing to hit CLOG anymore, because it's not enough to know whether
>> or not the tuple committed; you have to know the CSN at which it
>> committed, which means you have to look that up in CLOG (or whatever
>> SLRU stores this data). Heikki mentioned adding some caching to
>> ameliorate this problem, but it sounds like he was worried that the
>> impact might still be significant.
>>
>
> This seems like the heart of the problem. Changing a snapshot from a list
> of xids into one number is easy. Making XidInMVCCSnapshot() work is the
> hard part because there needs to be a translation/lookup from CSN to
> determine if it contains the xid.
>
> That turns CSN into a reference to a cached snapshot, or a reference by
> which a snapshot can be derived on demand.
>

I see the problem. Currently, once hint bits are set we don't have to visit
CLOG anymore. With CSN snapshots that is no longer true: we still have to
translate the XID into a CSN in order to compare it with the snapshot CSN. The
version of the CSN patch in this thread still keeps XMIN and XMAX in the
snapshot. AFAICS, with CSN snapshots XMIN and XMAX are not strictly required
to express a snapshot; they were kept as an optimization. They restrict use of
the XID => CSN map to the given range of XIDs. However, with long-running
transactions the [XMIN; XMAX] range could be very wide, and we could end up
using the XID => CSN map heavily across a wide range of XIDs.
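
To make the role of XMIN and XMAX concrete, here is a minimal sketch of such a visibility check. It is not code from the patch: the names CsnSnapshot, XidCsnMap and xid_visible are hypothetical, XID wraparound is ignored, and it assumes a transaction is visible when its CSN is less than or equal to the snapshot CSN.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CsnSnapshot:
    xmin: int  # every XID below this had finished when the snapshot was taken
    xmax: int  # every XID at or above this had not started yet
    csn: int   # commit sequence number at which the snapshot was taken


class XidCsnMap:
    """Hypothetical XID => CSN map that counts how often it is consulted."""

    def __init__(self) -> None:
        self._csn: Dict[int, int] = {}
        self.lookups = 0

    def set_committed(self, xid: int, csn: int) -> None:
        self._csn[xid] = csn

    def lookup(self, xid: int) -> Optional[int]:
        self.lookups += 1
        return self._csn.get(xid)  # None: still in progress or aborted


def xid_visible(xid: int, hinted_committed: bool,
                snap: CsnSnapshot, csn_map: XidCsnMap) -> bool:
    """Decide whether the effects of xid are visible to snap.

    hinted_committed models the tuple's "committed" hint bit.
    """
    if xid >= snap.xmax:
        return False             # started after the snapshot: no lookup
    if xid < snap.xmin:
        return hinted_committed  # finished before the snapshot: hint bit is enough
    # Inside [xmin, xmax): the hint bit alone cannot answer, we need the CSN.
    csn = csn_map.lookup(xid)
    return csn is not None and csn <= snap.csn
```

Only the last branch touches the map. A long-running transaction holds xmin back, so a growing share of the XIDs it encounters lands in that branch.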

As far as I can see from the Huawei PGCon talk, a "Dense Map" in shared memory
is proposed for the XID => CSN transformation. With a large enough "Dense Map"
we can do most XID => CSN transformations with a single shared memory access.
The PGCon talk gives pgbench results. However, pgbench doesn't run any long
transactions. With a long-running transaction, a significant part of the
XID => CSN transformations could fall outside the "Dense Map". Dilip, did you
test your patch with long transactions?
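
The concern can be illustrated with a toy model. The layout of the real "Dense Map" is not described in this thread, so the sketch below simply assumes a fixed-size ring buffer indexed by XID that covers only the most recent XIDs, with a dictionary standing in for the slower SLRU-backed storage. All names (DenseXidCsnMap, assign_xid, set_csn, lookup) are hypothetical.

```python
from typing import Dict, List, Optional


class DenseXidCsnMap:
    """Toy XID => CSN map: a ring buffer over the newest `size` XIDs."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots: List[Optional[int]] = [None] * size
        self.latest_xid = -1
        self.slow: Dict[int, int] = {}  # stand-in for the slower SLRU path
        self.hits = 0
        self.misses = 0

    def _in_window(self, xid: int) -> bool:
        return self.latest_xid - self.size < xid <= self.latest_xid

    def assign_xid(self) -> int:
        """Hand out the next XID, spilling the entry whose slot it reuses."""
        xid = self.latest_xid + 1
        slot = xid % self.size
        evicted_xid = xid - self.size
        if evicted_xid >= 0 and self.slots[slot] is not None:
            self.slow[evicted_xid] = self.slots[slot]
        self.slots[slot] = None
        self.latest_xid = xid
        return xid

    def set_csn(self, xid: int, csn: int) -> None:
        if self._in_window(xid):
            self.slots[xid % self.size] = csn
        else:
            self.slow[xid] = csn  # committed after falling out of the window

    def lookup(self, xid: int) -> Optional[int]:
        if self._in_window(xid):
            self.hits += 1        # single array access
            return self.slots[xid % self.size]
        self.misses += 1          # has to go to the slow storage
        return self.slow.get(xid)
```

With short transactions every lookup stays inside the window. A snapshot that is older than `size` XIDs keeps asking about XIDs that have already been spilled, so its lookups take the slow path.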

I'm also thinking about a different option for optimizing this. When we set
hint bits we could also replace XMIN/XMAX with the CSN. In this case we
wouldn't need to do the XID => CSN transformation for this tuple anymore.
However, we have 32-bit XIDs for now. We could also have 32-bit CSNs. However,
that would double our troubles with wraparound: we would have 2 counters that
could wrap around. That could bring us back to thoughts about 64-bit XIDs.
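
A rough sketch of this idea, under stated assumptions: the TupleHeader class and its xmin_is_csn flag are hypothetical, only the inserting transaction is modelled, and the sketch ignores everything else that relies on the tuple still carrying its XID. The 32-bit CSN is compared modulo 2^32, which is the kind of comparison a second wrapping counter would require.

```python
from dataclasses import dataclass
from typing import Dict

MASK32 = 0xFFFFFFFF


def precedes32(a: int, b: int) -> bool:
    """True if 32-bit counter value a is logically before b (modulo 2^32)."""
    # Same result as interpreting (a - b) as a signed 32-bit integer < 0.
    return ((a - b) & MASK32) >= 0x80000000


@dataclass
class TupleHeader:
    xmin: int                  # inserting XID, or its CSN once stamped
    xmin_is_csn: bool = False  # hypothetical hint flag


def stamp_csn(tup: TupleHeader, xid_to_csn: Dict[int, int]) -> None:
    """When setting the hint, overwrite the XID with its CSN if known."""
    if tup.xmin_is_csn:
        return
    csn = xid_to_csn.get(tup.xmin)
    if csn is not None:
        tup.xmin = csn & MASK32
        tup.xmin_is_csn = True


def inserted_before_snapshot(tup: TupleHeader, snapshot_csn: int,
                             xid_to_csn: Dict[int, int]) -> bool:
    """True if the inserter committed at or before snapshot_csn."""
    if not tup.xmin_is_csn:
        stamp_csn(tup, xid_to_csn)  # first access still needs the map
    if not tup.xmin_is_csn:
        return False                # inserter has no CSN yet
    return not precedes32(snapshot_csn, tup.xmin)
```

After the first access the tuple answers visibility questions without the map, but the stored CSN is now a second 32-bit value that is only meaningful within half of the counter space.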

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
