Re: cheaper snapshots redux

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: cheaper snapshots redux
Date: 2011-08-23 16:13:13
Message-ID: 29474.1314115993@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> With respect to the first problem, what I'm imagining is that we not
> do a complete rewrite of the snapshot in shared memory on every
> commit. Instead, when a transaction ends, we'll decide whether to (a)
> write a new snapshot or (b) just record the XIDs that ended. If we do
> (b), then any backend that wants a snapshot will need to copy from
> shared memory both the most recently written snapshot and the XIDs
> that have subsequently ended. From there, it can figure out which
> XIDs are still running. Of course, if the list of recently-ended XIDs
> gets too long, then taking a snapshot will start to get expensive, so
> we'll need to periodically do (a) instead. There are other ways that
> this could be done as well; for example, the KnownAssignedXids stuff
> just flags XIDs that should be ignored and then periodically compacts
> away the ignored entries.

I'm a bit concerned that this approach optimizes the heavy-contention
case at the cost of actually making things worse whenever you're not
bottlenecked by contention for access to this shared data structure.
In particular, given the above design, every reader of the data
structure has to duplicate the work of eliminating subsequently-ended
XIDs from the latest stored snapshot. Maybe that's relatively cheap
once, but if you do it N times it's not going to be so cheap anymore.
In fact, it looks to me like the total cost would scale about as
O(N^2) in the number of transactions you allow to elapse before storing
a new snapshot, so you're not going to be able to let very many go by
before you do that.
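
To make the O(N^2) estimate concrete, here is a rough cost model, not
PostgreSQL code. It assumes roughly one snapshot is taken per
transaction that ends, and that each snapshot has to walk the whole
list of XIDs ended since the last full rewrite. The names
derive_snapshot and total_reader_work are hypothetical, made up for
this sketch.

```python
def derive_snapshot(base_running, ended_since_base):
    """Option (b) reader: stored snapshot minus the XIDs that ended since."""
    ended = set(ended_since_base)
    return [xid for xid in base_running if xid not in ended]


def total_reader_work(n_commits_between_rewrites):
    """Ended-XID entries examined between two full snapshot rewrites.

    Assumption (not from the original design): one reader takes a
    snapshot after each commit, and must process every XID that has
    ended since the last rewrite.
    """
    work = 0
    for ended_count in range(1, n_commits_between_rewrites + 1):
        work += ended_count  # this reader walks the whole ended list
    return work


if __name__ == "__main__":
    print(derive_snapshot([100, 101, 102, 103], [101, 103]))
    for n in (10, 100, 1000):
        print(n, total_reader_work(n))
```

Under those assumptions the total is N(N+1)/2, so letting ten times as
many transactions go by before a rewrite costs readers about a hundred
times as much work in aggregate.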

I don't say this can't be made to work, but I don't want to blow off
performance for single-threaded applications in pursuit of scalability
that will only benefit people running massively parallel applications
on big iron.

regards, tom lane
