Re: Allow "snapshot too old" error, to prevent bloat

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow "snapshot too old" error, to prevent bloat
Date: 2015-02-19 00:38:42
Message-ID: CAM-w4HMfDgPGEB77jU=R1m0orPAk4bNNqpqXGQ3Rk+steq0=fw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 15, 2015 at 8:27 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> There might be something in that, but again it's not much like this patch.
> The key point I think we're both making is that nondeterministic failures
> are bad, especially when you're talking about long-running, expensive-to-
> retry transactions.
>

But I think Kevin would agree with you there. That's why he's interested in
having the errors not occur if you don't read from the volatile tables. Ie,
your reporting query reading from read-only tables would be guaranteed to
succeed even if some other table had had some rows vacuumed away.

I'm not sure that's really going to work because of things like hint bit
updates or hot pruning. That is, say your table is read-only now but last
week some of the records were updated. You might reasonably expect those
rows to be safe and indeed the rows themselves will still be in your
report. But the earlier versions of the rows could still be sitting on some
pages invisible to every query but not vacuumable quite yet and then
suddenly they get vacuumed away and the LSN on the page gets bumped.

Fwiw I would strongly suggest that instead of having a number of
transactions to have a time difference. I haven't been following the
"commit timestamp" thread but I'm hoping that patch has the infrastructure
needed to look up an lsn and find out the timestamp and vice versa. So when
you take a snapshot you could look up the effective cut-off LSN based on
the time difference specified in the GUC. As a DBA I would find it
infinitely more comforting specifying "old_snapshot_threshold=4h" than
specifying some arbitrary large number of transactions and hoping I
calculated the transaction rate reasonably accurately.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2015-02-19 01:14:02 Re: Allow "snapshot too old" error, to prevent bloat
Previous Message Peter Geoghegan 2015-02-19 00:35:14 INSERT ... ON CONFLICT UPDATE and logical decoding