Re: BUG #5915: OldSerXidAdd inflates pg_serial too much

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: drkp(at)csail(dot)mit(dot)edu, yamt(at)mwd(dot)biglobe(dot)ne(dot)jp, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5915: OldSerXidAdd inflates pg_serial too much
Date: 2011-03-08 10:23:47
Message-ID: 4D7603B3.4000102@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 06.03.2011 20:32, Kevin Grittner wrote:
> Heikki Linnakangas wrote:
>
>> Here's what I had in mind. Can you review
>
> The additions and modifications to the comments all look good to me.
> I can see why you renamed one field and eliminated another; no
> problem there. I'm really surprised, when you ignore those changes,
> how little was needed to move this to checkpoint time. I hadn't
> looked into that, and expected it to be much harder to do, even
> though I should know better by now. :-)

Ok, committed, so that we get it into the alpha4 release that's wrapped
soon.

> I looked it over and pondered a bit, and have three concerns.
>
> (1) Why is it safe to assume that checkpoint will occur before this
> wraps around? Is it just that it takes a billion transactions, and
> one can reasonably expect a checkpoint before reaching that point, or
> is there some other safety net?

Hmm, good question. The maximum checkpoint interval is 30 minutes, so
you would need a sustained transaction rate of over 500000 transactions
per second to bump into that.

If it happens, I don't think there's any harm done. OldSerXidAdd()
initializes any new pages to zero, and the truncation code should never
truncate away files that are still in use.

> (2) What if there are only occassional clusters of serializable
> transactions? If the one SLRU page is held so that it isn't
> repeatedly truncated and re-zeroed, does it pose a risk if
> transaction IDs wrap around while that page is held? (I don't think
> this is a new problem with the proposed patch, it just made it more
> obvious.) Should we be using RecentGlobalXmin or something in that
> checkpoint function to truncate that last page when it is safe to do
> so?

I didn't quite understand this, but I believe the fact that
OldSerXidAdd() initializes any new pages to zero protects from this too.

> (3) That unnecessary SLRU flush should probably be conditioned on a
> #ifdef or some not-very-visible GUC or something. With the problems
> which some people have with writes glutting the I/O during checkpoint
> I'd hate to add to the writes at checkpoint time just for debugging
> info -- at least without a specific request for that behavior.

We have similar unnecessary SLRU flush in CheckpointSUBTRANS too, so I
don't think that's necessary.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Heikki Linnakangas 2011-03-08 11:22:20 Re: BUG #5918: SummarizeOldestCommittedSxact assertion failure
Previous Message Dave Page 2011-03-08 06:41:33 Re: BUG #5899: Memory corruption when running psql