Re: removing PD_ALL_VISIBLE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: removing PD_ALL_VISIBLE
Date: 2013-05-30 13:47:22
Message-ID: CA+Tgmob4O4Gx4w0nCobXxe4Gzs+BdHbJ5MEKD+H_KCYZEpVo7g@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 30, 2013 at 8:12 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> As far as I understand the trick basically is that we can rely on an FPI
> being logged when an action unsetting ALL_VISIBLE is performed. That
> all-visible would then make sure the hint-bits marking individual tuples
> as frozen would hit disk. For that we need to add some more work though,
> consider:
>
> 1) write tuples on a page
> 2) "freeze" page by setting ALL_VISIBLE and setting hint
> bits. Setting ALL_VISIBLE is WAL-logged
> 3) crash
> 4) replay ALL_VISIBLE, set it on the page level. The individual tuples
> are *not* guaranteed to be marked frozen.
> 5) update tuple on the page unsetting all visible. Emits an FPI which
> does *not* have the tuples marked as frozen.
>
> Easy enough and fairly cheap to fix by having a function that updates
> the hint bits on a page when unsetting all visible, since we can just
> set them for all pre-existing tuples.

Basically, yes, though I would say "infomask bits" rather than "hint
bits", since not all of them are only hints, and this case would not
be merely a hint.
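
To make the five steps quoted above concrete, here is a toy model of the
sequence. It is only a sketch and not PostgreSQL code: Page, Tup,
clear_all_visible and the fix_infomask switch are hypothetical names, and
the "frozen" flag stands in for whatever infomask bit would be used.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Tup:
    frozen: bool = False  # hypothetical stand-in for an infomask "frozen" bit


@dataclass
class Page:
    all_visible: bool = False
    tuples: list = field(default_factory=list)


def clear_all_visible(page, fix_infomask):
    """Unset the page-level flag and return the full-page image (FPI) logged."""
    if fix_infomask and page.all_visible:
        # Proposed fix: the page was all-visible, so every pre-existing
        # tuple can be marked before the flag is cleared.
        for t in page.tuples:
            t.frozen = True
    page.all_visible = False
    return copy.deepcopy(page)


def scenario(fix_infomask):
    disk = Page(tuples=[Tup(), Tup()])  # 1) tuples written and on disk
    buf = copy.deepcopy(disk)
    buf.all_visible = True              # 2) flag set (this part is WAL-logged)
    for t in buf.tuples:
        t.frozen = True                 #    tuple bits set (not logged)
    del buf                             # 3) crash: the dirty buffer is lost
    disk.all_visible = True             # 4) replay restores only the flag
    fpi = clear_all_visible(disk, fix_infomask)  # 5) update clears the flag
    return all(t.frozen for t in fpi.tuples)


print(scenario(fix_infomask=False))  # False: FPI lacks the tuple bits
print(scenario(fix_infomask=True))   # True: bits are set before the FPI
```

Without the extra pass, the FPI taken in step 5 carries tuples that were
never marked; with it, the marking rides along in the image.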

>> but as far as I can see, it also requires PD_ALL_VISIBLE to stick
>> around.
>
> Now, I am far from being convinced it's a good idea to get rid of
> PD_ALL_VISIBLE, but I don't think it does. Except that it is currently
> legal for the page-level ALL_VISIBLE to be set while the corresponding
> visibility map bit isn't, there's not much fundamentally prohibiting us
> from looking in the vm when we need to know whether the page is all
> visible, is there?
> To the contrary, this actually seems to be a pretty good case for Jeff's
> proposed behaviour since it would allow freezing while only writing the
> vm?

Well, as Heikki points out, I think that's unacceptably dangerous.
Loss or corruption of a single visibility map page means possible loss
of half a gigabyte of data.
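
For reference, the half-gigabyte figure falls out of simple arithmetic.
The sketch below assumes the default 8 kB block size, a 24-byte page
header, and one visibility bit per heap page; those constants are my
assumptions here, not something stated upthread.

```python
BLCKSZ = 8192           # assumed default block size, in bytes
PAGE_HEADER = 24        # assumed (aligned) page header size, in bytes
BITS_PER_HEAP_PAGE = 1  # assumed: one all-visible bit per heap page

map_bytes = BLCKSZ - PAGE_HEADER
heap_pages_per_vm_page = map_bytes * 8 // BITS_PER_HEAP_PAGE
heap_bytes_per_vm_page = heap_pages_per_vm_page * BLCKSZ

print(heap_pages_per_vm_page)          # 65344
print(heap_bytes_per_vm_page / 2**20)  # 510.5 (MiB)
```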

Also, if we go that route, looking at the visibility map is no longer
an optimization; it's essential for correctness. We can't decide to
skip it when it seems expensive, for example, as Jeff was proposing.

There's another thing that's bothering me about this whole discussion,
too. If looking at another page for the information we need to make
visibility decisions is so cheap that we need not be concerned with
it, then why do we need hint bits? I realize that it's not quite the
same thing, because CLOG doesn't have as much locality of access as
the visibility map; you're guaranteed to find all the information you
need for a single heap page on a single VM page. Also, CLOG is
per-tuple, not per-page, and we get a decent speed-up from checking
once for the whole page rather than for each tuple. Nonetheless, I
think the contrast between Jeff's tests, which aren't showing much
impact from the increased visibility map traffic, and previous
hint-bit removal tests, which have crashed and burned, may be caused
in part by the fact that our algorithms and locking regimen for
shared_buffers are much more sophisticated than for SLRU. I'm not
eager to have our design decisions driven by that gap.
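
The locality difference can be illustrated with a little arithmetic. This
is a rough sketch under assumed constants (8 kB blocks, one VM bit per heap
page, two CLOG status bits per transaction); the helper names and the xmin
values are made up for illustration.

```python
BLCKSZ = 8192
HEAP_PAGES_PER_VM_PAGE = (BLCKSZ - 24) * 8  # assumed: 1 bit per heap page
XACTS_PER_CLOG_PAGE = BLCKSZ * 4            # assumed: 2 bits per transaction


def vm_pages_needed(heap_block):
    """VM pages consulted for one heap page: always exactly one."""
    return {heap_block // HEAP_PAGES_PER_VM_PAGE}


def clog_pages_needed(xids):
    """CLOG pages consulted to resolve the given transaction IDs."""
    return {xid // XACTS_PER_CLOG_PAGE for xid in xids}


# Hypothetical xmin values for four tuples on a single heap page.
xmins = [1_000, 40_000, 2_500_000, 90_000_000]

print(len(vm_pages_needed(123_456)))  # 1
print(len(clog_pages_needed(xmins)))  # 4
```

A heap page whose tuples were written far apart in XID space can touch
several CLOG pages, while its visibility map lookup stays on one page.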

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
