Re: pgsql: Do all-visible handling in lazy_vacuum_page() outside its critic

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: <pgsql-committers(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Do all-visible handling in lazy_vacuum_page() outside its critic
Date: 2014-06-26 07:39:01
Message-ID: 53ABCE15.70507@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 06/24/2014 01:27 AM, Andres Freund wrote:
> On 2014-06-23 13:00:11 -0700, Jeff Davis wrote:
>> On Fri, 2014-06-20 at 09:10 +0000, Andres Freund wrote:
>>> To fix, simply move all the all-visible handling outside the critical
>>> section. Doing so means that the PD_ALL_VISIBLE on the page won't be
>>> included in the full page image of the HEAP2_CLEAN record anymore. But
>>> that's fine, the flag will be set by the HEAP2_VISIBLE logged later.
>>
>> Trying to follow along. I read the discussion about bug #10533.
>>
>> Before 2a8e1ac5, a block of actions -- visibility test, VM test, set VM
>> bit, set all-visible -- were inside the critical section and after the
>> WAL insert.
>>
>> 2a8e1ac5 left the block of actions in the critical section but moved
>> them before the WAL insert. The commit message seems to indicate that
>> there's a real problem setting the all-visible flag after the WAL
>> insert, but I couldn't identify a serious problem there.
>
> Yes, that was confusing. Imo it made the state worse, not better. I'd
> asked for clarification somewhere... Heikki?

Hmm. I don't see any live bug caused before 2a8e1ac5 anymore either. I
think I missed the fact that replaying the XLOG_HEAP2_VISIBLE record
always sets the bit in the heap page.

> Does your change still make
> sense to you and do you see problem with the current state (as of ecac0e2b)?

Hmm, in the current state, it's again possible that the full-page image
doesn't contain the all-visible flag, even though the page in the buffer
does. I guess that's OK, because replaying XLOG_HEAP2_VISIBLE always
sets the flag. My page-comparison tool will complain, so it would be
nice to not do that, but it's a false alarm.

Or we could call heap_page_is_all_visible() before entering the critical
section, if we passed heap_page_is_all_visible() the list of dead tuples
we're removing so that it could ignore them.

>> I'm a little concerned that we're approaching the WAL rules in an ad-hoc
>> manner. I don't see an actual problem right now, but we've been fixing
>> problems with PD_ALL_VISIBLE since the VM was made crash safe almost
>> exactly three years ago (503c7305). Do we finally feel confident in the
>> design and its implications?
>
> I'm not particularly happy about all this either, but this section of
> code is actually much newer. It was added in fdf9e21196a6/9.3.

I'm not happy with the way we deviate from the normal WAL rules with in
the visibility map either. In the long run, we may be better off to just
take the hit and WAL-log the VM bits like any other update. Basically,
always behave as if checksums were enabled. But for now we just have to
fix bugs as they're found.

- Heikki

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Jeff Davis 2014-06-26 08:49:39 Re: pgsql: Do all-visible handling in lazy_vacuum_page() outside its critic
Previous Message Fujii Masao 2014-06-26 05:47:26 pgsql: Remove obsolete example of CSV log file name from log_filename d

Browse pgsql-hackers by date

  From Date Subject
Next Message Abhijit Menon-Sen 2014-06-26 07:50:20 Re: pgaudit - an auditing extension for PostgreSQL
Previous Message Fujii Masao 2014-06-26 07:24:48 Re: Cluster name in ps output