Re: Write Ahead Logging for Hash Indexes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Write Ahead Logging for Hash Indexes
Date: 2017-03-10 12:51:15
Message-ID: CA+TgmobPn5o9q9_z2gpBYNnBJ9z+-EvaCrZiN74gbKxLCbkoPg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 10, 2017 at 7:08 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Mar 10, 2017 at 8:49 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Mar 9, 2017 at 9:34 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Do we really need to set LSN on this page (or mark it dirty), if so
>>> why? Are you worried about restoration of FPI or something else?
>>
>> I haven't thought through all of the possible consequences and am a
>> bit too tired to do so just now, but doesn't it seem rather risky to
>> invent a whole new way of using these xlog functions?
>> src/backend/access/transam/README describes how to do write-ahead
>> logging properly, and neither MarkBufferDirty() nor PageSetLSN() is
>> described as an optional step.
>
> Just to salvage my point, I think this is not the first place where we
> register a buffer but don't set the LSN. For XLOG_HEAP2_VISIBLE, we
> register heap and vm buffers but set the LSN conditionally on the heap
> buffer. Having said that, I see the value of your point and I am open
> to doing it that way if you feel that is a better way.

Right, we did that, and it seems to have worked. But it was a scary
exception that required a lot of thought. Now, since we did it once,
we could do it again, but I am not sure it is for the best. In the
case of vacuum, we knew that a vacuum on a large table could otherwise
emit an FPI for every page, which would almost double the amount of
write I/O generated by a vacuum - instead of WAL records + heap pages,
you'd be writing FPIs + heap pages, a big increase. Here, I think
the amount of write I/O that it's costing us is not so clear.
Unless it's going to be really big, I'd rather do this in some way
that we can be confident is definitely safe.
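
As a rough back-of-the-envelope sketch of the vacuum comparison above:
the 8192-byte page is PostgreSQL's default block size, but the
per-record size passed in is a made-up illustrative number, and the
function names are hypothetical. It also ignores FPI hole removal and
wal_compression, so treat the result as an upper-bound flavor of
estimate, not a measurement.

```python
BLCKSZ = 8192  # PostgreSQL's default page size in bytes


def vacuum_write_bytes(n_pages, wal_bytes_per_page):
    """Total bytes written: one heap page plus its WAL payload, per page."""
    return n_pages * (BLCKSZ + wal_bytes_per_page)


def fpi_write_ratio(n_pages, record_bytes):
    """Ratio of write I/O with an FPI per page to write I/O without one.

    record_bytes is an assumed size for a small, non-FPI WAL record.
    """
    small = vacuum_write_bytes(n_pages, record_bytes)
    # An FPI carries a whole page image on top of the record itself.
    full = vacuum_write_bytes(n_pages, record_bytes + BLCKSZ)
    return full / small
```

With any small record size the ratio comes out just under 2, which is
the "almost double" above. The same kind of estimate for the hash
index case would need a real count of how often the primary bucket
page would actually get an FPI.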

Also, if we want to avoid dirtying the primary bucket page by setting
the LSN, IMHO the way to do that is to store the block number of the
primary bucket page without registering the buffer, and then during
recovery, lock that block. That seems cleaner than hoping that we can
take an FPI without setting the page LSN.
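
To make the shape of that idea concrete, here is a toy model, not
PostgreSQL code: Page, WalRecord, and redo are hypothetical names, and
the "lock" is just a flag. It only illustrates that the primary bucket
page's block number can travel as plain record data, so replay locks
that block without ever treating it as a modified, registered buffer.

```python
from dataclasses import dataclass


@dataclass
class Page:
    lsn: int = 0
    dirty: bool = False
    locked: bool = False


@dataclass
class WalRecord:
    lsn: int
    registered_blocks: list    # blocks the record actually modifies
    primary_bucket_blkno: int  # plain record data, not a registered buffer


def redo(record, pages):
    """Replay one record against `pages` (dict of block number -> Page)."""
    primary = pages[record.primary_bucket_blkno]
    primary.locked = True      # lock only; the page itself is not changed
    try:
        for blkno in record.registered_blocks:
            page = pages[blkno]
            if page.lsn >= record.lsn:
                continue       # change is already on the page
            page.dirty = True
            page.lsn = record.lsn
    finally:
        primary.locked = False
```

In this model the registered pages get dirtied and stamped with the
record's LSN as usual, while the primary bucket page comes out of
replay with its LSN and dirty flag untouched.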

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
