Re: Write Ahead Logging for Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Write Ahead Logging for Hash Indexes
Date: 2017-03-10 13:02:38
Message-ID: CAA4eK1+=Zyn4+EY_Ee+rEQWqinrRwU-7kDXCvkS4qU7=OSrYmw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 10, 2017 at 6:21 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Mar 10, 2017 at 7:08 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Fri, Mar 10, 2017 at 8:49 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Thu, Mar 9, 2017 at 9:34 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>> Do we really need to set LSN on this page (or mark it dirty), if so
>>>> why? Are you worried about restoration of FPI or something else?
>>>
>>> I haven't thought through all of the possible consequences and am a
>>> bit too tired to do so just now, but doesn't it seem rather risky to
>>> invent a whole new way of using these xlog functions?
>>> src/backend/access/transam/README describes how to do write-ahead
>>> logging properly, and neither MarkBufferDirty() nor PageSetLSN() is
>>> described as an optional step.
>>
>> Just to salvage my point, I think this is not the first place where we
>> register a buffer but don't set its LSN. For XLOG_HEAP2_VISIBLE, we
>> register the heap and vm buffers but set the LSN only conditionally on
>> the heap buffer. Having said that, I see the value of your point and I
>> am open to doing it that way if you feel that is a better way.
>
> Right, we did that, and it seems to have worked. But it was a scary
> exception that required a lot of thought. Now, since we did it once,
> we could do it again, but I am not sure it is for the best. In the
> case of vacuum, we knew that a vacuum on a large table could otherwise
> emit an FPI for every page, which would almost double the amount of
> write I/O generated by a vacuum - instead of WAL records + heap pages,
> you'd be writing FPIs + heap pages, a big increase. Now here I think
> that the amount of write I/O that it's costing us is not so clear.
> Unless it's going to be really big, I'd rather do this in some way we
> can think is definitely safe.
>
> Also, if we want to avoid dirtying the primary bucket page by setting
> the LSN, IMHO the way to do that is to store the block number of the
> primary bucket page without registering the buffer, and then during
> recovery, lock that block. That seems cleaner than hoping that we can
> take an FPI without setting the page LSN.
>

I was thinking that we would use the REGBUF_NO_IMAGE flag, as the
XLOG_HEAP2_VISIBLE record does for the heap buffer. That avoids any
extra I/O and makes it safe as well. I think that flag is what makes it
safe to register the heap buffer without setting its LSN in the
XLOG_HEAP2_VISIBLE case.
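
To make the argument concrete, here is a toy model of the full-page-image
(FPI) decision, not the real PostgreSQL code. Every name prefixed with toy_
or TOY_ is hypothetical, and the rule it encodes (an unflagged buffer gets an
FPI when its page LSN is not past the checkpoint redo pointer) is my
simplified reading of the behaviour under discussion. It shows why a page
whose LSN is never advanced would keep qualifying for an FPI, and why a
no-image flag sidesteps that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins: the names echo PostgreSQL's, but this is not its API. */
typedef uint64_t ToyLsn;

#define TOY_REGBUF_FORCE_IMAGE 0x01 /* always attach a full-page image */
#define TOY_REGBUF_NO_IMAGE    0x02 /* never attach a full-page image */

typedef struct ToyPage
{
    ToyLsn lsn; /* LSN of the last WAL record that stamped this page */
} ToyPage;

/*
 * Decide whether registering this page would attach an FPI.
 * Assumed rule: with no flags, an FPI is taken when full_page_writes is on
 * and the page has not been stamped since the checkpoint redo pointer.
 */
bool
toy_needs_fpi(const ToyPage *page, ToyLsn redo_ptr, int flags,
              bool full_page_writes)
{
    /* Asking for both "force" and "no" image is contradictory. */
    assert(!((flags & TOY_REGBUF_FORCE_IMAGE) &&
             (flags & TOY_REGBUF_NO_IMAGE)));

    if (flags & TOY_REGBUF_NO_IMAGE)
        return false;
    if (flags & TOY_REGBUF_FORCE_IMAGE)
        return true;
    return full_page_writes && page->lsn <= redo_ptr;
}
```

In this model, a page with lsn = 100 and a redo pointer of 200 qualifies for
an FPI on every registration for as long as its LSN stays at 100, whereas
passing TOY_REGBUF_NO_IMAGE suppresses the image regardless of the LSN.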

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
