Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: David Steele <david(at)pgmasters(dot)net>, a(dot)lubennikova(at)postgrespro(dot)ru, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
Date: 2019-04-03 14:07:00
Message-ID: 090fb3cb-1ca4-e173-ecf7-47d41ebac620@iki.fi
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 02/04/2019 08:58, Andrey Lepikhov wrote:
> On 25/03/2019 15:21, Heikki Linnakangas wrote:
>> I had another quick look.
>>
>> I still think using the "generic xlog AM" for this is a wrong level of
>> abstraction, and we should use the XLOG_FPI records for this directly.
>> We can extend XLOG_FPI so that it can store multiple pages in a single
>> record, if it doesn't already handle it.
>>
>> Another counter-point to using the generic xlog record is that you're
>> currently doing unnecessary two memcpy's of all pages in the index, in
>> GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free.
>>
>> I guess the generic_log_relation() function can stay where it is, but it
>> should use XLogRegisterBuffer() and XLogInsert() directly.
>
> Patch set v.3 uses XLOG_FPI records directly.

Thanks! Committed, with some changes:

* I moved the log_relation() function to xlog.c, so that it sits beside
log_newpage_*() functions. I renamed it to log_newpage_range(), and
changed the argument so that the caller provides the beginning and end
block ranges. I added a 'page_std' flag, instead of just assuming that
all pages use the standard page layout. All of the callers pass
page_std=true at the moment, but seems better to be explicit about it.

I made those changes because I felt that the function was too narrowly
tailored for the current callers. The assumption about standard page
layout, and also the fact that it only logged the main fork. It's more
flexible now, for any future AMs that might not be exactly like that. It
feels like it's at the same level of abstraction now as the other
log_newpage_*() functions. Even if we never need the flexibility, I
think making the 'page_std' and 'forknum' arguments explicit is good, to
draw attention to those details, for anyone calling the function.

* I fixed the REDO code. It was trivially broken, it only restored the
first page in each FPI WAL record.

* Using "fake" unlogged LSNs for GiST index build seemed fishy. I could
not convince myself that it was safe in all corner cases. In a recently
initdb'd cluster, it's theoretically possible that the fake LSN counter
overtakes the real LSN value, and that could lead to strange behavior.
For example, how would the buffer manager behave, if there was a dirty
page in the buffer cache with an LSN value that's greater than the
current WAL flush pointer? I think you'd get "ERROR: xlog flush request
%X/%X is not satisfied --- flushed only to %X/%X".

I changed that so that we use LSN 1 for all pages during index build.
That's smaller than any real or fake LSN. Or actually, the fake LSN
counter used to start at 1 - I bumped that up to 1000, so now it's
safely smaller. Could've used 0 instead, but there's an assertion in
gist scan code that didn't like that.

> As a benchmark I use the script (test.sql in attachment) which show WAL
> size increment during index build. In the table below you can see the
> influence of the patch on WAL growth.
>
> Results
> =======
> AM | master | patch |
> GIN | 347 MB | 66 MB |
> GiST | 157 MB | 43 MB |
> SP-GiST | 119 MB | 38 MB |

Nice!

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-04-03 14:16:02 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Robert Haas 2019-04-03 14:02:05 Re: [Patch] Mingw: Fix import library extension, build actual static libraries