Re: patch to allow disable of WAL recycling

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch to allow disable of WAL recycling
Date: 2018-07-16 14:12:48
Message-ID: 19196.1531750368@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2018-07-15 20:55:38 -0400, Tom Lane wrote:
>> That's not the way to think about it. On a COW file system, we don't
>> want to "create 16MB files" at all --- we should just fill WAL files
>> on-the-fly, because the pre-fill activity isn't actually serving the
>> intended purpose of reserving disk space. It's just completely useless
>> overhead :-(. So we can't really make a direct comparison between the
>> two approaches; there's no good way to net out the cost of constructing
>> the WAL data we need to write.

> We probably should still allocate them in 16MB segments. We rely on the
> size being fixed in a number of places.

Reasonable point. I was supposing that it'd be okay if a partially
written segment were shorter than 16MB, but you're right that that
would require vetting a lot of code to be sure about it.

> But it's probably worthwhile to
> just do a posix_fadvise or such. Also, if we continually increase the
> size with each write we end up doing a lot more metadata transactions,
> which'll essentially serve to increase journalling overhead further.

Hm. What you're claiming is that on these FSen, extending a file involves
more/different metadata activity than allocating new space for a COW
overwrite of an existing area within the file. Is that really true?
The former case would be far more common in typical usage, and somehow
I doubt the FS authors would have been too stupid to optimize things so
that the same journal entry can record both the space allocation and the
logical-EOF change.
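
To make the two write patterns being compared concrete, here is a minimal C sketch. It is not PostgreSQL code; the helper names (size_of, append_page, overwrite_page) and the 8192-byte page size are assumptions chosen for illustration. It only shows which pattern moves the logical EOF and says nothing about what a given filesystem journals for either one.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SZ 8192            /* illustrative page size (assumption) */

/* Return the file's logical size, or -1 on error. */
static off_t
size_of(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : -1;
}

/* Pattern A: write one page at the current EOF, so the file grows. */
static int
append_page(int fd, const char *page)
{
    off_t eof = size_of(fd);

    if (eof < 0)
        return -1;
    return pwrite(fd, page, PAGE_SZ, eof) == PAGE_SZ ? 0 : -1;
}

/*
 * Pattern B: overwrite a page that already lies inside the file.
 * The logical size does not change; on a COW filesystem new blocks
 * are still allocated for the rewritten range.
 */
static int
overwrite_page(int fd, const char *page, off_t pageno)
{
    off_t off = pageno * PAGE_SZ;

    if (off + PAGE_SZ > size_of(fd))
        return -1;              /* would extend the file; refuse */
    return pwrite(fd, page, PAGE_SZ, off) == PAGE_SZ ? 0 : -1;
}
```

Pattern A corresponds to filling a segment on the fly; pattern B corresponds to writing into a pre-created or recycled segment.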

But anyway, this means we have two nearly independent issues to
investigate: whether recycling/renaming old files is cheaper than
constantly creating and deleting them, and whether to use physical
file zeroing versus some "just set the EOF please" filesystem call
when first creating a file. The former does seem like it's purely
a performance question, but the latter involves a tradeoff of
performance against an ENOSPC-panic protection feature that in
reality only works on some filesystems.
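
The second issue, physical zeroing versus a "just set the EOF" call, can be sketched in C as follows. This is an illustration under stated assumptions, not the server's actual segment-creation code: the function names, the 8192-byte write size, and the choice of ftruncate() as the EOF-setting call are all mine.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SEG_SIZE   (16 * 1024 * 1024)  /* 16MB segment, as in the thread */
#define BLOCK_SIZE 8192                /* write granularity (assumption) */

/* Close the descriptor, remove the partial file, and report failure. */
static int
fail_cleanup(int fd, const char *path)
{
    close(fd);
    unlink(path);
    return -1;
}

/* Strategy 1: physically write zeros over the whole segment. */
static int
create_zero_filled(const char *path)
{
    char  block[BLOCK_SIZE];
    off_t written;
    int   fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    memset(block, 0, sizeof(block));
    for (written = 0; written < SEG_SIZE; written += BLOCK_SIZE)
    {
        /* A short write or ENOSPC surfaces here, at creation time. */
        if (write(fd, block, BLOCK_SIZE) != BLOCK_SIZE)
            return fail_cleanup(fd, path);
    }
    if (fsync(fd) != 0)
        return fail_cleanup(fd, path);
    return close(fd);
}

/* Strategy 2: only set the logical EOF; no data blocks are written. */
static int
create_eof_only(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, SEG_SIZE) != 0 || fsync(fd) != 0)
        return fail_cleanup(fd, path);
    return close(fd);
}

/* Return the logical size of the file at path, or -1 on error. */
static off_t
file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_size : -1;
}
```

Both strategies yield a file whose logical size is 16MB, so code that relies on a fixed segment size sees no difference. Only the first one forces the filesystem to allocate blocks up front, and on a COW filesystem even that allocation does not reserve space for the later overwrites, which is the tradeoff described above.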

regards, tom lane
