Re: increasing the default WAL segment size

From: Andres Freund <andres(at)anarazel(dot)de>
To: Beena Emerson <memissemerson(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: increasing the default WAL segment size
Date: 2017-09-13 09:28:28
Message-ID: 20170913092828.aozd3gvvmw67gmyc@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-09-06 20:24:16 +0530, Beena Emerson wrote:
> > - pg_standby's RetrieveWALSegSize() does too much for it's name. It
> > seems quite weird that a function named that way has the section below
> > "/* check if clean up is necessary */"
>
> we set 2 cleanup related variables once WalSegSize is set, namely
> need_cleanup and exclusiveCleanupFileName. Does
> SetWALSegSizeAndCleanupValues look good?

It's better, but see below.

> > - the way you redid the ReadControlFile() invocation doesn't quite seem
> > right. Consider what happens if XLOGbuffers isn't -1 - then we
> > wouldn't read the control file, but you unconditionally copy it in
> > XLOGShmemInit(). I think we instead should introduce something like
> > XLOGPreShmemInit() that reads the control file unless in bootstrap
> > mode. Then get rid of the second ReadControlFile() already present.
>
> I did not think it was necessary to create a new function, I have
> simply added the check and
> function call within the XLOGShmemInit().

Which is wrong. XLogShmemSize() already needs to know the actual size,
otherwise we allocate the wrong shmem size. You may sometimes succeed
nevertheless because we leave some slop unused shared memory space, but
it's not ok to rely on. See the refactoring I did in 0001.

Changes:
- refactored the way the control file was handled, moved it to separate
phase. I wrote this last and it's late, so I'm not yet fully confident
in it, but it survives plain and EXEC_BACKEND builds. This also gets
rid of ferrying wal_segment_size through the EXEC_BACKEND variable
stuff, which didn't really do much, given how many other parts weren't
carried over.
- renamed all the non-postgres binary version of wal_segment_size to
WalSegSz, diverging seems pointless, and the WalSegsz seems
inconsistent.
- changed malloc in pg_waldump's search_directory() to a stack
allocation. Less because of efficiency, more because there wasn't any
error handling.
- removed redundant char * -> XLogPageHeader -> XLogLongPageHeader casting.
- replace new malloc with pg_malloc in initdb (failure handling!)
- replaced the floating point logic in pretty_wal_size with a, imo much
simpler, (sz % 1024) == 0
- it's inconsistent that the new code for pg_standby was added to the
top of the file, where all the customizable stuff resides.
- other small changes

Issues:

- I think the pg_standby stuff isn't correct. And it's hard to
understand. Consider the case where the first file restored is *not* a
timeline history file, but *also* not a complete file. We'll start to
spew "not enough data in file" errors and such, which we previously
didn't. My preferred solution would be to remove pg_standby ([1]),
but that's probably not quick enough. Unless we can quickly agree on
that, I think we need to refactor this a bit, I've done so in the
attached, but it's untested. Could you please verify it works and if
not fix it up?

What do you think?

Regards,

Andres

[1] http://archives.postgresql.org/message-id/20170913064824.rqflkadxwpboabgw%40alap3.anarazel.de

Attachment Content-Type Size
0001-Perform-only-one-ReadControlFile-during-startupv3 text/plain 5.4 KB
0002-Make-wal-segment-size-configurable-at-initdb-timev3 text/plain 123.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2017-09-13 09:38:00 Re: UPDATE of partition key
Previous Message Fabien COELHO 2017-09-13 09:27:16 Re: psql - add special variable to reflect the last query status