Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: l(at)lrowe(dot)co(dot)uk
Cc: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Date: 2015-10-17 14:10:56
Message-ID: CAB7nPqTJtH6YuFbfuPZ2YyN7gP4i8hpRV0U33YPPC1icxcS60Q@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Sat, Oct 17, 2015 at 5:30 AM, <l(at)lrowe(dot)co(dot)uk> wrote:
> I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I
> set ``wal_level hot_standby``.
> At ``wal_level archive`` I only see archiving every checkpoint_timeout
> (that it archives every checkpoint_timeout is a known limitation, see
> http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com):
> I think this additional archiving at wal_level hot_standby is a bug.

Agreed. There is indeed a difference between the way 9.3 and 9.4 behave.
With wal_level = hot_standby, 9.4 archives a segment every archive_timeout
as you mention, which is not the case in 9.3. This is definitely a
regression: we should not archive a segment if there is no activity.

Looking at the contents of the segments with 9.4 when there is no
activity, I see that an XLOG_RUNNING_XACTS record is generated every time
after a segment switch. The server then considers that the segment contains
new data (and it actually does), so the segment gets archived. Here is for
example the output of pg_xlogdump in this case:
$ pg_xlogdump 000000010000000000000018
rmgr: Standby len (rec/tot): 24/ 56, tx: 0, lsn: 0/18000028, prev 0/17000060, bkp: 0000, desc: running xacts: nextXid 1001 latestCompletedXid 1000 oldestRunningXid 1001
rmgr: XLOG len (rec/tot): 0/ 32, tx: 0, lsn: 0/18000060, prev 0/18000028, bkp: 0000, desc: xlog switch
[end of records for this segment]
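
To make the offsets in that dump easier to read, here is a small standalone sketch (not PostgreSQL code) that splits an LSN printed as "hi/lo" into a segment number and a byte offset within the segment. The names parse_lsn(), locate_lsn(), WalPos and DEMO_WAL_SEG_SIZE are made up for this illustration, and the 16 MB segment size is an assumption (the default build setting).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default 16 MB WAL segments. */
#define DEMO_WAL_SEG_SIZE UINT64_C(0x1000000)

/* Hypothetical helper type: position of an LSN inside the WAL stream. */
typedef struct
{
    uint64_t segno;   /* segment number */
    uint64_t offset;  /* byte offset inside that segment */
} WalPos;

/* Parse an LSN in the "hi/lo" hexadecimal form printed by pg_xlogdump.
 * Returns 0 on success, -1 if the text does not match that form. */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    assert(text != NULL && lsn != NULL);
    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;
    *lsn = ((uint64_t) hi << 32) | (uint64_t) lo;
    return 0;
}

/* Split a 64-bit LSN into segment number and offset within the segment. */
static WalPos
locate_lsn(uint64_t lsn)
{
    WalPos pos;

    pos.segno = lsn / DEMO_WAL_SEG_SIZE;
    pos.offset = lsn % DEMO_WAL_SEG_SIZE;
    return pos;
}
```

Under that assumption, 0/18000028 falls in segment 0x18 at offset 0x28 (40 bytes), and the xlog switch record at 0/18000060 starts 56 bytes later, which matches the total length of 56 reported for the running xacts record.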

A little bit of debugging points me to the bgwriter process:
LogStandbySnapshot() is called by BackgroundWriterMain() in bgwriter.c and
generates those WAL records even if the system is idle. I am adding Robert
and Andres in CC, as this is caused by commit ed46758, which is new in 9.4.

I think that a simple idea would be to not call LogStandbySnapshot() when
we are still at the beginning of a new segment. We know that the first page
of a WAL segment begins with a long page header of size SizeOfXLogLongPHD,
so just checking for that sounds sufficient to me. Hence the attached
patch, which should be applied down to 9.4. It fixes the regression
reported by Laurence for me.
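
The idea can be illustrated with a standalone sketch. This is not the attached patch and does not use the server's own macros: the function names and the SKETCH_* constants are hypothetical, the 16 MB segment size is the default build setting, and the 40-byte header size is an assumption taken from the dump above, where the first record starts at offset 0x28.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch only; the server uses its own macros. */
#define SKETCH_SEG_SIZE      UINT64_C(16777216) /* default 16 MB segment */
#define SKETCH_LONG_PHD_SIZE UINT64_C(40)       /* long page header size */

/* True if something has been written to the current segment beyond the
 * long page header that starts its first page. */
static bool
segment_has_records(uint64_t insert_lsn, uint64_t seg_size)
{
    uint64_t offset;

    assert(seg_size > SKETCH_LONG_PHD_SIZE);
    offset = insert_lsn % seg_size;
    return offset > SKETCH_LONG_PHD_SIZE;
}

/* Hypothetical decision helper: skip the standby snapshot while the
 * insert position is still at the very beginning of a new segment. */
static bool
should_log_standby_snapshot(uint64_t insert_lsn)
{
    return segment_has_records(insert_lsn, SKETCH_SEG_SIZE);
}
```

With these assumptions, an insert position of 0/18000028 (right after the header) would skip the snapshot, while 0/18000060 would not, so an idle system would stop producing a record in each freshly switched segment.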
Regards,
--
Michael

Attachment Content-Type Size
20151017_archive_idle.patch application/x-patch 937 bytes
