|From:||Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, andres(at)anarazel(dot)de, stark(at)mit(dot)edu, a(dot)lubennikova(at)postgrespro(dot)ru, pgsql-hackers(at)postgresql(dot)org, g(dot)smolkin(at)postgrespro(dot)ru|
|Subject:||Re: Reopen logfile on SIGHUP|
At Thu, 12 Apr 2018 17:23:42 +0300, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote in <f9f32301-53b4-74cb-335a-c293911aed41(at)postgrespro(dot)ru>
> On 11.04.2018 00:00, Tom Lane wrote:
> > So we need a mechanism that's narrowly targeted
> > to reopening the logfile, without SIGHUP'ing the entire database.
> We can send SIGUSR1 to the syslogger process. To make its pid easier
> to find out, it can be published in "$PGDATA/logging_collector.pid",
> as suggested by Grigory. The attached patch does this. It also adds a
> brief description of how to use this with logrotate.
FWIW I'm not a fan of officially exposing the logging collector's PID
and letting users send SIGUSR1 directly to one of the postmaster's
internal processes. (It seems more unusual to me than
pg_terminate_backend.) We could provide a new command, "pg_ctl
logrotate", to hide the details. (It cannot be executed by root, though.)
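
For illustration only, the step that such a command would hide might look roughly like the sketch below. It assumes the collector's PID is published in a file such as "$PGDATA/logging_collector.pid", which is what the patch under discussion proposes and not an existing interface. The function name request_log_reopen and the injectable send_signal parameter are hypothetical, added only so the logic can be exercised without signalling a real process.

```python
import os
import signal
from pathlib import Path


def request_log_reopen(pid_file, send_signal=os.kill):
    """Read a PID from pid_file and send SIGUSR1 to that process.

    Assumption: the first whitespace-separated token of the file is the
    PID of the logging collector (the file proposed in this thread).
    """
    tokens = Path(pid_file).read_text().split()
    if not tokens:
        raise ValueError(f"{pid_file} is empty")

    pid = int(tokens[0])
    if pid <= 0:
        # Non-positive values have special meanings for kill(2); refuse them.
        raise ValueError(f"invalid PID {pid} in {pid_file}")

    # Raises ProcessLookupError if the process is gone, PermissionError if
    # the caller is not allowed to signal it.
    send_signal(pid, signal.SIGUSR1)
    return pid
```

A wrapper like this still leaves the caller without any confirmation that the file was actually reopened; it only confirms that the signal was delivered.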
> > Point 2: Depending on how you've got the log filenames configured,
> > setting rotation_requested may result in a change in log filename
> If logrotate only needs the file to be reopened, syslogger's rotation
> does just that when using a static log file name. I imagine logrotate
> can be configured to do something useful with changing file names,
> too. It is a matter of keeping the configuration of syslogger and
> logrotate consistent.
Seems fine to me.
> > BTW, another thing that needs to be considered is the interaction with
> > rotation_disabled. Right now we automatically drop that on SIGHUP, but
> > I'm unclear on whether it should be different for logrotate requests.
I feel the same; an explicit request from the user ought to reset (or
ignore) it. (By the way, rotation_disabled cannot be reset
without reloading the config.)
> The SIGUSR1 path is supposed to be used by automated tools. In a
> sense, it is an automatic rotation, the difference being that it
> originates from an external tool and not from syslogger itself. So, it
> sounds plausible that the rotation request shouldn't touch the
> rotation_disabled flag, and should be disabled by it, just like the
> automatic rotation.
> Still, this leads us to a scenario where we can lose logs:
> 1. postgres is configured to use a static file name. logrotate is
> configured to move the file, send SIGUSR1 to postgres syslogger, gzip
> the file and delete it.
> 2. logrotate starts the rotation. It moves the file and signals
> postgres to reopen it.
> 3. postgres fails to reopen the file because there are too many files
> open (ENFILE/EMFILE), which is a normal occurrence on heavily loaded
> systems. Or it doesn't open the new file because the rotation_disabled
> flag is set. It continues logging to the old file.
> 4. logrotate has no way to detect this failure, so it gzips the file
> and unlinks it.
> 5. postgres continues writing to the now unlinked file, and we lose an
> arbitrary amount of logs until the next successful rotation.
> With dynamic file names, logrotate can be told to skip open files, so
> that it doesn't touch our log file if we haven't switched to the new
> one. With a static file name, the log file is always open, so this
> method doesn't work. I'm not sure how to make this work reliably.
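
The failure in steps 1 to 5 above can be reproduced in miniature with the sketch below, on a POSIX filesystem. It is only a simulation: one process plays both the log writer and the rotation tool, the reopen request is assumed to have failed, and "archiving" is reduced to reading the file. The file name postgresql.log and the function name simulate_lost_logs are hypothetical.

```python
import os


def simulate_lost_logs(directory):
    """Replay the rotate / failed-reopen / unlink sequence in one process."""
    log_path = os.path.join(directory, "postgresql.log")
    rotated_path = log_path + ".1"

    # Step 1: the writer opens a static log file name and keeps it open.
    writer = open(log_path, "a")
    writer.write("before rotation\n")
    writer.flush()

    # Step 2: the rotation tool moves the file and asks for a reopen.
    os.rename(log_path, rotated_path)

    # Step 3: the reopen is assumed to fail, so the writer keeps using the
    # old descriptor, which now refers to the renamed file.
    writer.write("after failed reopen\n")
    writer.flush()

    # Step 4: the rotation tool archives the file (here: reads it) and
    # unlinks it, unaware that the writer never switched.
    with open(rotated_path) as archived_file:
        archived = archived_file.read()
    os.unlink(rotated_path)

    # Step 5: writes still succeed, but go to a file with no name.
    writer.write("lost line\n")
    writer.flush()
    writer.close()

    return archived, os.listdir(directory)
```

Run against an empty directory, the archived text contains the first two lines but not the third, and the directory listing comes back empty: the last write succeeded without error, yet nothing on disk can be used to recover it.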
The loss is unavoidable by any means, since logrotate works that
way by design: it doesn't care whether its peer did the work as
expected. Anyone who wants to avoid that loss can use copytruncate,
which trades it for a different, small loss that can happen at every
rotation, and we don't need to change anything for that case. Those who
want more reliability ought to use the PostgreSQL's genuine
NTT Open Source Software Center