Re: syslog_line_prefix

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joshua Tolley <eggyknap(at)gmail(dot)com>, jd <jd(at)commandprompt(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: syslog_line_prefix
Date: 2009-09-28 10:51:37
Message-ID: 9837222c0909280351k36f46206m7d199452c1bb2415@mail.gmail.com
Lists: pgsql-hackers

2009/9/28 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Sep 28, 2009 at 5:22 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Sun, Sep 27, 2009 at 23:03, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Sun, Sep 27, 2009 at 4:54 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>>> On Sun, 2009-09-27 at 16:15 -0400, Tom Lane wrote:
>>>>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>>>>> > Then why not send everything to syslog and have syslog filter it to the
>>>>> > places you want to?  That is what syslog is for, after all.
>>>>>
>>>>> We send all syslog output with the same identifier/priority/facility,
>>>>> so there's not a lot of hope of getting syslog to do any useful
>>>>> filtering (at least not with the versions of syslog I'm familiar with).
>>>>
>>>> Time to upgrade then. ;-)  For example, the default syslog in Fedora has
>>>> been rsyslog since Fedora 8, and that one can do a lot more than just
>>>> filter by identifier/priority/facility.  syslog-ng is another popular
>>>> example for a more featureful syslog.
>>>
>>> Presumably csvlog would be good for these sorts of things too, no?
>>> The whole point is it's machine-readable.
>>
>> If there were a way to pipe the csv log through an external program,
>> that would take care of much of the problem.
>
> tail -f is probably a bit too fragile for this purpose, but I think it
> would be possible to design a utility that would do this.  The idea
> would be to maintain a state file that would list the most recent CSV
> log file read and the byte offset of the first byte following the last
> line processed.  On every iteration, we just open up the last file
> read and read beginning at the designated offset through end of file.
> Then we check if a newer file is available and, if so, we begin
> reading that file.  When we're done reading, we update the state file.

That would mean we have to write everything to the file, though, so it
would be rather bad for the case where you want to log "just a little"
but are "delegating" the decision to the external process. And it
would double the disk I/O for the logfile (once written to the csv
log, once read back by the external process).

> There is the problem of what happens if we read a partial last line of
> a file being written, but that seems surmountable: if the last line
> read does not end in a newline, and no newer file is present, then
> don't include that partial line in the output, and record the offset
> of the beginning of that line in the state file.
>
> I'm not sure if this will work on Windows, but it should be OK on
> anything UNIX-ish.

Well, there will be sharing violations and such to deal with, but we
just need to make sure that the syslogger opens the file with the
proper sharing flags. Which I think it already does, actually.

>> And I guess if you make that program responsible for *everything* it
>> would work - you just set your logging level to log very much data,
>> and let the external process deal with it. If we implemented the
>> ability to have a different logging level for different destinations
>> you could keep text logging for other things, or you could just
>> delegate all that to the external process as well. That would
>> basically turn the syslogger into a process that reads from the input
>> and sends the data out to an external process. But it could then
>> implement things like automatic restart of the external process in
>> case of crash etc, in perhaps a much easier way than the postmaster
>> can do for the syslogger itself.
>
> The problem with having the syslogger send the data directly to an
> external process is that the external process might be unable to
> process the data as fast as syslogger is sending it.  I'm not sure
> exactly what will happen in that case, but it will definitely be bad.
> I think what will likely happen is that the entire database cluster
> will end up waiting on write(2) calls to various places and processing
> will grind to a halt.

We'll have the same issue if we have the syslogger write it to disk,
won't we? In fact, it might even be faster depending on how much
processing is done and what can be thrown away at that step, since it
could decrease the disk I/O needed in favor of CPU work.

> I think it's better to spool the log messages to files, and then let
> the external utility read the files.  The external utility can still
> fall behind, but even if it does the cluster will continue running.

The difficulty there is making it "live enough". But I guess if it
implements the same method as tail -f, it would manage that - the only
issue then is that it would require much more disk I/O.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
