Re: How could we make it simple to access the log as a table?

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Stephen Frost <sfrost(at)snowman(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How could we make it simple to access the log as a table?
Date: 2012-05-28 18:21:11
Message-ID: CAFNqd5UHBS-NEf5EJj_OrOCOBCsnDPiQchMFVNA0qKJ9fBu0gg@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 28, 2012 at 1:45 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On May 28, 2012, at 11:57 AM, Christopher Browne <cbbrowne(at)gmail(dot)com> wrote:
>> 2.  Ask Syslog
>>
>> My favorite way to configure *my* PG instances (e.g. - those that I
>> use for testing) is for them to forward messages to syslog.  That way
>> they, and my Slony test instances, are all logging to one common
>> place, rather than the logs getting strewn in a bunch of places.
>>
>> An FDW that could talk to syslog would be a nifty idea, though there
>> are enough different syslog implementations around to, again, injure
>> the simplicity of this.
>
> What does "talk to syslog" mean in this context?  Syslog doesn't store any data; it just routes it around.

Right, I guess that's a bit like saying, "let's have something
listening to procmail," when that's really just a filter.

If there were some output format that was particularly amenable to our
use (e.g., simple to configure via the "big red button" that you
suggest), that would be nice.

>> [Also, mumble, mumble, syslog might be forwarding to a remote server,
>> further complications...]
>>
>> 3.  Lossy logging is desired by some doing high performance systems
>> where they can't afford to capture everything
>>
>> http://archives.postgresql.org/pgsql-hackers/2011-11/msg01437.php
>>
>> One approach that I know Theo has used has been to throw events onto a
>> Spread channel, and have a listener pulling and aggregating the events
>> on a best-efforts basis.  I'm not sure if I should treat that as a
>> separate answer, or as part of the same.
>>
>> 4.  For a while, I had my syslog set up to capture logs into a
>> Postgres table.  Very cool, but pretty big slowdown.
>>
>> What's notably messy, right now, is that we've got a bunch of logging
>> targets where there's nothing resembling a uniform way of *accessing*
>> the logs.  It seems to me that the messiness and non-uniformity are
>> the tough part of the problem.
>
> Yeah, I agree.  I think what is missing here is something that can be read (and maybe indexed?) like a table, but written by a pretty dumb process.  It's not terribly workable to have PG log to PG, because there are too many situations where the problem you're trying to report would frustrate your attempt to report it.  At the other end of the spectrum, our default log format is easy to generate but (a) impoverished, not even including a time stamp by default and (b) hard to parse, especially because two customers with the same log_line_prefix is a rare nicety.  The  CSV format is both rich and machine-parseable (good start!) but it takes an unreasonable amount of work to make it usefully queryable.  We need something that looks more like a big red button.

There's a case to be made for some lossier "NoSQL-y" thing here. But
I'm not sure what size of solution would be enough. I hate the idea of
requiring the deployment of *another* DBMS (however "lite"), but
reading from text files isn't particularly nice either.
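
To make the "reading from text files" option concrete, here is a rough
sketch of treating csvlog output as rows using Python's csv module. The
column list is my assumption, taken from the csvlog section of the 9.x
docs (later releases may append columns), and the sample lines and the
helper names read_csvlog and errors_only are made up for illustration.

```python
import csv
import io

# Assumed csvlog column order (per the PostgreSQL 9.x documentation).
CSVLOG_COLUMNS = [
    "log_time", "user_name", "database_name", "process_id",
    "connection_from", "session_id", "session_line_num", "command_tag",
    "session_start_time", "virtual_transaction_id", "transaction_id",
    "error_severity", "sql_state_code", "message", "detail", "hint",
    "internal_query", "internal_query_pos", "context", "query",
    "query_pos", "location", "application_name",
]


def read_csvlog(stream):
    """Yield one dict per log entry from a csvlog-formatted text stream."""
    for row in csv.reader(stream):
        # zip() stops at the shorter sequence, so any columns beyond the
        # assumed list are silently ignored.
        yield dict(zip(CSVLOG_COLUMNS, row))


def errors_only(entries):
    """Keep only entries whose severity is ERROR, FATAL, or PANIC."""
    return [e for e in entries
            if e["error_severity"] in ("ERROR", "FATAL", "PANIC")]


# Two invented log lines, purely illustrative.
SAMPLE = (
    '2012-05-28 18:21:11.123 UTC,"alice","appdb",4242,"127.0.0.1:51234",'
    '"4fc3c1a7.1092",1,"SELECT",2012-05-28 18:20:00 UTC,2/7,0,ERROR,42P01,'
    '"relation ""missing"" does not exist",,,,,,"SELECT * FROM missing",'
    '15,,"psql"\n'
    '2012-05-28 18:21:12.456 UTC,,,4100,,"4fc3c000.1004",1,,'
    '2012-05-28 18:00:00 UTC,,0,LOG,00000,'
    '"checkpoint starting: time",,,,,,,,,""\n'
)

if __name__ == "__main__":
    for entry in errors_only(read_csvlog(io.StringIO(SAMPLE))):
        print(entry["log_time"], entry["sql_state_code"], entry["message"])
```

Even this much only gets you a linear scan over one file, with no
indexing, no handling of rotation, and no SQL, which is roughly the gap
between "machine-parseable" and "usefully queryable."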

Perhaps push the logs into an unlogged table on an extra PG instance,
where an FDW tries to make that accessible? A fair bit of process
needs to live behind that "big red button," and that's at least a
plausible answer.

What's needed is to figure out what restrictions are acceptable to
impose to have something that's "button-worthy."
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
