From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: logging in high performance systems.
Date: 2011-11-24 04:45:24
Message-ID: 4ECDCBE4.3040409@2ndQuadrant.com
Lists: pgsql-hackers

On 11/23/2011 09:28 PM, Theo Schlossnagle wrote:
> The second thing I did was write a sample use of those hooks to
> implement a completely non-blocking fifo logger. (if it would block,
> it drops the log line). The concept is that we could run this without
> risk of negative performance impact due to slow log reading (choosing
> to drop logs in lieu of pausing). And a simple process could be
> written to consume from the fifo.
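
To make that concrete, here's a minimal sketch of the drop-rather-than-block
part in isolation; this is not Theo's actual patch, and the function names
and FIFO path are made up:

/*
 * Sketch only: open a FIFO for non-blocking writes and drop any line
 * that can't be written immediately.
 */
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static int fifo_fd = -1;

/* Open the FIFO for non-blocking writes.  open() fails with ENXIO until
 * a reader holds the other end, so callers can simply retry later. */
void
fifo_log_open(const char *path)
{
    signal(SIGPIPE, SIG_IGN);   /* a vanished reader becomes EPIPE, not a kill */
    fifo_fd = open(path, O_WRONLY | O_NONBLOCK);
}

/* Emit one log line, dropping it rather than blocking.  Non-blocking
 * writes of up to PIPE_BUF bytes to a pipe are all-or-nothing, so lines
 * get dropped whole instead of interleaved. */
void
fifo_log_emit(const char *line, size_t len)
{
    if (fifo_fd < 0)
        return;                 /* no reader attached yet: drop */
    if (write(fifo_fd, line, len) < 0 && (errno == EAGAIN || errno == EPIPE))
        ;                       /* pipe full or reader gone: drop */
}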

Non-blocking logging was one of the topics at the last developer's meeting,
which you may not have seen go by:
http://wiki.postgresql.org/wiki/PgCon_2011_Developer_Meeting#Improving_Logging
There was a reference to a pipe-based implementation from Magnus that I
haven't gotten a chance to track down yet. I think this area is going to
start hitting a lot more people over the next couple of years, since I'm
seeing it increasingly at two customers I consider "canary in a coal mine"
sentinels for performance issues.

I'm now roughly considering three types of users here:

-Don't care about the overhead of logging, but are sick of parsing text
files. Would prefer the data be in a table instead.
-Concerned enough about overhead that statement-level logging is
impractical to either a log file or a table, but can cope with logging
other things.
-Logging rate can burst high enough that messages must start being
dropped no matter where they go. Before making a big change here, log
file vs. table needs to be explored carefully to figure out which of the
two approaches has more reasonable behavior/performance trade-offs.

I've been trying to attack this starting at the middle, with the
pg_stat_statements rework Peter did for the current CommitFest. If
you've already worked out a way to simulate heavy logging as part of
what you've done here, I'd be quite interested to hear how well you feel
it handles the class of problem you're seeing. I've always assumed that
pushing the most common queries into shared memory and only showing them
on demand, rather than logging them one line at a time, could be a big
win for some installations. We're still a bit light on benchmarks
proving that's the case so far, though.
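
As a toy illustration of that reasoning (nothing like the extension's real
implementation, which keeps a proper hash table in shared memory), counting
executions in memory and only reporting when asked boils down to this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NSLOTS 1024             /* fixed size: collisions evict, nothing spills */

typedef struct
{
    char   *query;
    long    calls;
} StmtSlot;

static StmtSlot slots[NSLOTS];

/* Cheap string hash (djb2). */
static unsigned
hash_query(const char *q)
{
    unsigned    h = 5381;

    while (*q)
        h = h * 33 + (unsigned char) *q++;
    return h;
}

/* Count one execution instead of writing one log line. */
void
note_statement(const char *query)
{
    StmtSlot   *slot = &slots[hash_query(query) % NSLOTS];

    if (slot->query == NULL || strcmp(slot->query, query) != 0)
    {
        free(slot->query);      /* collision: evict the old entry */
        slot->query = strdup(query);
        slot->calls = 0;
    }
    slot->calls++;
}

/* The on-demand part: report accumulated counts only when asked. */
void
show_statements(void)
{
    for (int i = 0; i < NSLOTS; i++)
        if (slots[i].query)
            printf("%8ld  %s\n", slots[i].calls, slots[i].query);
}

The point is that a hot query costs a counter bump instead of a log write
every time it runs, while the interesting aggregate is still available
whenever someone asks for it.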

My assumption has been that a lossy logger was eventually going to be
necessary for busier sites; I just haven't been suffering from one
enough to hack on it yet. If it's possible to work this out in enough
detail to figure out where the hooks go, and to prove they work with at
least one consumer of them, I'd consider that a really useful thing to
try and squeeze into 9.2. The processing parts can always be improved
later based on production feedback, in line with my recent theme of
letting extensions that poke and probe existing hooks be one place to
brew next-version features.
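
For the "at least one consumer" part, here's roughly what one could look
like as a loadable module. The hook name and signature below are
assumptions modeled on our usual hook convention (save the previous value,
chain to it), not a committed API, and this reuses the hypothetical FIFO
functions from the sketch above:

#include "postgres.h"
#include "fmgr.h"
#include "utils/elog.h"

PG_MODULE_MAGIC;

/* From the FIFO sketch above (hypothetical). */
extern void fifo_log_open(const char *path);
extern void fifo_log_emit(const char *line, size_t len);

/* Assumed hook: called with each ErrorData as it is emitted. */
typedef void (*emit_log_hook_type) (ErrorData *edata);
extern emit_log_hook_type emit_log_hook;

static emit_log_hook_type prev_emit_log_hook = NULL;

static void
fifo_emit_log(ErrorData *edata)
{
    /* Never block here; fifo_log_emit() drops the line if the pipe is full. */
    if (edata->message)
        fifo_log_emit(edata->message, strlen(edata->message));

    if (prev_emit_log_hook)
        prev_emit_log_hook(edata);
}

void
_PG_init(void)
{
    fifo_log_open("/tmp/pg_log.fifo");  /* arbitrary path for the sketch */
    prev_emit_log_hook = emit_log_hook;
    emit_log_hook = fifo_emit_log;
}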

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
