Re: logging in high performance systems.

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: logging in high performance systems.
Date: 2011-11-24 04:45:24
Message-ID: 4ECDCBE4.3040409@2ndQuadrant.com
Lists: pgsql-hackers
On 11/23/2011 09:28 PM, Theo Schlossnagle wrote:
> The second thing I did was write a sample use of those hooks to
> implement a completely non-blocking fifo logger. (if it would block,
> it drops the log line).  The concept is that we could run this without
> risk of negative performance impact due to slow log reading (choosing
> to drop logs in lieu of pausing).  And a simple process could be
> written to consume from the fifo.

This was one of the topics at the last developer's meeting, which you
might not have seen go by:
http://wiki.postgresql.org/wiki/PgCon_2011_Developer_Meeting#Improving_Logging
There was a reference to a pipe-based implementation from Magnus that I
haven't had a chance to track down yet.  I think this area is going to
start hitting a lot more people over the next couple of years, since
I'm seeing it increasingly at two customers I consider "canary in a
coal mine" sentinels for performance issues.

I'm now roughly considering three types of users here:

-Don't care about the overhead of logging, but are sick of parsing text
files.  Would prefer the data be in a table instead.
-Concerned enough about overhead that statement-level logging is
impractical, whether it goes to a log file or a table, but can cope
with logging for other things.
-Logging rate can burst high enough that messages must start being
dropped, no matter where they go.  Before making a big change, log
file vs. table needs to be carefully explored to figure out which of
the two approaches has more reasonable behavior/performance trade-offs.

I've been trying to attack this starting from the middle group, with
the pg_stat_statements rework Peter here did for the current
CommitFest.  If you've already worked out a way to simulate heavy
logging as part of what you've done here, I'd be quite interested to
hear how capable you feel it is for the class of problem you're seeing.
I've always assumed that pushing the most common queries into shared
memory and only showing them on demand, rather than logging them a line
at a time, could be a big win for some places.  We're still a bit light
on benchmarks proving that's the case, though.
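
To make the "aggregate in memory, show on demand" idea concrete, here is
a rough Python sketch.  It is not how pg_stat_statements is implemented:
the StatementStats class, the normalize() helper, and the regex-based
literal stripping are all made up for illustration, and a plain dict
stands in for shared memory.

```python
import re
from collections import defaultdict

# Hypothetical, crude normalizer: matches quoted strings and bare numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize(query):
    """Replace literals with '?' and collapse runs of whitespace."""
    return " ".join(_LITERAL.sub("?", query).split())


class StatementStats:
    """Accumulate per-statement counters instead of emitting a log line each."""

    def __init__(self):
        # normalized query text -> [call count, total elapsed ms]
        self._stats = defaultdict(lambda: [0, 0.0])

    def record(self, query, elapsed_ms):
        entry = self._stats[normalize(query)]
        entry[0] += 1
        entry[1] += elapsed_ms

    def top(self, n=10):
        """Return (query, calls, total_ms) rows, most total time first."""
        rows = [(q, calls, total) for q, (calls, total) in self._stats.items()]
        return sorted(rows, key=lambda row: row[2], reverse=True)[:n]
```

The cost per statement is one dictionary update rather than one write to
the log, and nothing is formatted until someone asks for top().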

My assumption has been that eventually a lossy logger was going to be
necessary for busier sites; I just haven't been suffering from the lack
of one enough to hack on it yet.  If it's possible to work this out in
enough detail to figure out where the hooks go, and to prove they work
with at least one consumer of them, I'd consider that a really useful
thing to try and squeeze into 9.2.  The processing parts can always be
further improved later based on production feedback, going along with
my recent theme of letting extensions that poke and probe existing
hooks be one place to brew features for the next version.
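
For anyone following along who wants to see the drop-instead-of-block
behavior in isolation, here is a minimal Python sketch of a lossy FIFO
writer.  It is not Theo's patch and does not use any PostgreSQL hook;
the LossyFifoLogger class and its dropped counter are hypothetical
names, and it assumes a POSIX system where the FIFO already exists.

```python
import errno
import os
import select


class LossyFifoLogger:
    """Write lines to a FIFO without blocking; count and drop on failure."""

    def __init__(self, path):
        self.path = path
        self.fd = None
        self.dropped = 0

    def _ensure_open(self):
        if self.fd is None:
            try:
                # A non-blocking open for writing fails with ENXIO
                # while no reader has the FIFO open.
                self.fd = os.open(self.path, os.O_WRONLY | os.O_NONBLOCK)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
        return self.fd is not None

    def log(self, line):
        data = line.encode("utf-8") + b"\n"
        # Writes of at most PIPE_BUF bytes are all-or-nothing on a pipe,
        # so longer lines are dropped rather than risk a partial write.
        if len(data) > select.PIPE_BUF or not self._ensure_open():
            self.dropped += 1
            return False
        try:
            os.write(self.fd, data)
            return True
        except BlockingIOError:
            pass  # pipe is full: the reader is too slow
        except BrokenPipeError:
            os.close(self.fd)  # reader went away; reopen on a later call
            self.fd = None
        self.dropped += 1
        return False
```

A consumer can be as simple as cat reading from the FIFO.  The dropped
counter is there so the loss rate can at least be reported somewhere.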

-- 
Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

