
Re: Load Distributed Checkpoints, take 3

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Load Distributed Checkpoints, take 3
Date: 2007-06-23 08:59:27
Message-ID: Pine.GSO.4.64.0706221659310.6983@westnet.com
Lists: pgsql-patches
This message is going to come off as kind of angry, and I hope you don't 
take that personally.  I'm very frustrated with this whole area right now 
but am unable to do anything to improve that situation.

On Fri, 22 Jun 2007, Tom Lane wrote:

> If you've got specific evidence why any of these things need to be 
> parameterized, let's see it.

All I'm trying to suggest here is that you might want to pause and 
consider whether you want to make a change that might break existing, 
happily working installations based just on the small number of tests that 
have been done on this patch so far.  A nice stack of DBT2 results is very 
informative, but the DBT2 workload is not everybody's workload.

Did you see anybody else predicting issues with the LDC patch on 
overloaded systems, like the ones starting to show up in the 150 
warehouse/90% latency figures in Heikki's most recent results?  The way I 
remember it, it was just me pushing to expose that problem, because I knew 
it was there from my unfortunately private tests, even though it was 
difficult to encounter the issue on other types of benchmarks (thanks 
again to Greg Stark and Heikki for helping with that).  But that's fine: 
if you want to blow off the rest of my suggestions now just because the 
other things I'm worried about are also very hard problems to expose and 
I can't hand you a smoking gun, that's your decision.

> Personally I think that we have a bad track record of exposing GUC 
> variables as a substitute for understanding performance issues at the 
> start, and this approach isn't doing any favors for DBAs.

I think this project has an awful track record of introducing new GUC 
variables without ever having a plan to follow through with a process for 
figuring out how they should be set.  The almost complete lack of 
standardization and useful tools for collecting performance information 
about this database boggles my mind, and you're never going to get the 
performance-related sections of the GUC list streamlined without it.

We were just talking about the mess that is effective_cache_size recently. 
As a more topical example here, the background writer was officially 
released in early 2005, with a bizarre collection of tunables.  I had to 
help hack on that code myself, over two years later, to even start 
exposing the internal statistics data needed to optimize it correctly. 
The main reason I can't prove some of my concerns is that I got so 
side-tracked adding the infrastructure needed to even show they exist that 
I wasn't able to nail down exactly what was going on well enough to 
generate a public test case before the project that exposed the issues 
wrapped up.
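To spell out what I mean by a bizarre collection of tunables, the 8.1/8.2-era background writer settings look roughly like this (defaults from memory, so treat the exact values as approximate):

```ini
# 8.2-era background writer tunables (defaults approximate)
bgwriter_delay = 200ms          # sleep between bgwriter rounds
bgwriter_lru_percent = 1.0      # portion of buffers near the LRU point scanned per round
bgwriter_lru_maxpages = 5       # max dirty LRU buffers written per round
bgwriter_all_percent = 0.333    # portion of the whole buffer pool scanned per round
bgwriter_all_maxpages = 5       # max dirty buffers written by the all-scan per round
```

Nothing about these names or defaults tells a DBA how the two scans interact, which is exactly why the statistics infrastructure had to come first.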

> Right at the moment the best thing to do seems to be to enable LDC with 
> a low minimum write rate and a high target duration, and remove the 
> thereby-obsoleted "all buffers" scan of the existing bgwriter logic.

I have reason to believe there's a set of use cases where a more 
aggressive LDC approach than the one everyone seems to be leaning toward 
is appropriate, which would then reinvigorate the need for the all-scan 
BGW component.  I have a whole new design for the non-LRU background 
writer that fixes most of what's wrong with it, which I was waiting for 
the 8.4 cycle to send out and get feedback on; but if everybody is hell 
bent on just yanking the whole thing out in preference to these really 
lazy checkpoints, go ahead and do what you want.  My life would be easier 
if I just tossed all that out and forgot about the whole thing, and I'm 
real close to doing just that right now.
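For reference, the "low minimum write rate, high target duration" combination Tom describes maps onto configuration along these lines (the knob names and units were still in flux in the patch at this point, so this is a sketch of the style of settings being discussed, not a definitive list):

```ini
# Sketch of load-distributed checkpoint settings (names and units approximate)
checkpoint_timeout = 5min            # how often timed checkpoints start
checkpoint_segments = 16             # xlog segments consumed before a forced checkpoint
checkpoint_completion_target = 0.9   # spread writes across 90% of the checkpoint interval
checkpoint_rate = 512.0              # floor on the write rate while spreading (KB/s)
```

Note that the longer the spread, the more xlog segments accumulate before recycling, which is the cost I keep harping on below.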

>> Did anyone else ever notice that when a new xlog segment is created,
>> the write to clear it out doesn't happen via direct I/O like the rest
>> of the xlog writes do?
> It's not supposed to matter, because that path isn't supposed to be
> taken often.

Yes, but in the situations where it does happen--when checkpoints take so 
much longer than expected that more segments have to be created, or when 
the archive logger fails--it badly impacts an already unpleasant situation.

>> there's a whole class of issues involving recycling xlog segments this
>> would introduce I would be really unhappy with the implications of.
> Really?  Name one.

You already mentioned expansion of the number of log segments used, which 
is a primary issue.  Acting like all the additional segments consumed by 
some of the more extreme checkpoint spreading approaches are cost-free is 
completely unrealistic IMHO.  In the situation I just described above, I 
also noticed that the way O_DIRECT sync writes get mixed with buffered WAL 
writes seems to cause some weird I/O scheduling behavior in Linux that can 
make worst-case latency degrade.  But since I can't prove that, I guess I 
might as well not mention it either.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

