Re: [PERFORM] DELETE vs TRUNCATE explanation

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Farina <daniel(at)heroku(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, "Harold A(dot) Giménez" <harold(dot)gimenez(at)gmail(dot)com>
Subject: Re: [PERFORM] DELETE vs TRUNCATE explanation
Date: 2012-07-18 03:22:22
Message-ID: 50062BEE.80700@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 07/16/2012 02:39 PM, Robert Haas wrote:
> Unfortunately, there are lots of important operations (like bulk
> loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that
> inevitably end up writing out their own dirty buffers. And even when
> the background writer does write something, it's not always clear that
> this is a positive thing. Here's Greg Smith commenting on the
> more-is-worse phenonmenon:
>
> http://archives.postgresql.org/pgsql-hackers/2012-02/msg00564.php

You can add "crash recovery" to the list of things where the interaction
with the OS write cache matters a lot too, something I just took a
beating and learned from recently. Since the recovery process is
essentially one giant unified backend, how effectively the background
writer and/or checkpointer move writes from recovery to themselves is
really important. It's a bit easier to characterize than a complicated
mixed set of clients, which has given me a couple of ideas to chase down.

What I've been doing for much of the last month (instead of my original
plan of reviewing patches) is moving toward the bottom of characterizing
that under high pressure. It provides an even easier way to compare
multiple write strategies at the OS level than regular pgbench-like
benchmarks. Recovery playback with a different tuning becomes as simple
as rolling back to a simple base backup and replaying all the WAL,
possibly including some number of bulk operations that showed up. You
can measure that speed instead of transaction-level throughput. I'm
seeing the same ~100% difference in performance between various Linux
tunings on recovery as I was getting on VACUUM tests, and it's a whole
lot easier to setup and (ahem) replicate the results. I'm putting
together a playback time benchmark based on this observation.

The fact that I have servers all over the place now with >64GB worth of
RAM has turned the topic of how much dirty memory should be used for
write caching into a hot item for me again in general too. If I live
through 9.3 development, I expect to have a lot more ideas about how to
deal with this whole area play out in the upcoming months. I could
really use a cool day to sit outside thinking about it right now.

> Jeff Janes and I came up with what I believe to be a plausible
> explanation for the problem:
>
> http://archives.postgresql.org/pgsql-hackers/2012-03/msg00356.php
>
> I kinda think we ought to be looking at fixing that for 9.2, and
> perhaps even back-patching further, but nobody else seemed terribly
> excited about it.

FYI, I never rejected any of that thinking, I just haven't chewed on
what you two were proposing. If that's still something you think should
be revisited for 9.2, I'll take a longer look at it. My feeling on this
so far has really been that the write blocking issues are much larger
than the exact logic used by the background writer during the code you
were highlighting, which I always saw as more active/important during
idle periods. This whole area needs to get a complete overhaul during
9.3 though, especially since there are plenty of people who want to fit
checksum writes into that path too.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2012-07-18 04:00:08 Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
Previous Message xu2002261 2012-07-18 02:16:35 Re: During Xlog replaying, is there maybe emitted xlog?

Browse pgsql-performance by date

  From Date Subject
Next Message Dave Crooke 2012-07-18 03:51:35 Re: Linux memory zone reclaim
Previous Message Scott Marlowe 2012-07-18 02:00:35 Re: Linux memory zone reclaim