Re: [PERFORM] DELETE vs TRUNCATE explanation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Farina <daniel(at)heroku(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Harold A(dot) Giménez <harold(dot)gimenez(at)gmail(dot)com>
Subject: Re: [PERFORM] DELETE vs TRUNCATE explanation
Date: 2012-07-16 18:39:58
Message-ID: CA+TgmoaEPLo5fmBX1JSkGK3umAamFVghwY7qDb4_QqzteZgUPA@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Mon, Jul 16, 2012 at 12:57 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> In my view, the elephant in the room here is that it's dramatically
>> inefficient for every backend to send an fsync request on every block
>> write.
>
> Yeah. This was better before the decision was taken to separate
> bgwriter from checkpointer; before that, only local communication was
> involved for the bulk of write operations (or at least so we hope).
> I remain less than convinced that that split was really a great idea.

Unfortunately, there are lots of important operations (like bulk
loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that
inevitably end up writing out their own dirty buffers. And even when
the background writer does write something, it's not always clear that
this is a positive thing. Here's Greg Smith commenting on the
more-is-worse phenomenon:

http://archives.postgresql.org/pgsql-hackers/2012-02/msg00564.php

Jeff Janes and I came up with what I believe to be a plausible
explanation for the problem:

http://archives.postgresql.org/pgsql-hackers/2012-03/msg00356.php

I kinda think we ought to be looking at fixing that for 9.2, and
perhaps even back-patching further, but nobody else seemed terribly
excited about it.

At any rate, I'm somewhat less convinced that the split was a good
idea than I was when we did it, mostly because we haven't really gone
anywhere with it subsequently. But I do think there's a good argument
that any process which is responsible for running a system call that
can take >30 seconds to return had better not be responsible for
anything else that matters very much. If background writing is one of
the things we do that doesn't matter very much, then we need to figure
out what's wrong with it (see above) and make it matter more. If it
already matters, then it needs to happen continuously and not get
suppressed while other tasks (like long fsyncs) are happening, at
least not without some evidence that such suppression is the right
choice from a performance standpoint.
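
To make the cost quoted at the top concrete, here is a rough sketch, not PostgreSQL code. It assumes an fsync request only needs to identify a file, modeled here as a hypothetical (relation, segment) pair, so repeated requests for the same file between checkpoints are redundant. The function names and the tuple layout are invented for illustration.

```python
from collections import OrderedDict

def forward_every_write(writes):
    """Naive scheme: one fsync request is queued for every block write."""
    return [(rel, seg) for (rel, seg, _block) in writes]

def forward_deduplicated(writes):
    """Keep only the first request for each (relation, segment) file.

    Assumption: a single fsync of the file covers all of its dirty blocks,
    so later requests for the same file add nothing.
    """
    pending = OrderedDict()
    for rel, seg, _block in writes:
        pending.setdefault((rel, seg), None)
    return list(pending)

# Hypothetical workload: 1000 block writes to one file, plus one write elsewhere.
writes = [("bigtable", 0, b) for b in range(1000)] + [("other", 0, 7)]
naive = forward_every_write(writes)
compact = forward_deduplicated(writes)
print(len(naive), len(compact))
```

In this toy workload the naive scheme queues one request per write, while the deduplicated one queues one per distinct file. It says nothing about where the deduplication should happen (in the backend, or in whichever process absorbs the requests), which is the actual question in this thread.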

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
