Re: DELETE vs TRUNCATE explanation

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Harold A(dot) Giménez <harold(dot)gimenez(at)gmail(dot)com>
Cc: Daniel Farina <daniel(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: DELETE vs TRUNCATE explanation
Date: 2012-07-13 01:00:49
Message-ID: CAMkU=1zJK6WiS9-6eO8rNmRpKSSB4=uw2mTsD=k64mNh__=Dcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Thu, Jul 12, 2012 at 4:21 PM, Harold A. Giménez
<harold(dot)gimenez(at)gmail(dot)com> wrote:
>
> > What is shared_buffers?
>
>
> 1600kB

That is really small, so the buffer flushing should not be a problem.
Unless you mean 1600MB.

> > > This is a rather small schema -- probably a half a dozen tables, and
> > > probably about a dozen indexes. This application is entirely
> > > unremarkable in its test-database workload: it wants to load a few
> > > records, do a few things, and then clear those handful of records.
> >
> > How many rounds of truncation does one rake do? I.e. how many
> > truncations are occurring over the course of that 1 minute or 15
> > minutes?
>
> All tables are cleared out after every test. On this particular project, I'm
> running 200+ tests in 1.5 minutes (or 30 seconds with DELETE instead of
> TRUNCATE). For another, bigger project it's running 1700+ tests in about a
> minute. You can do the math from there.

so 1700 rounds * 18 relations = truncates 30,600 per minute.

That is actually faster than I get truncates to go when I am purely
limited by CPU.

I think the problem is in the Fsync Absorption queue. Every truncate
adds a FORGET_RELATION_FSYNC to the queue, and processing each one of
those leads to sequential scanning the checkpointer's pending ops hash
table, which is quite large. It is almost entirely full of other
requests which have already been canceled, but it still has to dig
through them all. So this is essentially an N^2 operation.

I'm not sure why we don't just delete the entry instead of marking it
as cancelled. It looks like the only problem is that you can't delete
an entry other than the one just returned by hash_seq_search. Which
would be fine, as that is the entry that we would want to delete;
except that mdsync might have a different hash_seq_search open, and so
it wouldn't be safe to delete.

If the segno was taken out of the hash key and handled some other way,
then the forgetting could be done with a simple hash look up rather
than a full scan.

Maybe we could just turn off the pending ops table altogether when
fsync=off, but since fsync is PGC_SIGHUP it is not clear how you could
safely turn it back on.

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2012-07-13 01:01:00 Re: has_language_privilege returns incorrect answer for non-superuser
Previous Message Aidan Van Dyk 2012-07-13 00:59:43 Re: Synchronous Standalone Master Redoux

Browse pgsql-performance by date

  From Date Subject
Next Message Yan Chunlu 2012-07-13 04:02:02 Re: how could select id=xx so slow?
Previous Message Harold A. Giménez 2012-07-12 23:21:13 Re: DELETE vs TRUNCATE explanation