Re: slow dropping of tables, DropRelFileNodeBuffers, tas

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slow dropping of tables, DropRelFileNodeBuffers, tas
Date: 2012-05-30 12:34:39
Message-ID: CA+Tgmob_9hMf+Yu3R+Gq-kdo0MaGfSX7Vmgw+b32SUE63zvexg@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 30, 2012 at 7:10 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> So we drop the buffers for each relation fork separately, which means that
> we scan the buffer pool four times. Relation forks in 8.4 introduced that
> issue, and 9.1 made it worse by adding another fork for unlogged tables.
> With some refactoring, we could scan the buffer pool just once. That would
> help a lot.

+1.
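
To make the cost difference concrete, here is a minimal sketch in C, not the actual bufmgr.c code. ToyBuffer, N_FORKS, drop_per_fork() and drop_all_forks() are hypothetical names made up for illustration, and locking is ignored entirely. It only shows that matching on the relation alone lets one pass over the pool do the work of four.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_FORKS 4               /* assumption: main, FSM, VM and init forks */

/* Hypothetical, heavily simplified stand-in for a buffer header. */
typedef struct
{
    unsigned rel_id;            /* relation the cached page belongs to */
    int      fork;              /* fork number within that relation */
    bool     valid;             /* false once the buffer has been dropped */
} ToyBuffer;

/* Per-fork approach: one full pass over the pool for each fork.
 * Returns the number of buffers visited. */
size_t
drop_per_fork(ToyBuffer *pool, size_t n, unsigned rel_id, size_t *dropped)
{
    size_t visited = 0;

    assert(pool != NULL && dropped != NULL);
    *dropped = 0;
    for (int fork = 0; fork < N_FORKS; fork++)
    {
        for (size_t i = 0; i < n; i++)
        {
            visited++;
            if (pool[i].valid && pool[i].rel_id == rel_id &&
                pool[i].fork == fork)
            {
                pool[i].valid = false;
                (*dropped)++;
            }
        }
    }
    return visited;
}

/* Refactored approach: a single pass that matches any fork of the relation.
 * Returns the number of buffers visited. */
size_t
drop_all_forks(ToyBuffer *pool, size_t n, unsigned rel_id, size_t *dropped)
{
    size_t visited = 0;

    assert(pool != NULL && dropped != NULL);
    *dropped = 0;
    for (size_t i = 0; i < n; i++)
    {
        visited++;
        if (pool[i].valid && pool[i].rel_id == rel_id)
        {
            pool[i].valid = false;
            (*dropped)++;
        }
    }
    return visited;
}
```

Both functions drop the same set of buffers; the second one visits each buffer once instead of N_FORKS times.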

> Also, I wonder if DropRelFileNodeBuffers() could scan the pool without
> grabbing the spinlocks on every buffer? It could do an unlocked test first,
> and only grab the spinlock on buffers that need to be dropped.

I think it would be possible for the unlocked test to indicate that
the buffer should be dropped when it really ought not to be, because
someone else might be in the middle of changing the buffer tag, and
that's not atomic. So you'd have to recheck after taking the
spinlock. However, I don't think it's possible for the unlocked test
to report a false negative, because we've already taken
AccessExclusiveLock on the relation, which had better be enough to
guarantee that nobody's pulling in any more buffers from that relation
(if it doesn't guarantee that, the current code is already broken).
Acquiring a heavyweight lock also interposes a full memory barrier,
which should eliminate any risks due to memory-ordering effects.
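
A rough sketch of that check-then-recheck pattern, under stated assumptions: this is not PostgreSQL code, ToyBuffer and the helper functions are hypothetical, the per-buffer spinlock is modelled with a C11 atomic_flag, and the buffer tag is reduced to a single atomic integer so that the unlocked read is well defined in C11. The real tag has several fields and is not updated atomically, which is one more reason the recheck under the lock is required.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a buffer header. */
typedef struct
{
    atomic_flag      lock;      /* toy per-buffer spinlock */
    _Atomic unsigned rel_id;    /* stand-in for the buffer tag */
    bool             valid;     /* protected by lock */
} ToyBuffer;

void
toy_buffer_init(ToyBuffer *b, unsigned rel_id, bool valid)
{
    atomic_flag_clear(&b->lock);
    atomic_init(&b->rel_id, rel_id);
    b->valid = valid;
}

static void
buf_lock(ToyBuffer *b)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;
}

static void
buf_unlock(ToyBuffer *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Drops every valid buffer tagged with rel_id.  Returns the number dropped
 * and reports how many spinlock acquisitions were needed. */
size_t
drop_with_precheck(ToyBuffer *pool, size_t n, unsigned rel_id,
                   size_t *locks_taken)
{
    size_t dropped = 0;

    assert(pool != NULL && locks_taken != NULL);
    *locks_taken = 0;
    for (size_t i = 0; i < n; i++)
    {
        /* Unlocked test: may say "match" wrongly, so it is only a filter. */
        if (atomic_load_explicit(&pool[i].rel_id,
                                 memory_order_relaxed) != rel_id)
            continue;

        buf_lock(&pool[i]);
        (*locks_taken)++;

        /* Recheck under the lock before actually dropping anything. */
        if (pool[i].valid &&
            atomic_load_explicit(&pool[i].rel_id,
                                 memory_order_relaxed) == rel_id)
        {
            pool[i].valid = false;
            dropped++;
        }
        buf_unlock(&pool[i]);
    }
    return dropped;
}
```

The sketch skips a buffer on a negative unlocked test without ever locking it. That is only safe under the argument above: the caller already holds a lock that prevents new buffers for the relation from appearing, so a negative result cannot be stale in the dangerous direction.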

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
