Re: drop/truncate table sucks for large values of shared buffers

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop/truncate table sucks for large values of shared buffers
Date: 2015-06-27 14:36:14
Message-ID: CABwTF4UCG=kKK+u1WJRbetHP7hdvdPcc_o2wSmOo9TUJxY2pig@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 26, 2015 at 9:45 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> Sometime back on one of the PostgreSQL blog [1], there was
> discussion about the performance of drop/truncate table for
> large values of shared_buffers and it seems that as the value
> of shared_buffers increase the performance of drop/truncate
> table becomes worse. I think those are not often used operations,
> so it never became priority to look into improving them if possible.
>
> I have looked into it and found that the main reason for such
> a behaviour is that for those operations it traverses whole
> shared_buffers and it seems to me that we don't need that
> especially for not-so-big tables. We can optimize that path
> by looking into buff mapping table for the pages that exist in
> shared_buffers for the case when table size is less than some
> threshold (say 25%) of shared buffers.
>
> Attached patch implements the above idea and I found that
> performance doesn't dip much with patch even with large value
> of shared_buffers. I have also attached script and sql file used
> to take performance data.
>

+1 for the effort to improve this.

With your technique added, there are three possible ways the search can happen:
a) scan NBuffers and scan the list of relations, b) scan NBuffers and bsearch
the list of relations, and c) scan the list of relations and then invalidate
the blocks of each fork from shared buffers. Would it be worth finding one
technique that serves decently across the whole range, from low-end to
high-end shared_buffers?
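
To make the comparison concrete, here is a minimal, self-contained sketch of
the three strategies on a toy buffer pool. It is not PostgreSQL code: ToyBuffer,
INVALID_REL, the open-addressing mapping table and every function name below
are hypothetical stand-ins, and the only number taken from the thread is the
25% threshold proposed in the quoted mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define INVALID_REL 0u  /* hypothetical marker: buffer holds no page */

/* Toy buffer header: which relation and block the buffer currently holds. */
typedef struct {
    unsigned rel;
    unsigned block;
} ToyBuffer;

static int cmp_rel(const void *a, const void *b)
{
    unsigned x = *(const unsigned *) a, y = *(const unsigned *) b;
    return (x > y) - (x < y);
}

/* (a) Scan every buffer; linear search of the relation list. */
size_t drop_scan_linear(ToyBuffer *bufs, size_t nbuffers,
                        const unsigned *rels, size_t nrels)
{
    size_t dropped = 0;
    for (size_t i = 0; i < nbuffers; i++)
        for (size_t j = 0; j < nrels; j++)
            if (bufs[i].rel != INVALID_REL && bufs[i].rel == rels[j]) {
                bufs[i].rel = INVALID_REL;
                dropped++;
                break;
            }
    return dropped;
}

/* (b) Scan every buffer; binary search of the (sorted here) relation list. */
size_t drop_scan_bsearch(ToyBuffer *bufs, size_t nbuffers,
                         unsigned *rels, size_t nrels)
{
    size_t dropped = 0;
    qsort(rels, nrels, sizeof *rels, cmp_rel);
    for (size_t i = 0; i < nbuffers; i++)
        if (bufs[i].rel != INVALID_REL &&
            bsearch(&bufs[i].rel, rels, nrels, sizeof *rels, cmp_rel)) {
            bufs[i].rel = INVALID_REL;
            dropped++;
        }
    return dropped;
}

static size_t tag_hash(unsigned rel, unsigned block, size_t nslots)
{
    return ((size_t) rel * 2654435761u + block) % nslots;
}

/* Build a toy mapping table: slots[] holds a buffer index, or -1 if empty. */
void map_build(const ToyBuffer *bufs, size_t nbuffers,
               int *slots, size_t nslots)
{
    assert(nslots > nbuffers);  /* an empty slot must end every probe */
    for (size_t s = 0; s < nslots; s++)
        slots[s] = -1;
    for (size_t i = 0; i < nbuffers; i++) {
        if (bufs[i].rel == INVALID_REL)
            continue;
        size_t s = tag_hash(bufs[i].rel, bufs[i].block, nslots);
        while (slots[s] != -1)
            s = (s + 1) % nslots;
        slots[s] = (int) i;
    }
}

/* (c) Probe the mapping table once per block; rel must not be INVALID_REL. */
size_t drop_by_lookup(ToyBuffer *bufs, const int *slots, size_t nslots,
                      unsigned rel, unsigned nblocks)
{
    size_t dropped = 0;
    for (unsigned blk = 0; blk < nblocks; blk++)
        for (size_t s = tag_hash(rel, blk, nslots); slots[s] != -1;
             s = (s + 1) % nslots) {
            ToyBuffer *b = &bufs[slots[s]];
            if (b->rel == rel && b->block == blk) {
                b->rel = INVALID_REL;  /* stale slot now matches nothing */
                dropped++;
                break;
            }
        }
    return dropped;
}

/* Threshold from the quoted mail: use (c) below 25% of the buffer count. */
bool prefer_lookup(size_t total_blocks, size_t nbuffers)
{
    return total_blocks < nbuffers / 4;
}
```

In this model (a) costs about NBuffers * nrels comparisons, (b) about
NBuffers * log(nrels), and (c) one table probe per block of the relation
regardless of NBuffers, which is why (c) only pays off when the relation is
small relative to shared_buffers.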

On patch:

DropForkSpecificBuffers() mixes two naming styles, my_name and myName. Given
this is a new function, it'd help to be consistent.

s/blk_count/blockNum/

s/new//, e.g. in newTag, because there's no corresponding tag/oldTag
variable in the function.

s/blocksToDel/blocksToDrop/. BTW, we never pass anything other than the
total number of blocks in the fork, so we may as well call it just
numBlocks.

s/traverse_buf_freelist/scan_shared_buffers/, because when it is true, we
scan the whole shared_buffers.

s/rel_count/rel_num/

Reduce indentation/tab in header-comments of DropForkSpecificBuffers(). But
I see there's precedent in neighboring functions, so this may be okay.

Doing pfree() of num_blocks, num_fsm_blocks and num_vm_blocks in one place
(instead of two, at different indentation levels) would help readability.
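
As an illustration of the single-cleanup shape I have in mind, here is a
rough sketch. It does not reproduce the patch: process_forks() and its
use_full_scan flag are hypothetical, and calloc()/free() stand in for
palloc()/pfree() so that the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the patch's logic. Returns the number of forks
 * handled through the lookup path, 0 if the full scan was chosen, or -1 on
 * allocation failure.
 */
int process_forks(size_t nrels, bool use_full_scan)
{
    int       handled = -1;
    unsigned *num_blocks;
    unsigned *num_fsm_blocks;
    unsigned *num_vm_blocks;

    assert(nrels > 0);
    num_blocks = calloc(nrels, sizeof *num_blocks);
    num_fsm_blocks = calloc(nrels, sizeof *num_fsm_blocks);
    num_vm_blocks = calloc(nrels, sizeof *num_vm_blocks);

    if (num_blocks == NULL || num_fsm_blocks == NULL || num_vm_blocks == NULL)
        goto cleanup;

    if (use_full_scan) {
        handled = 0;            /* early exit still reaches the one cleanup */
        goto cleanup;
    }

    handled = (int) (3 * nrels);  /* main, FSM and VM fork per relation */

cleanup:
    /* The only place the three arrays are released; free(NULL) is a no-op. */
    free(num_blocks);
    free(num_fsm_blocks);
    free(num_vm_blocks);
    return handled;
}
```

Every exit path funnels through the same three calls, so adding a fourth
array later means touching one spot instead of two.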

Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/
