Re: [PATCH] Speedup truncates of relation forks

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>
Subject: Re: [PATCH] Speedup truncates of relation forks
Date: 2019-09-03 12:44:25
Message-ID: CAHGQGwHwgi55SLP7Zpgaea9-poveVCd+TRTNixdHDYEZ3LDhMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 24, 2019 at 9:58 AM Jamison, Kirk <k(dot)jamison(at)jp(dot)fujitsu(dot)com> wrote:
>
> Hi,
>
> I repeated the recovery performance test before, and found out that I made a
> wrong measurement.
> Using the same steps indicated in the previous email (24GB shared_buffers for my case),
> the recovery time still significantly improved compared to head
> from "13 minutes" to "4 minutes 44 seconds" //not 30 seconds.
> It's expected because the measurement of vacuum execution time (no failover)
> which I reported in the first email is about the same (although 5 minutes):
> > HEAD results
> > 3) 24GB shared_buffers = 14 min 13.598 s
> > PATCH results
> > 3) 24GB shared_buffers = 5 min 35.848 s
>
>
> Reattaching the patch here again. The V5 of the patch fixed the compile error
> mentioned before and mainly addressed the comments/advice of Sawada-san.
> - updated more accurate comments describing only current behavior, not history
> - updated function name: visibilitymap_truncate_prepare()
> - moved the setting of values for smgr_{fsm,vm}_nblocks inside the smgrtruncate()
>
> I'd be grateful if anyone could provide comments, advice, or insights.
> Thank you again in advance.

Thanks for the patch!

-smgrdounlinkfork(SMgrRelation reln, ForkNumber forknum, bool isRedo)
+smgrdounlinkfork(SMgrRelation reln, ForkNumber *forknum, int nforks,
bool isRedo)

smgrdounlinkfork() is dead code. Per the discussion [1], this unused
function was left intentionally. But it's still dead code since 2012,
so I'd like to remove it. Or, even if we decide to keep that function
for some reasons, I don't think that we need to update that so that
it can unlink multiple forks at once. So, what about keeping
smgrdounlinkfork() as it is?

[1]
https://www.postgresql.org/message-id/1471.1339106082@sss.pgh.pa.us

+ for (int i = 0; i < nforks; i++)

The variable "i" should not be declared in for loop
per PostgreSQL coding style.

+ /* Check with the lower bound block number and skip the loop */
+ if (bufHdr->tag.blockNum < minBlock)
+ continue; /* skip checking the buffer pool scan */

Because of the above code, the following source comment in bufmgr.c
should be updated.

* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.

And, first of all, is this check really useful for performance?
Since firstDelBlock for FSM fork is usually small,
minBlock would also be small. So I'm not sure how much
this is helpful for performance.

When relation is completely truncated at all (i.e., the number of block
to delete first is zero), can RelationTruncate() and smgr_redo() just
call smgrdounlinkall() like smgrDoPendingDeletes() does, instead of
calling MarkFreeSpaceMapTruncateRel(), visibilitymap_truncate_prepare()
and smgrtruncate()? ISTM that smgrdounlinkall() is faster and simpler.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomáš Záluský 2019-09-03 12:56:02 unexpected rowlock mode when trigger is on the table
Previous Message Peter Eisentraut 2019-09-03 12:28:17 Re: pg_dump --exclude-* options documentation