Re: Handing off SLRU fsyncs to the checkpointer

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Handing off SLRU fsyncs to the checkpointer
Date: 2020-08-27 02:04:51
Message-ID: CA+hUKG+1aJVnSWxS2bt1D4dM4zPVPZWikoiKmvKp8rVDosCtJg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 27, 2020 at 6:15 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > --4.90%--smgropen
> > |--2.86%--ReadBufferWithoutRelcache
>
> Looking at an earlier report of this problem I was thinking whether it'd
> make sense to replace SMgrRelationHash with a simplehash table; I have a
> half-written patch for that, but I haven't completed that work.
> However, in the older profile things were looking different, as
> hash_search_with_hash_value was taking 35.25%, and smgropen was 33.74%
> of it. BufTableLookup was also there but only 1.51%. So I'm not so
> sure now that that'll pay off as clearly as I had hoped.

Right, my hypothesis requires an uncacheably large buffer mapping
table, and I think smgropen() needs a different explanation because
its hash table is not expected to be as large or as randomly accessed,
at least not with a pgbench workload. I think the reasons for
smgropen() showing up so high in a profile, and in particular higher
than BufTableLookup(), must be:

1. We call smgropen() twice for every call to BufTableLookup(). Once
in XLogReadBufferExtended(), and then again in
ReadBufferWithoutRelcache().
2. We also call it for every block forced out of the buffer pool, and
in recovery that has to be done by the recovery loop.
3. We also call it for every block in the buffer pool during the
end-of-recovery checkpoint.
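
To make point 1 concrete, here is a toy model in C of the call pattern
described above. It is only a sketch: every toy_* name is hypothetical
and merely stands in for the PostgreSQL function it is named after; the
real functions do far more than bump a counter. It assumes exactly one
hash lookup per call, which is the premise of point 1.

```c
#include <assert.h>

/* Hypothetical counters; not a PostgreSQL data structure. */
typedef struct ToyCounters
{
    unsigned long smgropen_lookups;   /* stand-in for SMgrRelationHash probes */
    unsigned long buftable_lookups;   /* stand-in for buffer mapping probes */
} ToyCounters;

/* Stand-in for smgropen(): one relation hash lookup per call. */
static void
toy_smgropen(ToyCounters *c)
{
    c->smgropen_lookups++;
}

/* Stand-in for BufTableLookup(): one buffer mapping lookup per call. */
static void
toy_buf_table_lookup(ToyCounters *c)
{
    c->buftable_lookups++;
}

/* Stand-in for ReadBufferWithoutRelcache(): opens the relation again. */
static void
toy_read_buffer_without_relcache(ToyCounters *c)
{
    toy_smgropen(c);
    toy_buf_table_lookup(c);
}

/* Stand-in for XLogReadBufferExtended(): opens the relation, then reads. */
static void
toy_xlog_read_buffer_extended(ToyCounters *c)
{
    toy_smgropen(c);
    toy_read_buffer_without_relcache(c);
}

/* Replay nblocks block references and return the lookup counts. */
static ToyCounters
toy_replay(unsigned long nblocks)
{
    ToyCounters c = {0, 0};

    for (unsigned long i = 0; i < nblocks; i++)
        toy_xlog_read_buffer_extended(&c);

    /* Two relation lookups for every buffer mapping lookup. */
    assert(c.smgropen_lookups == 2 * c.buftable_lookups);
    return c;
}
```

Calling toy_replay(1000) leaves the counters at 2000 relation lookups
and 1000 buffer mapping lookups. Points 2 and 3 would add further
smgropen() calls on top of that ratio; they are not modelled here.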

I'm not sure, but the last two might perform worse because those calls
are interleaved with pwrite() system calls (just a thought, not
investigated). In any case, I'm going to propose, in a new thread,
that we move those things out of the recovery loop.
