Re: [Patch] Optimize dropping of relation buffers using dlist

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-26 06:14:04
Message-ID: CAA4eK1KMhq-1Jx3ijJbeHzUhE7gFY-a4sfU9FN3XSjWqzkyfZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 25, 2020 at 1:49 PM k(dot)jamison(at)fujitsu(dot)com
<k(dot)jamison(at)fujitsu(dot)com> wrote:
>
> Hi.
>
> > I'll send performance measurement results in the next email. Thanks a lot for
> > the reviews!
>
> Below are the performance measurement results.
> I was only able to use low-spec machine:
> CPU 4v, Memory 8GB, RHEL, xfs filesystem.
>
> [Failover/Recovery Test]
> 1. (Master) Create table (ex. 10,000 tables). Insert data to tables.
> 2. (M) DELETE FROM TABLE (ex. all rows of 10,000 tables)
> 3. (Standby) To test with failover, pause the WAL replay on standby server.
> (SELECT pg_wal_replay_pause();)
> 4. (M) psql -c "\timing on" (measures total execution of SQL queries)
> 5. (M) VACUUM (whole db)
> 6. (M) After vacuum finishes, stop primary server: pg_ctl stop -w -mi
> 7. (S) Resume wal replay and promote standby.
> Because it's difficult to measure recovery time I used the attached script (resume.sh)
> that prints timestamp before and after promotion. It basically does the following
> - "SELECT pg_wal_replay_resume();" is executed and the WAL application is resumed.
> - "pg_ctl promote" to promote standby.
> - The time difference of "select pg_is_in_recovery();" from "t" to "f" is measured.
>
> [Results]
> Recovery/Failover performance (in seconds). 3 trial runs.
>
> | shared_buffers | master | patch | %reg |
> |----------------|--------|--------|---------|
> | 128MB | 32.406 | 33.785 | 4.08% |
> | 1GB | 36.188 | 32.747 | -10.51% |
> | 2GB | 41.996 | 32.88 | -27.73% |
>
> There's a bit of small regression with the default shared_buffers (128MB),
>

I feel we should try to address this. Basically, we can see the
smallest value of shared buffers above which the new algorithm is
beneficial and try to use that as threshold for doing this
optimization. I don't think it is beneficial to use this optimization
for a small value of shared_buffers.

> but as for the recovery time when we have large NBuffers, it's now at least almost constant
> so there's boosted performance. IOW, we enter the optimization most of the time
> during recovery.
>

Yeah, that is good to see. We can probably try to check with a much
larger value of shared buffers.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-09-26 06:16:15 Re: Logical replication from PG v13 and below to PG v14 (devel version) is not working.
Previous Message Amit Kapila 2020-09-26 06:09:43 Re: [Patch] Optimize dropping of relation buffers using dlist