RE: [Patch] Optimize dropping of relation buffers using dlist

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-10-01 01:55:10
Message-ID: OSBPR01MB23414BB82BD0B343EEDAADBBEF300@OSBPR01MB2341.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

I revised the patch again. Attached is V19.
The previous version's algorithm failed to enter the optimization loop,
so I corrected that and also removed the extra function I had added
in the earlier versions.

The revised patch goes something like this:

for (forks of rel)
{
    if (smgrcachednblocks() == InvalidBlockNumber)
        break;    // size not cached; go to full scan
    if (nBlocksToInvalidate < buf_full_scan_threshold)
        for (blocks of the fork)
            <look up and invalidate each buffer>
    else
        break;    // too many blocks; go to full scan
}
<execute full scan (only if we broke out of the loop above)>
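For reference, here is a rough C sketch of how that per-block path could look
inside DropRelFileNodeBuffers(). This is only my illustration of the idea, not
the patch itself: smgrcachednblocks() and buf_full_scan_threshold are the names
used above, and I assume the relation's SMgrRelation (reln) plus the usual
rnode/forkNum/nforks/firstDelBlock arguments are available; the lookups use the
existing bufmgr.c routines.

/* Illustrative sketch only -- not the exact patch code. */
for (i = 0; i < nforks; i++)
{
    BlockNumber nblocks = smgrcachednblocks(reln, forkNum[i]);

    if (nblocks == InvalidBlockNumber ||
        (nblocks - firstDelBlock[i]) >= buf_full_scan_threshold)
        goto full_scan;             /* size unknown or too many blocks */

    for (blkno = firstDelBlock[i]; blkno < nblocks; blkno++)
    {
        BufferTag   tag;
        uint32      hash;
        LWLock     *partitionLock;
        int         buf_id;
        BufferDesc *bufHdr;
        uint32      buf_state;

        /* Look up this block in the buffer mapping table. */
        INIT_BUFFERTAG(tag, rnode.node, forkNum[i], blkno);
        hash = BufTableHashCode(&tag);
        partitionLock = BufMappingPartitionLock(hash);

        LWLockAcquire(partitionLock, LW_SHARED);
        buf_id = BufTableLookup(&tag, hash);
        LWLockRelease(partitionLock);

        if (buf_id < 0)
            continue;               /* block not in shared buffers */

        /* Recheck the tag under the buffer header spinlock before invalidating. */
        bufHdr = GetBufferDescriptor(buf_id);
        buf_state = LockBufHdr(bufHdr);
        if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
            bufHdr->tag.forkNum == forkNum[i] &&
            bufHdr->tag.blockNum >= firstDelBlock[i])
            InvalidateBuffer(bufHdr);   /* releases the spinlock */
        else
            UnlockBufHdr(bufHdr, buf_state);
    }
}
return;

full_scan:
/* fall back to scanning all of shared_buffers, as on master */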

Recovery performance measurement results are below.
However, it seems there is still some overhead even with large shared buffers.

| s_b   | master | patched | %reg  |
|-------|--------|---------|-------|
| 128MB | 36.052 | 39.451  | 8.62% |
| 1GB   | 21.731 | 21.73   | 0.00% |
| 20GB  | 24.534 | 25.137  | 2.40% |
| 100GB | 30.54  | 31.541  | 3.17% |
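(Here %reg is computed against the patched time, e.g. for 128MB:
(39.451 - 36.052) / 39.451 ≈ 8.62%.)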

I'll investigate further, but if you have any feedback or advice in the meantime, I'd appreciate it.

Machine specs used for testing:
RHEL7, 8 core, 256 GB RAM, xfs

Configuration:
wal_level = replica
autovacuum = off
full_page_writes = off

# For streaming replication from primary.
synchronous_commit = remote_write
synchronous_standby_names = ''

# For Standby.
#hot_standby = on
#primary_conninfo

shared_buffers = 128MB    # also tested with 1GB, 20GB, 100GB

In case it helps with understanding, I also attached the recovery log
018_wal_optimize_node_replica.log, which includes some ereport output
showing whether we entered the optimization loop or did a full scan.
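For illustration, that instrumentation can be as simple as an ereport call
along these lines (this is only a sketch; the exact messages are in the
attached log):

/* hypothetical instrumentation, for illustration only */
ereport(LOG,
        (errmsg("DropRelFileNodeBuffers: optimization loop used for rel %u/%u/%u",
                rnode.node.spcNode, rnode.node.dbNode, rnode.node.relNode)));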

Regards,
Kirk Jamison

Attachment Content-Type Size
v19-Optimize-DropRelFileNodeBuffers-during-recovery.patch application/octet-stream 8.3 KB
v1-Prevent-invalidating-blocks-in-smgrextend-during-recovery.patch application/octet-stream 1.1 KB
018_wal_optimize_node_replica.log application/octet-stream 156.4 KB
