Re: WIP: WAL prefetch (another approach)

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: WIP: WAL prefetch (another approach)
Date:	2020-12-24 03:06:38
Message-ID:	CA+hUKGKFeYPL9K+SRixcsx1+6HsHhqK+POZyrnnZjw1jERpGcQ@mail.gmail.com
Lists:	pgsql-hackers

On Sat, Dec 12, 2020 at 1:24 AM Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com> wrote:
> I wanted to contribute my findings - after dozens of various lengthy runs here - so far with WAL (asynchronous) recovery performance in the hot-standby case. TL;DR; this patch is awesome even on NVMe

Thanks Jakub! Some interesting, and nice, results.

> The startup/recovering process gets into 95% CPU utilization territory with ~300k (?) hash_search_with_hash_value_memcmpopt() executions per second (measured using perf-probe).

I suppose it's possible that this is caused by memory stalls that
could be improved by teaching the prefetching pipeline to prefetch the
relevant cachelines of memory (but it seems like it should be a pretty
microscopic concern compared to the I/O).
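
For illustration only (this is not code from the attached patches): a minimal sketch of what "prefetching the relevant cachelines" could look like for a batch of hash lookups. The table layout and all names (PfEntry, PfTable, demo_hash, demo_lookup_batch) are hypothetical, and __builtin_prefetch is a GCC/Clang builtin rather than anything PostgreSQL-specific. A first pass issues prefetch hints for the bucket slots, and a second pass does the actual lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NBUCKETS 1024      /* hypothetical size, must be a power of two */

/* Hypothetical chained hash table, not PostgreSQL's dynahash. */
typedef struct PfEntry
{
    uint32_t        key;
    int             value;
    struct PfEntry *next;
} PfEntry;

typedef struct PfTable
{
    PfEntry *buckets[DEMO_NBUCKETS];
} PfTable;

/* Simple integer mixer, used only to spread keys over buckets. */
static uint32_t
demo_hash(uint32_t k)
{
    k ^= k >> 16;
    k *= 0x85ebca6bU;
    k ^= k >> 13;
    k *= 0xc2b2ae35U;
    k ^= k >> 16;
    return k;
}

static void
demo_insert(PfTable *t, PfEntry *e)
{
    uint32_t b = demo_hash(e->key) & (DEMO_NBUCKETS - 1);

    e->next = t->buckets[b];
    t->buckets[b] = e;
}

static const PfEntry *
demo_lookup(const PfTable *t, uint32_t key)
{
    const PfEntry *e = t->buckets[demo_hash(key) & (DEMO_NBUCKETS - 1)];

    while (e != NULL && e->key != key)
        e = e->next;
    return e;
}

/*
 * Look up n keys; results[i] receives the stored value, or -1 if absent.
 * Pass 1 only hints the CPU to load each bucket slot (read access, high
 * temporal locality).  It does not cover the entry the slot points to,
 * which can still miss in pass 2.
 */
static void
demo_lookup_batch(const PfTable *t, const uint32_t *keys, int *results, size_t n)
{
    assert(n == 0 || (keys != NULL && results != NULL));

    for (size_t i = 0; i < n; i++)
        __builtin_prefetch(&t->buckets[demo_hash(keys[i]) & (DEMO_NBUCKETS - 1)], 0, 3);

    for (size_t i = 0; i < n; i++)
    {
        const PfEntry *e = demo_lookup(t, keys[i]);

        results[i] = (e != NULL) ? e->value : -1;
    }
}
```

Whether a hint like this helps at all depends on how far ahead the keys are known and on the table's working set size; the sketch makes no performance claim.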

> [3] - hash_search_with_hash_value() spends a lot of time near "callq *%r14" in tight loop assembly in my case (indirect call to the hash comparison function). This hash_search_with_hash_value_memcmpopt() is just a copycat function that instead calls memcmp() directly where it matters (smgr.c, buf_table.c). A blind shot at gcc's -flto also didn't help much there (I was thinking it could optimize it by building many instances of hash_search_with_hash_value, one per match() use, but no). I did not quantify the benefit; I think it was just a failed optimization experiment, as it is still top#1 in my profiles, and it could even be noise.

Nice. A related specialisation is by size (key and object). Of course,
simplehash.h already does that, but it also makes some other choices
that make it unusable for the buffer mapping table. So I think that
we should either figure out how to fix that, or consider specialising
the dynahash lookup path with a similar template scheme.
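
As a rough illustration of the "template scheme" idea (a sketch under stated assumptions, not the dynahash or simplehash.h code): the generic path below learns the key size and comparator only at run time and compares through a function pointer, while a macro stamps out a lookup whose entry type, key size and memcmp() call are all visible to the compiler. The names generic_lookup, DEFINE_LOOKUP, DemoTag and DemoEntry are hypothetical, and a linear scan stands in for real hash probing to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Comparator with memcmp()'s signature; returns 0 on a match. */
typedef int (*match_fn) (const void *a, const void *b, size_t keysize);

/*
 * Generic path: entry size, key size and comparator are run-time values,
 * so every comparison is an indirect call.  The key is assumed to be the
 * first member of each entry.
 */
static const void *
generic_lookup(const void *entries, size_t nentries, size_t entrysize,
               size_t keysize, match_fn match, const void *key)
{
    const char *p = entries;

    assert(entrysize >= keysize);
    for (size_t i = 0; i < nentries; i++)
    {
        if (match(p + i * entrysize, key, keysize) == 0)
            return p + i * entrysize;
    }
    return NULL;
}

/*
 * Template path: the macro generates one function per (entry, key) type
 * pair.  sizeof(key_type) is a compile-time constant, so the compiler is
 * free to inline the fixed-size comparison.
 */
#define DEFINE_LOOKUP(name, entry_type, key_type) \
static const entry_type * \
name(const entry_type *entries, size_t nentries, const key_type *key) \
{ \
    for (size_t i = 0; i < nentries; i++) \
    { \
        if (memcmp(&entries[i].key, key, sizeof(key_type)) == 0) \
            return &entries[i]; \
    } \
    return NULL; \
}

/* Hypothetical 8-byte key with no padding, loosely shaped like a buffer tag. */
typedef struct DemoTag
{
    uint32_t rel;
    uint32_t block;
} DemoTag;

typedef struct DemoEntry
{
    DemoTag key;                /* must be first for generic_lookup() */
    int     buf_id;
} DemoEntry;

/* Instantiate the specialised lookup for DemoEntry/DemoTag. */
DEFINE_LOOKUP(demo_tag_lookup, DemoEntry, DemoTag)
```

Both paths return the same entry for the same key, e.g. demo_tag_lookup(entries, n, &tag) versus generic_lookup(entries, n, sizeof(DemoEntry), sizeof(DemoTag), memcmp, &tag); only the amount of information available to the compiler differs.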

Rebase attached.

Attachment	Content-Type	Size
v15-0001-Add-pg_atomic_unlocked_add_fetch_XXX.patch	text/x-patch	3.4 KB
v15-0002-Improve-information-about-received-WAL.patch	text/x-patch	7.8 KB
v15-0003-Provide-XLogReadAhead-to-decode-future-WAL-recor.patch	text/x-patch	60.0 KB
v15-0004-Prefetch-referenced-blocks-during-recovery.patch	text/x-patch	64.1 KB
v15-0005-WIP-Avoid-extra-buffer-lookup-when-prefetching-W.patch	text/x-patch	10.7 KB

In response to

RE: WIP: WAL prefetch (another approach) at 2020-12-11 12:24:29 from Jakub Wartak

Responses

Re: WIP: WAL prefetch (another approach) at 2020-12-30 03:57:36 from Andres Freund
Re: WIP: WAL prefetch (another approach) at 2021-02-04 00:40:26 from Tomas Vondra
Re: WIP: WAL prefetch (another approach) at 2021-02-10 21:50:33 from Stephen Frost