Re: WAL prefetch

From:	Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <seanc(at)joyent(dot)com>
Subject:	Re: WAL prefetch
Date:	2018-06-27 09:44:25
Message-ID:	19b3c454-ca4c-bda5-6521-2f893f4451a9@postgrespro.ru
Lists:	pgsql-hackers

On 22.06.2018 11:35, Konstantin Knizhnik wrote:
>
>
> On 21.06.2018 19:57, Tomas Vondra wrote:
>>
>>
>> On 06/21/2018 04:01 PM, Konstantin Knizhnik wrote:
>>> I continue my experiments with WAL prefetch.
>>> I have embedded prefetch in Postgres: now walprefetcher is started
>>> together with startup process and is able to help it to speedup
>>> recovery.
>>> The patch is attached.
>>>
>>> Unfortunately result is negative (at least at my desktop: SSD, 16Gb
>>> RAM). Recovery with prefetch is 3 times slower than without it.
>>> What I am doing:
>>>
>>> Configuration:
>>>      max_wal_size=min_wal_size=10Gb,
>>>      shared_buffers = 1Gb
>>> Database:
>>>       pgbench -i -s 1000
>>> Test:
>>>       pgbench -c 10 -M prepared -N -T 100 -P 1
>>>       pkill postgres
>>>       echo 3 > /proc/sys/vm/drop_caches
>>>       time pg_ctl -t 1000 -D pgsql -l logfile start
>>>
>>> Without prefetch it is 19 seconds (recovered about 4Gb of WAL), with
>>> prefetch it is about one minute. About 400k blocks are prefetched.
>>> CPU usage is small (<20%), and both processes are in "Ds" state.
>>>
>>
>> Based on a quick test, my guess is that the patch is broken in
>> several ways. Firstly, with the patch attached (and
>> wal_prefetch_enabled=on, which I think is needed to enable the
>> prefetch) I can't even restart the server, because pg_ctl restart
>> just hangs (the walprefetcher process gets stuck in WaitForWAL, IIRC).
>>
>> I have added an elog(LOG,...) to walprefetcher.c, right before the
>> FilePrefetch call, and (a) I don't see any actual prefetch calls
>> during recovery but (b) I do see the prefetch happening during the
>> pgbench. That seems a bit ... wrong?
>>
>> Furthermore, you've added an extra
>>
>>     signal_child(BgWriterPID, SIGHUP);
>>
>> to SIGHUP_handler, which seems like a bug too. I don't have time to
>> investigate/debug this further.
>>
>> regards
>
> Sorry, an updated version of the patch is attached.
> Please also note that you can check the number of prefetched pages using
> pg_stat_activity() - it is reported for the walprefetcher process.
> Concerning the fact that you do not see prefetches at recovery time:
> please check that min_wal_size and max_wal_size are large enough and
> that pgbench (or whatever else)
> committed large enough changes so that recovery will take some time.
>
>

I have improved my WAL prefetch patch. The main reason recovery was
slower with prefetch enabled was that the prefetcher did not take into
account initialized pages (XLOG_HEAP_INIT_PAGE)
and did not remember (cache) full-page writes.
The main differences in the new version of the patch:

1. Use effective_cache_size as size of cache of prefetched blocks
2. Do not prefetch blocks already present in shared buffers
3. Do not prefetch blocks for RM_HEAP_ID with XLOG_HEAP_INIT_PAGE bit set
4. Remember new/FPW pages in the prefetch cache, to avoid prefetching them
for subsequent WAL records.
5. Add min/max prefetch lead parameters to make it possible to
synchronize the speed of prefetch with the speed of replay.
6. Increase size of open file cache to avoid redundant open/close
operations.
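
The filtering rules in items 1-5 can be illustrated with a rough sketch.
It is written in Python rather than the patch's C, and it is not taken
from walprefetch-3.patch: the names PrefetchFilter, should_prefetch,
in_shared_buffers and lead_action are hypothetical, the cache is a plain
LRU sized in blocks as a stand-in for effective_cache_size, and the
min/max lead handling shows only one plausible reading of item 5.

```python
from collections import OrderedDict


class PrefetchFilter:
    """Decide whether a block referenced by a WAL record is worth prefetching.

    Illustrative only: names and structure are assumptions, not the patch's code.
    """

    def __init__(self, cache_size, in_shared_buffers):
        self.cache_size = cache_size                # stand-in for effective_cache_size, in blocks
        self.in_shared_buffers = in_shared_buffers  # callable: block tag -> bool
        self.seen = OrderedDict()                   # LRU of block tags assumed to be cached

    def _remember(self, tag):
        self.seen[tag] = True
        self.seen.move_to_end(tag)
        if len(self.seen) > self.cache_size:
            self.seen.popitem(last=False)           # evict the least recently used tag

    def should_prefetch(self, tag, init_page=False, has_fpw=False):
        if init_page or has_fpw:
            # The page is rebuilt from the WAL record itself, so no read is
            # needed now, and later records for it need no prefetch either.
            self._remember(tag)
            return False
        if tag in self.seen:
            self.seen.move_to_end(tag)              # already prefetched or rebuilt
            return False
        if self.in_shared_buffers(tag):
            return False                            # already in shared buffers
        self._remember(tag)
        return True


def lead_action(prefetch_pos, replay_pos, min_lead, max_lead):
    """Assumed reading of item 5: keep the prefetcher between min and max lead."""
    lead = prefetch_pos - replay_pos
    if lead > max_lead:
        return "wait"      # too far ahead of replay: pause prefetching
    if lead < min_lead:
        return "skip"      # replay is about to catch up: prefetch would be too late
    return "prefetch"
```

For example, with a filter created as PrefetchFilter(2, lambda tag: False),
the first call to should_prefetch(("rel", 1)) returns True and a second call
with the same tag returns False, because the block is already remembered.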

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment	Content-Type	Size
walprefetch-3.patch	text/x-patch	36.1 KB

In response to

Re: WAL prefetch at 2018-06-22 08:35:31 from Konstantin Knizhnik

Responses

Re: WAL prefetch at 2018-06-27 15:25:05 from Tomas Vondra
