Re: automatic analyze: readahead - add "IO read time" log message

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: automatic analyze: readahead - add "IO read time" log message
Date: 2020-11-09 18:19:59
Message-ID: ef69691a-8ba4-18af-6d0f-adbacb43967a@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/9/20 7:06 PM, Stephen Frost wrote:
> Greetings,
>
> * Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
>> On 11/4/20 5:02 PM, Stephen Frost wrote:
>>> * Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
>>>>> If you highlight "738754560" in the output it appears to duplicate the
>>>>> syscalls issued until it preads() - in case of "738754560" offset it was
>>>>> asked for 3 times. Also I wouldn't imagine in wildest dreams that
>>>>> posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall.
>>>>
>>>> IMHO that'a a bug in the patch, which always tries to prefetch all
>>>> "future" blocks, including those that were already prefetched. It
>>>> probably needs to do something like bitmap heap scan where we track
>>>> what was already prefetched and only issue the new blocks.
>>>
>>> Updated patch attached which:
>>>
>>> - Starts out by pre-fetching the first effective_io_concurrency number
>>> of blocks we are going to want, hopefully making it so the kernel will
>>> trust our fadvise's over its own read-ahead, right from the start.
>>> - Makes sure the prefetch iterator is pushed forward whenever the
>>> regular interator is moved forward.
>>> - After each page read, issues a prefetch, similar to BitmapHeapScan, to
>>> hopefully avoiding having the prefetching get in the way of the
>>> regular i/o.
>>> - Added some comments, ran pgindent, added a commit message.
>>
>> Nice, that was quick ;-)
>
> :)
>
>>> I do think we should also include patch that Jakub wrote previously
>>> which adds information about the read rate of ANALYZE.
>>
>> +1
>
> Attached is an updated patch which updates the documentation and
> integrates Jakub's initial work on improving the logging around
> auto-analyze (and I made the logging in auto-vacuum more-or-less match
> it).
>

Thanks. I'll do some testing/benchmarking once my machines are free, in
a couple days perhaps. But as I said before, I don't expect this to
behave very differently from other places that already do prefetching.

>>> I'll look at integrating that into this patch and then look at a new
>>> patch to do something similar for VACUUM in a bit.
>>
>> +1
>
> I spent some time looking into this but it's a bit complicated.. For
> some sound reasons, VACUUM will avoid skipping through a table when
> there's only a few pages that it could skip (not skipping allows us to
> move forward the relfrozenxid). That said, perhaps we could start doing
> prefetching once we've decided that we're skipping. We'd need to think
> about if we have to worry about the VM changing between the pre-fetching
> and the time when we're actually going to ask for the page.. I don't
> *think* that's an issue because only VACUUM would be changing the pages
> to be all-frozen or all-visible, and so if we see a page that isn't one
> of those then we're going to want to visit that page and that's not
> going to change, and we probably don't need to care about a page that
> used to be all-frozen and now isn't during this run- but if the prefetch
> went ahead and got page 10, and now page 8 is not-all-frozen and the
> actual scan is at page 5, then maybe it wants page 8 next and that isn't
> what we pre-fetched...
>
> Anyhow, all-in-all, definitely more complicated and probably best
> considered and discussed independently>

+1

FWIW I wonder if this should be tracked separately in the CF app, as
it's very different from the original "add some logging" patch, which
makes the CF entry rather misleading.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2020-11-09 18:20:40 Re: Disable WAL logging to speed up data loading
Previous Message Stephen Frost 2020-11-09 18:06:44 Re: automatic analyze: readahead - add "IO read time" log message