Re: automatic analyze: readahead - add "IO read time" log message

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: automatic analyze: readahead - add "IO read time" log message
Date: 2021-01-26 10:23:29
Message-ID: b154ffb5-82b0-3d0b-631a-d7223d1fc6fd@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13/01/2021 23:17, Stephen Frost wrote:
> Would be great to get a review / comments from others as to if there's
> any concerns. I'll admit that it seems reasonably straight-forward to
> me, but hey, I wrote most of it, so that's not really a fair
> assessment... ;)

Look good overall. A few minor comments:

The patch consists of two part: add stats to the log for auto-analyze,
and implement prefetching. They seem like independent features, consider
splitting into two patches.

It's a bit weird that you get more stats in the log for
autovacuum/autoanalyze than you get with VACUUM/ANALYZE VERBOSE. Not
really this patch's fault though.

This conflicts with the patch at
https://commitfest.postgresql.org/31/2907/, to refactor the table AM
analyze API. That's OK, it's straightforward to resolve regardless of
which patch is committed first.

> /* Outer loop over blocks to sample */
> while (BlockSampler_HasMore(&bs))
> {
> #ifdef USE_PREFETCH
> BlockNumber prefetch_targblock = InvalidBlockNumber;
> #endif
> BlockNumber targblock = BlockSampler_Next(&bs);
>
> #ifdef USE_PREFETCH
>
> /*
> * Make sure that every time the main BlockSampler is moved forward
> * that our prefetch BlockSampler also gets moved forward, so that we
> * always stay out ahead.
> */
> if (BlockSampler_HasMore(&prefetch_bs))
> prefetch_targblock = BlockSampler_Next(&prefetch_bs);
> #endif
>
> vacuum_delay_point();
>
> if (!table_scan_analyze_next_block(scan, targblock, vac_strategy))
> continue;
>
> #ifdef USE_PREFETCH
>
> /*
> * When pre-fetching, after we get a block, tell the kernel about the
> * next one we will want, if there's any left.
> */
> if (effective_io_concurrency && prefetch_targblock != InvalidBlockNumber)
> PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock);
> #endif
>
> while (table_scan_analyze_next_tuple(scan, OldestXmin, &liverows, &deadrows, slot))
> {
> ...
> }
>
> pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
> ++blksdone);
> }

If effective_io_concurrency == 0, this calls
BlockSampler_Next(&prefetch_bs) anyway, which is a waste of cycles.

If table_scan_analyze_next_block() returns false, we skip the
PrefetchBuffer() call. That seem wrong.

Is there any potential harm from calling PrefetchBuffer() on a page that
table_scan_analyze_next_block() later deems as unsuitable for smapling
and returns false? That's theoretical at the moment, because
heapam_scan_analyze_next_block() always returns true. (The tableam
ANALYZE API refactor patch will make this moot, as it moves this logic
into the tableam's implementation, so the implementation can do whatever
make sense for the particular AM.)

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hou, Zhijie 2021-01-26 10:29:49 RE: Parallel INSERT (INTO ... SELECT ...)
Previous Message Laurenz Albe 2021-01-26 10:05:53 Re: Error on failed COMMIT