Re: automatic analyze: readahead - add "IO read time" log message

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: automatic analyze: readahead - add "IO read time" log message
Date: 2021-03-12 00:11:56
Message-ID: 20210312001156.GT20766@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> On 3/8/21 8:42 PM, Stephen Frost wrote:
> > * Tomas Vondra (tomas(dot)vondra(at)enterprisedb(dot)com) wrote:
> >> On 2/10/21 11:10 PM, Stephen Frost wrote:
> >>> * Heikki Linnakangas (hlinnaka(at)iki(dot)fi) wrote:
> >>>> On 05/02/2021 23:22, Stephen Frost wrote:
> >>>>> Unless there's anything else on this, I'll commit these sometime next
> >>>>> week.
> >>>>
> >>>> One more thing: Instead of using 'effective_io_concurrency' GUC directly,
> >>>> should call get_tablespace_maintenance_io_concurrency().
> >>>
> >>> Ah, yeah, of course.
> >>>
> >>> Updated patch attached.
> >>
> >> A couple minor comments:
> >>
> >> 1) I think the patch should be split into two parts, one adding the
> >> track_io_timing, one adding the prefetching.
> >
> > This was already done..
>
> Not sure what you mean by "done"? I see the patch still does both
> changes related to track_io_timing and prefetching.

They are two patches..

➜ ~ grep Subject analyze_prefetch_v8.patch
Subject: [PATCH 1/2] Improve logging of auto-vacuum and auto-analyze
Subject: [PATCH 2/2] Use pre-fetching for ANALYZE

The first doesn't have any prefetch-related things, the second doesn't
have any track_io_timing things..

> >> 3) Is there a way to reduce the amount of #ifdef in acquire_sample_rows?
> >
> > Perhaps..
> >
> >> This makes the code rather hard to read, IMHO. It seems to me we can
> >> move the code around a bit and merge some of the #ifdef blocks - see the
> >> attached patch. Most of this is fairly trivial, with the exception of
> >> moving PrefetchBuffer before table_scan_analyze_next_block - AFAIK this
> >> does not materially change the behavior, but perhaps I'm wrong.
> >
> > but I don't particularly like doing the prefetch right before we run
> > vacuum_delay_point() and potentially sleep.
>
> Why? Is that just a matter of personal preference (fair enough) or is
> there a reason why that would be wrong/harmful?

Telling the kernel "hey, we're about to need this, please go get it" and
then immediately going to sleep just strikes me as a bit off. We should
be trying to minimize the time between prefetch and actual request for
the page. Of course, there's going to be some times that we will issue
a prefetch and then come around again and end up hitting the limit and
sleeping before we actually request the page and maybe it doesn't
ultimately matter much but it just seems better to sleep first before
issuing the prefetch to minimize the amount 'outstanding' when we do end
up sleeping. One thing about prefetching is that the kernel is
certainly within its rights to decide to drop the page before we
actually go to read it, if it's under pressure and we're just sleeping.

> I think e.g. prefetch_targblock could be moved to the next #ifdef, which
> will eliminate the one-line ifdef.

Sure, done in the attached.

Thanks for the review! Unless there's other comments, I'll plan to push
this over the weekend or early next week.

Stephen

Attachment Content-Type Size
analyze_prefetch_v9.patch text/x-diff 13.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-03-12 00:23:12 Re: shared-memory based stats collector
Previous Message Thomas Munro 2021-03-11 23:27:45 Re: [Patch] Optimize dropping of relation buffers using dlist