Re: Trouble with hashagg spill I/O pattern and costing

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Trouble with hashagg spill I/O pattern and costing
Date: 2020-05-26 19:15:11
Message-ID: 20200526191511.d7pwz4awrvdm4p6j@development
Lists: pgsql-hackers

On Tue, May 26, 2020 at 11:40:07AM -0700, Jeff Davis wrote:
>On Tue, 2020-05-26 at 16:15 +0200, Tomas Vondra wrote:
>> I'm not familiar with logtape internals but IIRC the blocks are linked
>> by each block having a pointer to the prev/next block, which means we
>> can't prefetch more than one block ahead I think. But maybe I'm wrong,
>> or maybe fetching even just one block ahead would help ...
>
>We'd have to get creative. Keeping a directory in the LogicalTape
>structure might work, but I'm worried the memory requirements would be
>too high.
>
>One idea is to add a "prefetch block" to the TapeBlockTrailer (perhaps
>only in the forward direction?). We could modify the prealloc list so
>that we always know the next K blocks that will be allocated to the
>tape. All for v14, of course, but I'd be happy to hack together a
>prototype to collect data.
>

Yeah. I agree prefetching is definitely out of v13 scope. It would be
interesting to see how useful it is, if you're willing to spend some
time on a prototype.
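
To make the prefetch idea a bit more concrete, here is a rough sketch
of a trailer that carries a block number K steps ahead in addition to
the next pointer. Everything in it is hypothetical: SketchTrailer and
the sketch_* functions are made-up names, this is not the real
TapeBlockTrailer layout, and it assumes the writer already knows the
next K block numbers of the tape (e.g. from the prealloc list).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NO_BLOCK (-1L)

/* Hypothetical trailer; NOT the real TapeBlockTrailer layout. */
typedef struct SketchTrailer
{
    long next;      /* next block of this tape, or SKETCH_NO_BLOCK */
    long prefetch;  /* assumed field: block k steps ahead, or SKETCH_NO_BLOCK */
} SketchTrailer;

/*
 * Fill in the trailers of one tape whose blocks live at file positions
 * blocknums[0..n-1]. The writer is assumed to know the block numbers
 * k steps ahead when it writes each block.
 */
static void
sketch_link_tape(SketchTrailer *file, const long *blocknums, size_t n, size_t k)
{
    assert(file != NULL && blocknums != NULL);

    for (size_t i = 0; i < n; i++)
    {
        SketchTrailer *t = &file[blocknums[i]];

        t->next = (i + 1 < n) ? blocknums[i + 1] : SKETCH_NO_BLOCK;
        t->prefetch = (k > 0 && i + k < n) ? blocknums[i + k] : SKETCH_NO_BLOCK;
    }
}

/*
 * Walk the tape starting at block 'first' and record, in hints[], the
 * blocks a reader could ask the OS to prefetch while reading each block
 * (a real reader might call posix_fadvise() at this point instead).
 * Returns the number of hints recorded.
 */
static size_t
sketch_collect_hints(const SketchTrailer *file, long first,
                     long *hints, size_t max_hints)
{
    size_t nhints = 0;

    for (long cur = first; cur != SKETCH_NO_BLOCK; cur = file[cur].next)
    {
        if (file[cur].prefetch != SKETCH_NO_BLOCK && nhints < max_hints)
            hints[nhints++] = file[cur].prefetch;
    }
    return nhints;
}
```

With only the next pointer, the reader learns the address of a block
after reading the one before it, so it can be at most one block ahead.
With the extra field, reading block i already reveals block i+K. Note
that in this sketch the first K-1 blocks after the head of the tape
never get a hint.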

>
>Do you have any other thoughts on the current prealloc patch for v13,
>or is it about ready for commit?
>

I think it's pretty much ready to go.

I have some doubts about the maximum value (128 probably means
read-ahead values above 256 are pointless, although I have not tested
that). But it's still a huge improvement with 128, so let's get that
committed.

I've been thinking about actually computing the expected number of
blocks per tape, and tying the maximum to that, somehow. But that's
something we can look at in the future.
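
Just to illustrate what I mean, a minimal sketch of such a cap might
look like this. The function name, the inputs (an estimate of the total
number of spilled blocks and the number of tapes) and the lower bound
of 8 are assumptions made up for the example; only the 128 comes from
the patch discussed here.

```c
#include <assert.h>

#define SKETCH_PREALLOC_MIN 8    /* assumed lower bound, for illustration */
#define SKETCH_PREALLOC_MAX 128  /* the maximum discussed in this thread */

/*
 * Hypothetical helper: derive a per-tape prealloc cap from the expected
 * number of blocks per tape, clamped to [MIN, MAX].
 */
static long
sketch_prealloc_cap(long expected_total_blocks, long ntapes)
{
    long per_tape;

    assert(expected_total_blocks >= 0);

    if (ntapes <= 0)
        return SKETCH_PREALLOC_MIN;

    /* expected blocks per tape, rounded up */
    per_tape = (expected_total_blocks + ntapes - 1) / ntapes;

    if (per_tape < SKETCH_PREALLOC_MIN)
        return SKETCH_PREALLOC_MIN;
    if (per_tape > SKETCH_PREALLOC_MAX)
        return SKETCH_PREALLOC_MAX;
    return per_tape;
}
```

The point is only that a tape expected to hold a handful of blocks
would not reserve 128 of them up front, while large tapes would still
hit the existing maximum.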

As for the tlist fix, I think that's mostly ready too - the one thing we
should probably do is apply it only for AGG_HASHED. For AGG_SORTED it's
not really necessary.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
