Re: logtape.c stats don't account for unused "prefetched" block numbers

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logtape.c stats don't account for unused "prefetched" block numbers
Date: 2020-09-14 22:54:48
Message-ID: CAH2-WzkCsnWuQwR3Ou8f0cWC-0O9tWWNjT1hiHwj2GS1OYczJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 14, 2020 at 3:20 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> In the comment in the patch, you say:
>
> "In practice this probably doesn't matter because we'll be called after
> the flush anyway, but be tidy."
>
> By which I assume you mean that LogicalTapeRewindForRead() will be
> called before LogicalTapeSetBlocks().

Yeah, I'm pretty sure that that's an equivalent way of expressing the
same idea. It appears that this assumption holds, though only when
we're not using preallocation (i.e. it doesn't necessarily hold for
the HashAggs-that-spill case, as I go into below).

> If that's the intention of LogicalTapeSetBlocks(), should we just make
> it a requirement that there are no open write buffers for any tapes
> when it's called? Then we could just use nBlocksWritten in both cases,
> right?

That does seem appealing. Perhaps it could be enforced by an assertion.

> (Aside: HashAgg calls it before LogicalTapeRewindForRead(). That might
> be a mistake in HashAgg where it will keep the write buffers around
> longer than necessary. If I recall correctly, it was my intention to
> rewind for reading immediately after the batch was finished, which is
> why I made the read buffer lazily-allocated.)

If I add the assertion described above and run the regression tests,
it fails within "select_distinct" (and at other points). This is the
specific code:

--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -1284,6 +1284,7 @@ LogicalTapeSetBlocks(LogicalTapeSet *lts)
* (In practice this probably doesn't matter because we'll be called after
* the flush anyway, but be tidy.)
*/
+ Assert(lts->nBlocksWritten == lts->nBlocksAllocated);
if (lts->enable_prealloc)
return lts->nBlocksWritten;

Maybe the LogicalTapeRewindForRead() inconsistency you mention could
be fixed, which would enable the simplification you suggested. What do
you think?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-09-14 22:57:32 Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)
Previous Message Peter Geoghegan 2020-09-14 22:39:08 Re: logtape.c stats don't account for unused "prefetched" block numbers