Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-02-05 21:45:23
Message-ID: CAH2-WzkXF-G-amB5wXzJorjy8GA3xAjnLuNLFObfQnEbraEgrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 5, 2018 at 1:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Feb 5, 2018 at 1:03 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> It certainly is common. In the case of logtape.c, we almost always
>> write out some garbage bytes, even with serial sorts. The only
>> difference here is the *sense* in which they're garbage: they're
>> uninitialized bytes, which Valgrind cares about, rather than byte from
>> previous writes that are left behind in the buffer, which Valgrind
>> does not care about.

I should clarify what I meant here -- it is very common when we have
to freeze a tape, like when we do a serial external randomAccess
tuplesort, or a parallel worker's tuplesort. It shouldn't happen
otherwise. Note that there is a general pattern of dumping out the
current buffer just as the next one is needed, in order to make sure
that the linked list pointer correctly points to the
next/soon-to-be-current block. Note also that the majority of routines
declared within logtape.c can only be used on frozen tapes.

I am pretty confident that I've scoped this correctly by targeting
LogicalTapeFreeze().

> So, I guess another option might be to call VALGRIND_MAKE_MEM_DEFINED
> on the buffer. "We know what we're doing, trust us!"
>
> In some ways, that seems better than inserting a suppression, because
> it only affects the memory in the buffer.

I think that that would also work, and would be simpler, but also
slightly inferior to using the proposed suppression. If there is
garbage in logtape.c buffers, we still generally don't want to do
anything important on the basis of those values. We make one exception
with the suppression, which is a pretty typical kind of exception to
make -- don't worry if we write() poisoned bytes, since those are
bound to be alignment related.

OTOH, as I've said we are generally bound to write some kind of
logtape.c garbage, which will almost certainly not be of the
uninitialized memory variety. So, while I feel that the suppression is
better, the advantage is likely microscopic.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-02-05 21:57:34 Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Pierre Ducroquet 2018-02-05 21:44:07 Re: JIT compiling with LLVM v9.1