Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-02-03 03:26:59
Message-ID: CAH2-WznUFEW8sWEYVD40vFRF4Px6FEZSf7iSJuQh5NY_8AN6Yg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 2, 2018 at 4:31 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Fri, Feb 2, 2018 at 1:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Not saying you're wrong, but you should include a comment on why this is
>> a benign warning. Presumably it's some padding memory somewhere, but
>> it's not obvious from the above bleat.
>
> Sure. This looks slightly more complicated than first anticipated, but
> I'll keep everyone posted.

I couldn't make up my mind if it was best to prevent the uninitialized
write(), or to instead just add a suppression. I eventually decided
upon the suppression -- see attached patch. My proposed commit message
has a full explanation of the Valgrind issue, which I won't repeat
here. Go read it before reading the rest of this e-mail.

It might seem like my suppression is overly broad, or not broad
enough, since it essentially targets LogicalTapeFreeze(). I don't
think it is, though, because this can occur in two places within
LogicalTapeFreeze() -- it can occur in the place we actually saw the
issue on lousyjack, from the ltsReadBlock() call within
LogicalTapeFreeze(), as well as a second place -- when
BufFileExportShared() is called. I found that you have to tweak code
to prevent it happening in the first place before you'll see it happen
in the second place. I see no point in actually playing whack-a-mole
for a totally benign issue like this, though, which made me finally
decide upon the suppression approach.

Bear in mind that a third way of fixing this would be to allocate
logtape.c buffers using palloc0() rather than palloc() (though I don't
like that idea at all). For serial external sorts, the logtape.c
buffers are guaranteed to have been written to/initialized at least
once as part of spilling a sort to disk. Parallel external sorts don't
quite guarantee that, which is why we run into this Valgrind issue.

--
Peter Geoghegan

Attachment Content-Type Size
0001-Add-logtape.c-Valgrind-suppression.patch text/x-patch 1.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-02-03 03:46:07 Re: pie-in-sky idea: 'sensitive' function parameters
Previous Message Jeff Davis 2018-02-03 02:21:12 Re: JIT compiling with LLVM v9.1