Re: A potential memory leak on Merge Join when Sort node is not below Materialize node

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: A potential memory leak on Merge Join when Sort node is not below Materialize node
Date: 2022-09-28 19:47:55
Message-ID: CAApHDvqS6wC5U==k9Hd26E4EQXH3QR67-T4=Q1rQ36NGvjfVSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for investigating this and finding the guilty commit.

On Thu, 29 Sept 2022 at 07:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> After looking at that for a little while, I wonder if we shouldn't
> fix this by restricting the Datum-sort path to be used only with
> pass-by-value data types. That'd require only a minor addition
> to the new logic in ExecInitSort.

I'm also wondering if that's the best fix given the timing of this discovery.

> The alternative of inserting a pfree of the old value would complicate
> the code nontrivially, I think, and really it would necessitate a
> complete performance re-test. I'm wondering if the claimed speedup
> for pass-by-ref types wasn't fictional and based on skipping the
> required pfrees. Besides, if you think this code is hot enough that
> you don't want to add a test-and-branch per tuple (a claim I also
> doubt, BTW) then you probably don't want to add such overhead into
> the pass-by-value case where the speedup is clear.

I'm wondering if the best way to fix it if doing it that way would be
to invent tuplesort_getdatum_nocopy() which would be the same as
tuplesort_getdatum() except it wouldn't do the datumCopy for byref
types. It looks like tuplesort_gettupleslot() when copy==false just
directly stores the MinimalTuple that's in stup.tuple and shouldFree
is set to false.

Going by [1], it looks like I saw gains in test 6, which was a byref
Datum. Skipping the datumCopy() I imagine could only make the gains
slightly higher on that. That puts me a bit more on the fence about
the best fix for PG15.

I've attached a patch to restrict the optimisation to byval types in
the meantime.

David

[1] https://www.postgresql.org/message-id/CAApHDvrWV%3Dv0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA%40mail.gmail.com

Attachment Content-Type Size
datum_sort_fixes.patch text/plain 1.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-09-28 19:49:36 Re: longfin and tamandua aren't too happy but I'm not sure why
Previous Message Andres Freund 2022-09-28 19:33:17 Re: meson vs mingw, plpython, readline and other fun