Re: PostgreSQL crashes with SIGSEGV

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PostgreSQL crashes with SIGSEGV
Date: 2018-01-17 20:19:50
Message-ID: CAH2-WzmAKo=KoBJrTB0M7UUnM-L=Gycs_=iT8+Z2YohmxEbFWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Wed, Jan 17, 2018 at 11:23 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru> writes:
>> The new status of this patch is: Ready for Committer
>
> I don't feel particularly comfortable committing a patch that
> was clearly labeled as a rushed draft by its author.
> Peter, where do you stand on this work?

I would like to take another pass over
WIP-tuplesort-memcontext-fix.patch, to be on the safe side. I'm
currently up to my neck in parallel CREATE INDEX work, though, and
would prefer to avoid context switching for a week or two, if
possible. How time sensitive do you think this is?

I think we'll end up doing this:

* Committing the minimal modifications made (in both WIP patches) to
tuplesort_getdatum() to both v10 and master branches.
tuplesort_getdatum() must follow the example of
tuplesort_gettupleslot() on these branches, since
tuplesort_gettupleslot() already manages to get everything right in
recent releases. (There is no known tuplesort_getdatum() crash on
these versions, but this still seems necessary on general principle.)

* Committing something pretty close to
WIP-tuplesort-memcontext-fix.patch to 9.6.

* Committing another fix to 9.5. This fix will apply the same
principles as WIP-tuplesort-memcontext-fix.patch, but will be simpler
mechanically, since the whole batch memory mechanism added to
tuplesort.c for 9.6 isn't involved.

I'm not sure whether or not we should also apply this
still-to-be-written 9.5 patch to 9.4 and 9.3, since those versions
don't have grouping sets, and so cannot crash. ISTM that we should
leave them alone, since tuplesort has had this problem forever.

Perhaps you should go ahead and commit a patch with just the changes
to tuplesort_getdatum() now. This part seems low risk, and worth doing
in a single, (almost) consistent pass over the back branches. This
part is a trivial backpatch, and could be thought of as an independent
problem. (It will be nice to get v10 and master branches completely
out of the way quickly.)

> In a quick look at the patches, WIP-kludge-fix.patch seems clearly
> unacceptable for back-patching because it changes the signature and
> behavior of ExecResetTupleTable, which external code might well be using.

The signature change isn't required, and it was silly of me to add it.
But I also really dislike the general approach it takes, and mostly
posted it because I thought that it was a useful counterpoint.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2018-01-17 20:31:22 Re: PostgreSQL crashes with SIGSEGV
Previous Message Sergei Kornilov 2018-01-17 19:26:15 Re: Index-only scan returns incorrect results when using a composite GIST index with a gist_trgm_ops column.

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-01-17 20:29:11 Re: [HACKERS][PATCH] Applying PMDK to WAL operations for persistent memory
Previous Message Geoff Winkless 2018-01-17 20:18:46 Re: proposal: alternative psql commands quit and exit