Re: Assorted leaks and weirdness in parallel execution

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assorted leaks and weirdness in parallel execution
Date: 2017-08-31 17:46:07
Message-ID: CA+TgmoZcK2+r+ti08AbOqR+RzGDTf0A=FFpiMNmftsXVUCdmAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 31, 2017 at 11:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I complained a couple weeks ago that nodeGatherMerge looked like it
> leaked a lot of memory when commanded to rescan. Attached are three
> proposed patches that, in combination, demonstrably result in zero
> leakage across repeated rescans.

Gosh, thanks for looking into this so deeply. I apologize for all of
the bugs. Also, my ego is taking some severe damage here.

> But it's going to crash and burn someday.

Yeah, ouch.

> (With this patch,
> there are no callers of shm_mq_get_queue(); should we remove that?)

May as well. I can't remember any more why I did shm_mq_detach() that
way; I think there was someplace where I thought that the
shm_mq_handle might not be available. Maybe I'm misremembering, or
perhaps the situation has changed as that code has evolved.

> The last patch fixes the one remaining leak I saw after applying the
> first two patches, namely that execParallel.c leaks the array palloc'd
> by ExecParallelSetupTupleQueues --- just the array storage, not any of
> the shm_mq_handles it points to. The given patch just adds a pfree
> to ExecParallelFinish, but TBH I find this pretty unsatisfactory.
> It seems like a significant modularity violation that execParallel.c
> is responsible for creating those shm_mqs but not for cleaning them up.

Yeah, the correct division of labor between execParallel.c and
nodeGather.c was not entirely clear to me, and I don't pretend that I
got that 100% right.

> (That would make it more difficult to do the early reader destruction
> that nodeGather currently does, but I am not sure we care about that.)

I think the only thing that matters here is -- if we know that we're
not going to read any more tuples from a worker that might still be
generating tuples, it's imperative that we shut it down ASAP.
Otherwise, it's just going to keep cranking them out, wasting
resources unnecessarily. I think this is different than what you're
talking about here, but just wanted to be clear.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2017-08-31 17:52:43 GnuTLS support
Previous Message Andreas Karlsson 2017-08-31 16:22:59 Re: REINDEX CONCURRENTLY 2.0