Re: Parallel Hash take II

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: Parallel Hash take II
Date: 2017-11-07 21:32:41
Message-ID: CA+TgmoZ3yB1aE3yxRhJb5zp__7Zpjq5EZU8SQ+Kg7ESWeNsN8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 7, 2017 at 4:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> + ResourceOwnerEnlargeFiles(CurrentResourceOwner);
> + ResourceOwnerRememberFile(CurrentResourceOwner, file);
> + VfdCache[file].resowner = CurrentResourceOwner;
>
> So maybe I'm being pedantic here, but wouldn't the right order be to do
> ResourceOwnerEnlargeFiles() *before* creating the file? It's a memory
> allocating operation, so it can fail, which'd leak the file.

That's not pedantic ... that's a very sound criticism. IMHO, anyway.

> diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
> index 4c35ccf65eb..8b91d5a6ebe 100644
> --- a/src/backend/utils/resowner/resowner.c
> +++ b/src/backend/utils/resowner/resowner.c
> @@ -528,16 +528,6 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
> PrintRelCacheLeakWarning(res);
> RelationClose(res);
> }
> -
> - /* Ditto for dynamic shared memory segments */
> - while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
> - {
> - dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
> -
> - if (isCommit)
> - PrintDSMLeakWarning(res);
> - dsm_detach(res);
> - }
> }
> else if (phase == RESOURCE_RELEASE_LOCKS)
> {
> @@ -654,6 +644,16 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
> PrintFileLeakWarning(res);
> FileClose(res);
> }
> +
> + /* Ditto for dynamic shared memory segments */
> + while (ResourceArrayGetAny(&(owner->dsmarr), &foundres))
> + {
> + dsm_segment *res = (dsm_segment *) DatumGetPointer(foundres);
> +
> + if (isCommit)
> + PrintDSMLeakWarning(res);
> + dsm_detach(res);
> + }
> }
>
> Is that entirely unproblematic? Are there any DSM callbacks that rely on
> locks still being held? Please split this part into a separate commit
> with such analysis.

FWIW, I think this change is a really good idea (I recommended it to
Thomas at some stage, I think). The current positioning was decided
by me at a very early stage of parallel query development where I
reasoned as follows (1) the first thing we're going to implement is
going to be parallel quicksort, (2) that's going to allocate a huge
amount of DSM, (3) therefore we should try to free it as early as
possible. However, I now thing that was wrongheaded, and not just
because parallel quicksort didn't turn out to be the first thing we
developed. Memory is the very last resource we should release when
aborting a transaction, because any other resource we have is tracked
using data structures that are stored in memory. Throwing the memory
away before the end therefore makes life very difficult. That's why,
for backend-private memory, we clean up most everything else in
AbortTransaction() and then only destroy memory contexts in
CleanupTransaction(). This change doesn't go as far, but it's in the
same general direction, and I think rightly so. My error was in
thinking that the primary use of memory would be for storing data, but
really it's about where you put your control structures.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-11-07 21:36:16 Re: Fix a typo in dsm_impl.c
Previous Message Peter Geoghegan 2017-11-07 21:32:28 Re: Parallel Hash take II