Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks
Date: 2020-07-07 16:54:45
Message-ID: 20200707165445.6qniqhzydlfsplvw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-07-07 09:44:41 -0400, Tom Lane wrote:
> Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> writes:
> > Firstly, pg_start_backup registers nonexclusive_base_backup_cleanup as
> > on_shmem_exit call back which will
> > add this function to the before_shmem_exit_list array which is
> > supposed to be removed on pg_stop_backup
> > so that we can do the pending cleanup and issue a warning for each
> > pg_start_backup for which we did not call
> > the pg_stop backup. Now, I executed a query for which JIT is picked
> > up, then the the llvm compiler inserts it's
> > own exit callback i.e. llvm_shutdown at the end of
> > before_shmem_exit_list array. Now, I executed pg_stop_backup
> > and call to cancel_before_shmem_exit() is made with the expectation
> > that the nonexclusive_base_backup_cleanup
> > callback is removed from before_shmem_exit_list array.
>
> I'm of the opinion that the JIT code is abusing this mechanism, and the
> right thing to do is fix that.

What are you proposing? For now we could easily enough work around this
by just making it a on_proc_exit() callback, but that doesn't really
change the fundamental issue imo.

> The restriction you propose to remove is not just there out of
> laziness, it's an expectation about what safe use of this mechanism
> would involve. Un-ordered removal of callbacks seems pretty risky: it
> would mean that whatever cleanup is needed is not going to be done in
> LIFO order.

Maybe I am confused, but isn't it <13's pg_start_backup() that's
violating the protocol much more clearly than the JIT code? Given that
it relies on there not being any callbacks registered between two SQL
function calls? I mean, what it does is basically:

1) before_shmem_exit(nonexclusive_base_backup_cleanup...
2) arbitrary code executed for a long time
3) cancel_before_shmem_exit(nonexclusive_base_backup_cleanup...

which pretty obviously can't at all deal with any other
before_shmem_exit callbacks being registered in 2). Won't this be a
problem for every other before_shmem_exit callback that we register
on-demand? Say Async_UnlistenOnExit, RemoveTempRelationsCallback?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-07-07 17:12:16 Re: Default setting for enable_hashagg_disk (hash_mem)
Previous Message Thom Brown 2020-07-07 16:51:55 Re: [HACKERS] Look-behind regular expressions