Re: On-demand running query plans using auto_explain and signals

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-18 09:25:24
Message-ID: CAFj8pRAENjXoLQp+_nRZXuqzn7kJRx7YUScD6O=yczmi1CU3qQ@mail.gmail.com
Lists: pgsql-hackers

2015-09-18 10:59 GMT+02:00 Shulgin, Oleksandr <oleksandr(dot)shulgin(at)zalando(dot)de>:

> On Thu, Sep 17, 2015 at 10:13 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Thu, Sep 17, 2015 at 11:16 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>
>> >> Second, using a shm_mq manipulates the state of the process latch. I
>> >> don't think you can make the assumption that it's safe to reset the
>> >> process latch at any and every place where we check for interrupts.
>> >> For example, suppose the process is already using a shm_mq and the
>> >> CHECK_FOR_INTERRUPTS() call inside that code then discovers that
>> >> somebody has activated this mechanism and you now go try to send and
>> >> receive from a new shm_mq. But even if that and every other
>> >> CHECK_FOR_INTERRUPTS() in the code can tolerate a process latch reset
>> >> today, it's a new coding rule that could easily trip people up in the
>> >> future.
>> >
>> > It is valid, and probably the most important. But if we introduce our
>> > own mechanism, we will play with the process latch too (although we
>> > could use LWLocks)
>>
>> With the design I proposed, there is zero need to touch the process
>> latch, which is good, because I'm pretty sure that is going to be a
>> problem. I don't think there is any need to use LWLocks here either.
>> When you get a request for data, you can just publish a DSM segment
>> with the data and that's it. Why do you need anything more? You
>> could set the requestor's latch if it's convenient; that wouldn't be a
>> problem. But the process supplying the data can't end up in a
>> different state than it was before supplying that data, or stuff WILL
>> break.
>>
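
For concreteness, a minimal sketch of the publishing side Robert describes,
in backend C. The ExplainSlot array and the function are invented for
illustration; only the requestor's latch is ever touched, and the
publisher's own latch is left alone:

#include "postgres.h"
#include "storage/backendid.h"
#include "storage/dsm.h"
#include "storage/latch.h"
#include "storage/proc.h"
#include "storage/procarray.h"

/* Hypothetical per-backend publication slot (see below). */
typedef struct ExplainSlot
{
    dsm_handle  handle;         /* segment holding the plan text, or 0 */
} ExplainSlot;

extern ExplainSlot *explain_slots;  /* shared array, one entry per backend */

static void
publish_plan(const char *plan_text, int requestor_pid)
{
    Size        len = strlen(plan_text) + 1;
    dsm_segment *seg = dsm_create(len, 0);
    PGPROC     *proc;

    memcpy(dsm_segment_address(seg), plan_text, len);
    explain_slots[MyBackendId].handle = dsm_segment_handle(seg);

    /*
     * Setting the requestor's latch is fine; what we must never do here is
     * reset or wait on our own latch.  Who detaches the segment, and when,
     * is exactly the open question discussed below.
     */
    proc = BackendPidGetProc(requestor_pid);
    if (proc != NULL)
        SetLatch(&proc->procLatch);
}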
>
> There is still the whole problem: where exactly should the backend being
> queried for its status publish that DSM segment, and when should it be freed?
>
> If it's a location shared between all backends, there has to be locking
> around it. That is probably not a big problem, as long as you don't expect
> all the backends to start querying each other rapidly. That is how it was
> actually implemented in the first versions of this patch.
>
> If we take the per-backend slot approach, the locking seems unnecessary,
> and there are essentially two options:
>
> 1) The backend puts the DSM handle in its own slot and notifies the
> requester to read it.
> 2) The backend puts the DSM handle in the slot of the requester (and
> notifies it).
>
> If we go with the first option, the backend that has created the DSM will
> not know when it's OK to free it, so that has to be the responsibility of
> the requester. If the latter exits before reading and freeing the DSM, we
> have a leak. An even bigger problem is that the sender backend can no
> longer reply to a number of concurrent requestors: while its slot is
> occupied by a DSM handle, it cannot send a reply to another backend until
> the slot is freed.
>
> With the second option we have the same problems of not knowing when to
> free the DSM and of potentially leaking it, but at least we can handle
> concurrent requests.
>
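
For reference, the explain_slots array used in the earlier sketch could be
allocated at startup roughly like this (again, all names invented): under
option 1 the sender writes to its own slot, under option 2 to the
requestor's slot.

#include "postgres.h"
#include "miscadmin.h"
#include "storage/dsm.h"
#include "storage/shmem.h"

typedef struct ExplainSlot      /* as in the earlier sketch */
{
    volatile dsm_handle handle; /* 0 = empty */
} ExplainSlot;

ExplainSlot *explain_slots;     /* one entry per backend */

Size
ExplainSlotsShmemSize(void)
{
    return mul_size(MaxBackends, sizeof(ExplainSlot));
}

void
ExplainSlotsShmemInit(void)
{
    bool        found;

    explain_slots = ShmemInitStruct("explain slots",
                                    ExplainSlotsShmemSize(), &found);
    if (!found)
        MemSet(explain_slots, 0, ExplainSlotsShmemSize());
}

With a single handle per slot, a sender that uses its own slot can serve
only one requestor at a time, which is the concurrency limitation of
option 1 noted above.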

That should not be true - the data sender creates the DSM and fills it,
then sets the caller's slot and sends a signal to the caller. The caller
can free the DSM at any time, because the data sender never touches it
again.
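
Continuing with the invented explain_slots array from the sketches above,
that protocol (option 2) could look roughly like this; one caveat is that
the sender must stay attached to the segment (or pin it) until the caller
has attached, otherwise the last detach reclaims it before it is ever read:

/* Sender side: fill the segment, hand it over, wake the caller. */
static void
send_reply(dsm_segment *seg, BackendId requestor_id, PGPROC *requestor_proc)
{
    explain_slots[requestor_id].handle = dsm_segment_handle(seg);
    SetLatch(&requestor_proc->procLatch);
    /* From here on the sender never touches the segment again. */
}

/* Caller side, after returning from its latch wait. */
static void
read_reply(void)
{
    dsm_handle  h = explain_slots[MyBackendId].handle;

    if (h != 0)
    {
        dsm_segment *seg = dsm_attach(h);

        if (seg != NULL)
        {
            elog(LOG, "remote plan: %s", (char *) dsm_segment_address(seg));
            dsm_detach(seg);    /* the caller alone decides when to free */
        }
        explain_slots[MyBackendId].handle = 0;
    }
}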

>
> The current approach, where the requester creates and frees the DSM,
> doesn't suffer from these problems, so if we pre-allocate a segment just
> big enough we can avoid the use of shm_mq altogether. That will take
> another GUC for the segment size. Certainly no one expects a query plan
> to weigh a bloody megabyte, but apparently that is what happens to Pavel.
>
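
A sketch of that requester-allocates variant with the extra size GUC (the
GUC name and default are invented; as module code in the style of
auto_explain, the GUC can be defined in _PG_init()):

#include "postgres.h"
#include "fmgr.h"
#include "storage/dsm.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

static int  explain_dsm_size_kb = 1024;     /* hypothetical GUC, in kB */

void        _PG_init(void);

void
_PG_init(void)
{
    DefineCustomIntVariable("explain_request.segment_size",
                            "Size of the DSM segment for published plans.",
                            NULL,
                            &explain_dsm_size_kb,
                            1024, 64, MAX_KILOBYTES,
                            PGC_SUSET,
                            GUC_UNIT_KB,
                            NULL, NULL, NULL);
}

/*
 * Requester side: create the segment up front, hand its handle to the
 * target backend, and detach afterwards no matter what the target did.
 * Creation and freeing stay in one process, sidestepping the lifetime
 * questions above.
 */
static void
request_plan(int target_pid)
{
    dsm_segment *seg = dsm_create((Size) explain_dsm_size_kb * 1024, 0);

    /* ... publish dsm_segment_handle(seg), signal target_pid, wait ... */

    dsm_detach(seg);        /* requester created it, requester frees it */
}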

It is plan C - the last variant from my point of view. Yet another GUC :(

Pavel

>
> --
> Alex
>
>
