Re: On-demand running query plans using auto_explain and signals

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-02 12:56:03
Message-ID: CAFj8pRCru0iT4WrbrHejVwLg=LK9aSNG4SisA9mUkpwk6fDBSg@mail.gmail.com
Lists: pgsql-hackers

2015-09-02 12:36 GMT+02:00 Shulgin, Oleksandr <oleksandr(dot)shulgin(at)zalando(dot)de>:

> On Wed, Sep 2, 2015 at 11:16 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>> 2015-09-02 11:01 GMT+02:00 Shulgin, Oleksandr <
>> oleksandr(dot)shulgin(at)zalando(dot)de>:
>>
>>> On Tue, Sep 1, 2015 at 7:02 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>>> wrote:
>>>
>>>>
>>>>> But do we really need the slots mechanism? Would it not be OK to just
>>>>> let the LWLock do the sequencing of concurrent requests? Given that we
>>>>> only going to use one message queue per cluster, there's not much
>>>>> concurrency you can gain by introducing slots I believe.
>>>>>
>>>>
>>>> I'm afraid of problems in production. When a queue is tied to a
>>>> particular process, any problems go away once that process ends. With
>>>> one message queue per cluster, a pathological problem can require a
>>>> cluster restart - and in production you sometimes cannot restart the
>>>> cluster for weeks. The slots are more robust.
>>>>
>>>
>>> Yes, but in your implementation the slots themselves don't have a
>>> queue/buffer. Did you intend to have a message queue per slot?
>>>
>>
>> The message queue cannot be reused, so I expect one slot per caller,
>> used for passing parameters - the message queue will be created/released
>> on demand by the caller.
>>
>
> I don't believe a message queue really cannot be reused. What would stop
> us from calling shm_mq_create() on the queue struct again?
>

you cannot change the recipient later
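If I read the thread right, both observations can hold at once: once a queue is live its receiver is fixed, but running shm_mq_create() over the same piece of shared memory re-initializes it for a new pair of processes. A toy analogue of that restriction (toy_mq, toy_mq_create and toy_mq_set_receiver are illustrative names, not PostgreSQL APIs):

```c
#include <string.h>
#include <sys/types.h>

/* Simplified stand-in for a shared message queue: the receiver may be
 * set only once per "creation", mirroring the restriction discussed. */
typedef struct {
    pid_t receiver_pid;   /* 0 = no receiver set yet */
    char  data[128];      /* payload area, unused in this sketch */
} toy_mq;

/* Analogue of shm_mq_create(): wipes the struct so it can be reused. */
static void toy_mq_create(toy_mq *mq) {
    memset(mq, 0, sizeof(*mq));
}

/* Analogue of shm_mq_set_receiver(): fails if a receiver is already set. */
static int toy_mq_set_receiver(toy_mq *mq, pid_t pid) {
    if (mq->receiver_pid != 0)
        return -1;        /* cannot change the recipient later */
    mq->receiver_pid = pid;
    return 0;
}
```

Re-creating the struct is what "reuse" would mean here: the old recipient is forgotten along with everything else.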

>
> To give you an idea, in my current prototype I have only the following
> struct:
>
> typedef struct {
>     LWLock *lock;
>     /* CmdStatusInfoSlot slots[CMDINFO_SLOTS]; */
>     pid_t target_pid;
>     pid_t sender_pid;
>     int request_type;
>     int result_code;
>     shm_mq buffer;
> } CmdStatusInfo;
>
> An instance of this is allocated in shared memory once, using a
> BUFFER_SIZE of 8k.
>
> In pg_cmdstatus() I take the LWLock and check whether target_pid is 0; if
> so, nobody else is using this communication channel at the moment. In
> that case I set the pids and request_type and initialize the mq buffer.
> Otherwise I just sleep and retry acquiring the lock (a timeout should
> probably be added here).
>
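The claim/release protocol described above might be sketched like this (a hedged illustration only: a pthread mutex stands in for the LWLock, and toy_cmdstatus, try_claim_channel and release_channel are made-up names):

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical sketch of the single shared communication channel;
 * field names follow the CmdStatusInfo struct quoted above. */
typedef struct {
    pthread_mutex_t lock;   /* stands in for LWLock *lock */
    pid_t target_pid;       /* 0 = channel is free */
    pid_t sender_pid;
    int   request_type;
} toy_cmdstatus;

/* Try to claim the channel; returns false if someone else holds it. */
static bool try_claim_channel(toy_cmdstatus *cs, pid_t target, pid_t sender,
                              int request_type) {
    bool claimed = false;
    pthread_mutex_lock(&cs->lock);
    if (cs->target_pid == 0) {      /* nobody else is using the channel */
        cs->target_pid = target;
        cs->sender_pid = sender;
        cs->request_type = request_type;
        claimed = true;
    }
    pthread_mutex_unlock(&cs->lock);
    return claimed;                 /* caller sleeps and retries on false */
}

/* Release the channel when done, letting the next caller through. */
static void release_channel(toy_cmdstatus *cs) {
    pthread_mutex_lock(&cs->lock);
    cs->target_pid = 0;
    cs->sender_pid = 0;
    pthread_mutex_unlock(&cs->lock);
}
```

With this shape the LWLock sequences concurrent callers by itself, which is the point being made against separate slots.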
>>> What sort of pathological problems are you concerned about? The
>>> communicating backends should just detach from the message queue properly
>>> and have some timeout configured to prevent deadlocks. Other than that, I
>>> don't see how having N slots really help the problem: in case of
>>> pathological problems you will just deplete them all sooner or later.
>>>
>>
>> I'm afraid of unexpected problems :) - any part of signal handling or
>> multiprocess communication is fragile. Slots are simple and can be
>> attached to any process without the need to allocate/free memory.
>>
>
> Yes, but do slots solve the actual problem? If there is only one message
> queue, you still have the same problem regardless of the number of slots
> you decide to have.
>
> --
> Alex
>
>
