Re: On-demand running query plans using auto_explain and signals

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-02 10:36:20
Message-ID: CACACo5SR1OJz3F-fJJQq1_DcqK+xBDHnbaZ+D5QVrcHScBQr_A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 2, 2015 at 11:16 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

>
>
> 2015-09-02 11:01 GMT+02:00 Shulgin, Oleksandr <
> oleksandr(dot)shulgin(at)zalando(dot)de>:
>
>> On Tue, Sep 1, 2015 at 7:02 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>> But do we really need the slots mechanism? Would it not be OK to just
>>>> let the LWLock do the sequencing of concurrent requests? Given that we
>>>> are only going to use one message queue per cluster, there's not much
>>>> concurrency to gain by introducing slots, I believe.
>>>>
>>>
>>> I'm afraid of problems in production. When a queue is tied to a
>>> particular process, any problems with it go away once that process ends.
>>> A single message queue per cluster would require a cluster restart when
>>> some pathological problem occurs - and you cannot restart a production
>>> cluster for a week, sometimes weeks. The slots are more robust.
>>>
>>
>> Yes, but in your implementation the slots themselves don't have a
>> queue/buffer. Did you intend to have a message queue per slot?
>>
>
> The message queue cannot be reused, so I expect one slot per caller to be
> used for passing parameters; the message queue will be created/released on
> demand by the caller.
>

I don't believe a message queue really cannot be reused: what would stop us
from calling shm_mq_create() on the queue struct again?
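
Something like this is what I have in mind (just a sketch; the function name
is made up, and I'm assuming the queue lives in a raw BUFFER_SIZE-byte area
of shared memory):

#include "postgres.h"
#include "storage/shm_mq.h"

#define BUFFER_SIZE 8192        /* shared memory area backing the queue */

/*
 * Re-initialize the communication channel: lay out a brand new queue over
 * the same shared memory area.  As long as both backends have detached
 * from the previous queue, nothing should prevent reusing the space.
 */
static shm_mq *
reset_cmdstatus_queue(void *queue_space)
{
    return shm_mq_create(queue_space, BUFFER_SIZE);
}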

To give you an idea, in my current prototype I have only the following
struct:

typedef struct {
    LWLock     *lock;
    /* CmdStatusInfoSlot slots[CMDINFO_SLOTS]; */
    pid_t       target_pid;
    pid_t       sender_pid;
    int         request_type;
    int         result_code;
    shm_mq     *buffer;
} CmdStatusInfo;

An instance of this is allocated in shared memory once, using a BUFFER_SIZE
of 8k.
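
The allocation itself could look roughly like this (only a sketch, assuming
extension-style shared memory hooks and illustrative names, not the actual
prototype code):

#include "postgres.h"
#include "storage/shmem.h"

#define BUFFER_SIZE 8192        /* space reserved for the message queue */

static CmdStatusInfo *cmd_status_info = NULL;

/* To be called before shared memory is created (e.g. from _PG_init()). */
static void
cmdstatus_request_shmem(void)
{
    RequestAddinShmemSpace(MAXALIGN(sizeof(CmdStatusInfo)) + BUFFER_SIZE);
}

/* To be called from the shared memory startup hook in every backend. */
static void
cmdstatus_shmem_startup(void)
{
    bool    found;

    cmd_status_info = (CmdStatusInfo *)
        ShmemInitStruct("CmdStatusInfo",
                        MAXALIGN(sizeof(CmdStatusInfo)) + BUFFER_SIZE,
                        &found);

    if (!found)
    {
        /* First backend here: mark the channel as free. */
        cmd_status_info->target_pid = 0;
        cmd_status_info->sender_pid = 0;
        cmd_status_info->buffer = NULL;
    }
}

The queue itself would then be created at ((char *) cmd_status_info) +
MAXALIGN(sizeof(CmdStatusInfo)) whenever a request is initiated.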

In pg_cmdstatus() I take the LWLock and check whether target_pid is 0: if it
is, nobody else is using this communication channel at the moment, so I set
the pids and request_type and initialize the mq buffer. Otherwise I just
sleep and retry acquiring the lock (a timeout should probably be added here).
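
In pseudo-C, the claim/retry logic looks something like this (again just a
sketch with illustrative names, reusing BUFFER_SIZE from above; LWLock
assignment, error handling and the missing timeout are left out):

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lwlock.h"
#include "storage/shm_mq.h"

static void
claim_cmdstatus_channel(CmdStatusInfo *csi, pid_t target_pid, int request_type)
{
    for (;;)
    {
        LWLockAcquire(csi->lock, LW_EXCLUSIVE);

        if (csi->target_pid == 0)
        {
            /* Channel is free: mark it as ours and set up a fresh queue. */
            csi->target_pid = target_pid;
            csi->sender_pid = MyProcPid;
            csi->request_type = request_type;
            csi->buffer = shm_mq_create((char *) csi +
                                        MAXALIGN(sizeof(CmdStatusInfo)),
                                        BUFFER_SIZE);
            LWLockRelease(csi->lock);
            break;
        }

        /* Someone else is using the channel: back off and retry. */
        LWLockRelease(csi->lock);
        CHECK_FOR_INTERRUPTS();
        pg_usleep(10000L);      /* 10 ms */
    }
}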

>> What sort of pathological problems are you concerned about? The
>> communicating backends should just detach from the message queue properly
>> and have some timeout configured to prevent deadlocks. Other than that, I
>> don't see how having N slots really helps the problem: in case of
>> pathological problems you will just deplete them all sooner or later.
>>
>
> I'm afraid of unexpected problems :) - any part of signal handling or
> multiprocess communication is fragile. Slots are simple and can simply be
> attached to any process without the need to alloc/free memory.
>

Yes, but do slots solve the actual problem? If there is only one message
queue, you still have the same problem regardless of the number of slots
you decide to have.

--
Alex
