Re: On-demand running query plans using auto_explain and signals

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-07 09:55:23
Message-ID: CACACo5Rmob7aGP0y9zn8ZUcRWuNnV7swk2ugY9pPcEVziP_3yQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 4, 2015 at 6:11 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

> Sorry, but I still don't see how the slots help this issue - could you
>> please elaborate?
>>
> with slots (or something similar) there is no globally locked resource. If
> I have time at the weekend, I'll try to write a prototype.
>

But you will still need to lock the slots list to find an unused one. How is
that substantially different from what I'm doing?

> >> Other smaller issues:
>> >>
>> >> * probably sending line by line is unnecessary - shm_mq_send can pass
>> bigger data when nowait = false
>>
>> I'm not sending it like that because of the message size - I just find it
>> more convenient. If you think it can be problematic, it's easy to do this
>> as before, by splitting lines on the receiving side.
>>
> Yes, the shm queue sends data immediately - so slicing on the sender side
> generates more interprocess communication
>

Well, we are talking about hundreds to thousands of bytes per plan in total.
And if my reading of the shm_mq implementation is correct, when the message
fits into the shared memory buffer, the receiver gets a direct pointer into
shared memory, with no extra allocation or copy into process-local memory.
So this can actually be a win.

> >> * pg_usleep(1000L); - it is related to single point resource
>>
>> But not a highly concurrent one.
>>
> I believe it is not necessary - the waiting (sleeping) can be done deeper,
> in the read from the queue - the code will be cleaner
>

The only way I expect this line to be reached is when a concurrent
pg_cmdstatus() call is in progress: the receiving backend has set the
target_pid, created the queue, released the lock, and is now waiting to
read something from the shm_mq. A backend that also tries to use this
communication channel obtains the lwlock and checks whether the channel is
free; when that check fails it has to check again, and doing so in a tight
loop would load the CPU, hence the small sleep.

The real problem could be if the process that was signaled to connect to
the message queue never handles the interrupt, and we keep waiting forever
in shm_mq_receive(). We could add a timeout parameter or just let the user
cancel the call: send a cancellation request, use pg_cancel_backend() or
set statement_timeout before running this.

--
Alex
