Re: PATCH: Batch/pipelining support for libpq

From: "Prabakaran, Vaishnavi" <VaishnaviP(at)fast(dot)au(dot)fujitsu(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Dmitry Igrishin <dmitigr(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Manuel Kniep <m(dot)kniep(at)web(dot)de>, "fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, "Iwata, Aya" <iwata(dot)aya(at)jp(dot)fujitsu(dot)com>
Subject: Re: PATCH: Batch/pipelining support for libpq
Date: 2017-02-17 05:17:03
Message-ID: 659ED6D14D438843BD4CDEDAB3F78C7A9338D9B1@SYD1217
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 22 November 2016 at 18:32, Craig Ringer<craig(at)2ndquadrant(dot)com> wrote:
> On 22 November 2016 at 15:14, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 18, 2016 at 7:18 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>
> wrote:
> >>
> >> The latest is what's attached upthread and what's in the git repo
> >> also referenced upthread.
> >>
> >> I haven't been able to update in response to more recent review due
> >> to other development commitments. At this point I doubt I'll be able
> >> to get update it again in time for v10, so anyone who wants to adopt
> >> it is welcome.
> >
> >
> > Currently patch status is marked as "returned with feedback" in the
> > 11-2016 commitfest. Anyone who wants to work on it can submit the
> > updated patch by taking care of all review comments and change the
> > status of the patch at any time.
> >
> > Thanks for the patch.
>
> Thanks. Sorry I haven't had time to work on it. Priorities.
Hi,
I am interested in this patch and addressed various below comments from reviewers. And, I have separated out code and test module into 2 patches. So, If needed, test patch can be enhanced more, meanwhile code patch can be committed.

>Renaming and refactoring new APIs
> +PQisInBatchMode 172
>+PQqueriesInBatch 173
>+PQbeginBatchMode 174
>+PQendBatchMode 175
>+PQsendEndBatch 176
>+PQgetNextQuery 177
>+PQbatchIsAborted 178
>This set of routines is a bit inconsistent. Why not just prefixing them with PQbatch? Like that for example:
> PQbatchStatus(): consists of disabled/inactive/none, active, error. This covers both PQbatchIsAborted() and PQisInBatchMode().
>PQbatchBegin()
>PQbatchEnd()
>PQbatchQueueSync() or PQbatchQueueFlush, same as PQsendEndBatch() to add and process a sync message into the queue.

Renamed and modified batch status APIs as below
PQisInBatchMode & PQbatchIsAborted ==> PQbatchStatus
PQqueriesInBatch ==> PQbatchQueueCount
PQbeginBatchMode ==> PQbatchBegin
PQendBatchMode ==> PQbatchEnd
PQsendEndBatch ==> PQbatchQueueSync
PQgetNextQuery ==> PQbatchQueueProcess

>PQbatchQueueCount(): returns N>0 if there are N entries, 0 if empty,-1 on failure
>PQbatchQueueProcess(): returns 1 if process can begin, 0 if not, -1 on failure (OOM)
I think it is still ok to keep the current behaviour like other ones present in the same file. E.g:"PQsendPrepare" "PQsendQueryGuts"

>PQqueriesInBatch() (Newname(NN):PQbatchQueueCount)doesn't work as documented.
>It says:
>"Returns the number of queries still in the queue for this batch"
>but in fact it's implemented as a Boolean.
Modified the logic to count number of entries in pending queue and return the count

>The changes in src/test/examples/ are not necessary anymore. You moved all the tests to test_libpq (for the best actually).
Removed these unnecessary changes from src/test/examples folder and corrected the path mentioned in comments section of testlibpqbatch.c

> + while (queue != NULL)
>+ {
> PGcommandQueueEntry *prev = queue;
>+ queue = queue->next;
>+ free(prev);
>+ }
>This should free prev->query.
Both prev->query and prev is freed. Also, this applies to "cmd_queue_recycle" too.

>Running directly make check in src/test/modules/test_libpq/ does not work
Modified "check" rule in makefile

>You could just remove the VERBOSE flag in the tests, having a test more talkative is always better.
Removed ifdef VERBOSE checks.

>But with the libpq batch API, maybe this could be modernized
>with meta-commands like this:
>\startbatch
>...
>\endbatch
I think it is a separate patch candidate.

> It is possible to guess each one of those errors(occurred in PQbatchQueueProcess API) with respectively
> PQgetResult == NULL, PQisInBatchMode() and PQqueriesInBatch().
> Definitely it should be mentioned in the docs that it is possible to make a difference between all those states.
Updated documentation section of PQbatchQueueProcess() with these details.

> + entry = PQmakePipelinedCommand(conn);
>+ entry->queryclass = PGQUERY_SYNC;
>+ entry->query = NULL;
>PQmakePipelinedCommand() returns NULL, and boom.
Corrected to return false if PQmakePipelinedCommand() returns NULL.

> + bool in_batch; /* connection is in batch (pipelined) mode */
>+ bool batch_aborted; /* current batch is aborted, discarding until next Sync */
>Having only one flag would be fine. batch_aborted is set of used only
>when in_batch is used, so both have a strong link
Yes, agree that tracking the batch status via one flag is more clean
So, Added new enum typedef enum
{
PQBATCH_MODE_OFF,
PQBATCH_MODE_ON,
PQBATCH_MODE_ABORTED
} PQBatchStatus;
and " PQBatchStatus batch_status" member of pg_conn is used to track the status of batch mode.

>/* OK, it's launched! */
>- conn->asyncStatus = PGASYNC_BUSY;
>+ if (conn->in_batch)
>+ PQappendPipelinedCommand(conn, pipeCmd);
>+ else
>+ conn->asyncStatus = PGASYNC_BUSY;
>No, it is put in the queue
Though it is put in queue, technically the command is sent to server and server might have started executing it.
So it is ok to use launched I think.

> It may be a good idea to add a test for COPY and trigger a failure.
Added new test - "copy_failure" to trigger failures during copy state.
Please note that COPY is allowed in batch mode, but only syncing batch/queuing any command while copy is in progress is blocked due to potential failure of entire batch.
So updated the documentation for more clarity in this case.

>If I read the code correctly, it seems to me that it is possible to enable the batch mode, and then to use PQexec(), PQsendPrepare will
>just happily process queue the command. Shouldn't PQexec() be prevented in batch mode? Similar remark for PQexecParams(),
>PQexecPrepared() PQdescribePrepared and PQprepare(). In short everything calling PQexecFinish().
Check to verify whether the connection is in batch mode or not is already present in PQexecStart().
And, all the functions calling PQexecFinish() will not be allowed in batch mode anyways. So, no new check is needed I think.

> It may be a good idea to check for PG_PROTOCOL_MAJOR < 3 and issue an error for the new routines.
All the APIs which supports asynchronous command processing can be executed only if PG_PROTOCOL_MAJOR >= 3. So, adding it to new routines are not really needed.

> + After entering batch mode the application dispatches requests
>+ using normal asynchronous <application>libpq</application> functions like
>+ <function>PQsendQueryParams</function>,
><function>PQsendPrepare</function>,
>+ etc.
>It may be better to list all the functions here, PQSendQuery
Documentation updated with list of functions - PQsendQueryParams,PQsendPrepare,
PQsendQueryPrepared,PQdescribePortal,PQdescribePrepared,
PQsendDescribePortal,PQsendDescribePrepared.

>1. Typos in document part - diff-typos.txt file contains the typos specified here
>2. git --check is complaining
>3. s/funtionality/functionality/ and s/recognised/recognized/ typos in testlibpqbatch.c file
>4. s/Batching less useful/Batching is less useful in libpq.sgml
>5. + <para>
>+ Much like asynchronous query mode, there is no performance disadvantage to
>+ using batching and pipelining. It somewhat increased client application
>+ complexity and extra caution is required to prevent client/server network
>+ deadlocks, but can offer considerable performance improvements.
>+ </para>
>I would reword that a bit "it increases client application complexity
>and extra caution is required to prevent client/server deadlocks but
>offers considerable performance improvements".
>6. + ("ping time") is high, and when many small operations are being performed in
>Nit: should use <quote> here. Still not quoting it would be fine.
>7. Postgres 10, not 9.6.
Corrected.

Thanks & Regards,
Vaishnavi
Fujitsu Australia
Disclaimer

The information in this e-mail is confidential and may contain content that is subject to copyright and/or is commercial-in-confidence and is intended only for the use of the above named addressee. If you are not the intended recipient, you are hereby notified that dissemination, copying or use of the information is strictly prohibited. If you have received this e-mail in error, please telephone Fujitsu Australia Software Technology Pty Ltd on + 61 2 9452 9000 or by reply e-mail to the sender and delete the document and all copies thereof.

Whereas Fujitsu Australia Software Technology Pty Ltd would not knowingly transmit a virus within an email communication, it is the receiver’s responsibility to scan all communication and any files attached for computer viruses and other defects. Fujitsu Australia Software Technology Pty Ltd does not accept liability for any loss or damage (whether direct, indirect, consequential or economic) however caused, and whether by negligence or otherwise, which may result directly or indirectly from this communication or any files attached.

If you do not wish to receive commercial and/or marketing email messages from Fujitsu Australia Software Technology Pty Ltd, please email unsubscribe(at)fast(dot)au(dot)fujitsu(dot)com

Attachment Content-Type Size
0001-Pipelining-batch-support-for-libpq-code-v3.patch application/octet-stream 48.3 KB
0002-Pipelining-batch-support-for-libpq-test-v3.patch application/octet-stream 43.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-02-17 05:27:40 Re: Partitioning vs ON CONFLICT
Previous Message Rafia Sabih 2017-02-17 05:05:15 Re: Parallel Index-only scan