Re: PATCH: Batch/pipelining support for libpq

From: Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: Batch/pipelining support for libpq
Date: 2020-08-31 16:05:47
Message-ID: CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

It seems like this patch is nearly finished. I fixed all the remaining
issues. I'm also asking
a confirmation of the test scenarios you want to see in the next
version of the patch.

> Hi,
>
> On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote:
> > Totally unasked for, here's a rebase of this patch series. I didn't do
> > anything other than rebasing to current master, solving a couple of very
> > trivial conflicts, fixing some whitespace complaints by git apply, and
> > running tests to verify everthing works.
> >
> > I don't foresee working on this at all, so if anyone is interested in
> > seeing this feature in, I encourage them to read and address
> > Horiguchi-san's feedback.
>
> Nor am I planning to do so, but I do think its a pretty important
> improvement.
>
>
Fixed

>
>
> > +/*
> > + * PQrecyclePipelinedCommand
> > + * Push a command queue entry onto the freelist. It must be a dangling entry
> > + * with null next pointer and not referenced by any other entry's next pointer.
> > + */
>
> Dangling sounds a bit like it's already freed.
>
>
Fixed

>
>
> > +/*
> > + * PQbatchSendQueue
> > + * End a batch submission by sending a protocol sync. The connection will
> > + * remain in batch mode and unavailable for new synchronous command execution
> > + * functions until all results from the batch are processed by the client.
>
> I feel like the reference to the protocol sync is a bit too low level
> for an external API. It should first document what the function does
> from a user's POV.
>
> I think it'd also be good to document whether / whether not queries can
> already have been sent before PQbatchSendQueue is called or not.
>
Fixed

>
>
>
> > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC)
> > + {
> > + /*
> > + * In an aborted batch we don't get anything from the server for each
> > + * result; we're just discarding input until we get to the next sync
> > + * from the server. The client needs to know its queries got aborted
> > + * so we create a fake PGresult to return immediately from
> > + * PQgetResult.
> > + */
> > + conn->result = PQmakeEmptyPGresult(conn,
> > + PGRES_BATCH_ABORTED);
> > + if (!conn->result)
> > + {
> > + printfPQExpBuffer(&conn->errorMessage,
> > + libpq_gettext("out of memory"));
> > + pqSaveErrorResult(conn);
> > + return 0;
>
> Is there any way an application can recover at this point? ISTM we'd be
> stuck in the previous asyncStatus, no?
>

conn->result is null when malloc(sizeof(PGresult)) returns null. It's
very unlikely unless
the server machine is out of memory, so the server will probably be
unresponsive anyway.

I'm leaving this as it is but if anyone has a solution simple to
implement I'll fix it.

>
>
> > +/* pqBatchFlush
> > + * In batch mode, data will be flushed only when the out buffer reaches the threshold value.
> > + * In non-batch mode, data will be flushed all the time.
> > + */
> > +static int
> > +pqBatchFlush(PGconn *conn)
> > +{
> > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= OUTBUFFER_THRESHOLD))
> > + return(pqFlush(conn));
> > + return 0; /* Just to keep compiler quiet */
> > +}
>
> unnecessarily long line.
>
Fixed

>
> > +/*
> > + * Connection's outbuffer threshold is set to 64k as it is safe
> > + * in Windows as per comments in pqSendSome() API.
> > + */
> > +#define OUTBUFFER_THRESHOLD 65536
>
> I don't think the comment explains much. It's fine to send more than 64k
> with pqSendSome(), they'll just be send with separate pgsecure_write()
> invocations. And only on windows.
>
> It clearly makes sense to start sending out data at a certain
> granularity to avoid needing unnecessary amounts of memory, and to make
> more efficient use of latency / serer side compute.
>
> It's not implausible that 64k is the right amount for that, I just don't
> think the explanation above is good.
>

Fixed

> > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c b/src/test/modules/test_libpq/testlibpqbatch.c
> > new file mode 100644
> > index 0000000000..4d6ba266e5
> > --- /dev/null
> > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
> > @@ -0,0 +1,1456 @@
> > +/*
> > + * src/test/modules/test_libpq/testlibpqbatch.c
> > + *
> > + *
> > + * testlibpqbatch.c
> > + * Test of batch execution functionality
> > + */
> > +
> > +#ifdef WIN32
> > +#include <windows.h>
> > +#endif
>
> ISTM that this shouldn't be needed in a test program like this?
> Shouldn't libpq abstract all of this away?
>

Fixed.

>
> > +static void
> > +simple_batch(PGconn *conn)
> > +{
>
> ISTM that all or at least several of these should include tests of
> transactional behaviour with pipelining (e.g. using both implicit and
> explicit transactions inside a single batch, using transactions across
> batches, multiple explicit transactions inside a batch).
>

@Andres, just to make sure I understood, here is the test scenarios I'll add:

Implicit and explicit multiple transactions:
start batch:
create_table
insert X
begin transaction
insert X
commit transaction
begin transaction
insert Y
commit transaction
end batch

Transaction across batches:
start batch:
create_table
begin transaction
insert X
end batch
start batch:
insert Y
commit transaction
end batch

>
>
> > + PGresult *res = NULL;
> > + const char *dummy_params[1] = {"1"};
> > + Oid dummy_param_oids[1] = {INT4OID};
> > +
> > + fprintf(stderr, "simple batch... ");
> > + fflush(stderr);
>
> Why do we need fflush()? IMO that shouldn't be needed in a use like
> this? Especially not on stderr, which ought to be unbuffered?
>
Removed.

>
> > + /*
> > + * Enter batch mode and dispatch a set of operations, which we'll then
> > + * process the results of as they come in.
> > + *
> > + * For a simple case we should be able to do this without interim
> > + * processing of results since our out buffer will give us enough slush to
> > + * work with and we won't block on sending. So blocking mode is fine.
> > + */
> > + if (PQisnonblocking(conn))
> > + {
> > + fprintf(stderr, "Expected blocking connection mode\n");
> > + goto fail;
> > + }
>
> Perhaps worth adding a helper for this?
>
> #define EXPECT(condition, explanation) \
>

Fixed (and saved 600 lines :).

Once I get your confirmation of the test scenarios, I'll implement
them and share another patch.

Attachment Content-Type Size
libpq-batch-v19.patch application/x-patch 86.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2020-08-31 16:14:55 Re: Ideas about a better API for postgres_fdw remote estimates
Previous Message Ranier Vilela 2020-08-31 16:02:41 Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior