Re: PATCH: Batch/pipelining support for libpq

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Manuel Kniep <m(dot)kniep(at)web(dot)de>, "fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: PATCH: Batch/pipelining support for libpq
Date: 2016-05-24 01:20:43
Message-ID: CAMsr+YEMCXXcZHE4FjH9FZDfqPZNfVZV2drHAs0wn9hFzda=Kg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 24 May 2016 at 00:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

> On Mon, May 23, 2016 at 8:50 AM, Andres Freund <andres(at)anarazel(dot)de>
> wrote:
>
>> This should be very useful for optimising FDWs, Postgres-XC, etc.
> >
> > And optimizing normal clients.
> >
> > Not easy, but I'd be very curious how much psql's performance improves
> > when replaying a .sql style dump, and replaying from a !tty fd, after
> > hacking it up to use the batch API.
>

I didn't, but agree it'd be interesting. So would pg_restore, for that
matter, though its use of COPY for the bulk of its work means it wouldn't
make tons of difference.

I think it'd be safe to enable it automatically in psql's
--single-transaction mode. It's also safe to send anything after an
explicit BEGIN and until the next COMMIT as a batch from libpq, and since
it parses the SQL enough to detect statement boundaries already that
shouldn't be too hard to handle.

However: psql is synchronous, using the PQexec family of blocking calls.
It's all fairly well contained in SendQuery and PSQLexec, but making it use
batching still require restructuring those to use the asynchronous
nonblocking API and append the query to a pending-list, plus the addition
of a select() loop to handle results and dispatch more work. MainLoop()
isn't structured around a select or poll, it loops over gets. So while it'd
be interesting to see what difference batching made the changes to make
psql use it would be a bit more invasive. Far from a rewrite, but to avoid
lots of code duplication it'd have to change everything to use nonblocking
mode and a select loop, which is a big change for such a core tool.

This is a bit of a side-project and I've got to get back to "real work" so
I don't expect to do a proper patch for psql any time soon. I'd rather not
try to build too much on this until it's seen some review and I know the
API won't need a drastic rewrite anyway. I'll see if I can do a hacked-up
version one evening to see what it does for performance though.

Did you consider the use of simple_list.c instead of introducing a new
> mimic as PGcommandQueueEntry? It would be cool avoiding adding new
> list emulations on frontends.
>

Nope. I didn't realise it was there; I've done very little on the C client
and library side so far. So thanks, I'll update it accordingly.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-05-24 01:25:31 Re: LSN as a recovery target
Previous Message Michael Paquier 2016-05-24 01:18:23 Re: Typo in 001_initdb.pl