Re: PATCH: Batch/pipelining support for libpq

From: Shay Rojansky <roji(at)roji(dot)org>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Manuel Kniep <m(dot)kniep(at)web(dot)de>, "fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: PATCH: Batch/pipelining support for libpq
Date: 2016-10-12 11:51:47
Message-ID: CADT4RqA6XoDCVY-G13ME1oRVshE2oNk4fRHKZC0K-jJymQJV0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all. I thought I'd share some experience from Npgsql regarding
batching/pipelining - hope this isn't off-topic.

Npgsql has supported batching for quite a while, similar to what this patch
proposes - with a single Sync message is sent at the end.

It has recently come to my attention that this implementation is
problematic because it forces the batch to occur within a transaction; in
other words, there's no option for a non-transactional batch. This can be a
problem for several reasons: users may want to sent off a batch of inserts,
not caring whether one of them fails (e.g. because of a unique constraint
violation). In other words, in some scenarios it may be appropriate for
later batched statements to be executed when an earlier batched statement
raised an error. If Sync is only sent at the very end, this isn't possible.
Another example of a problem (which actually happened) is that transactions
acquire row-level locks, and so may trigger deadlocks if two different
batches update the same rows in reverse order. Both of these issues
wouldn't occur if the batch weren't implicitly batched.

My current plan is to modify the batch implementation based on whether
we're in an (explicit) transaction or not. If we're in a transaction, then
it makes perfect sense to send a single Sync at the end as is being
proposed here - any failure would cause the transaction to fail anyway, so
skipping all subsequent statements until the batch's end makes sense.
However, if we're not in an explicit transaction, I plan to insert a Sync
message after each individual Execute, making non-transactional batched
statements more or less identical in behavior to non-transactional
unbatched statements. Note that this mean that a batch can generate
multiple errors, not just one.

I'm sharing this since it may be relevant to the libpq batching
implementation as well, and also to get any feedback regarding how Npgsql
should act.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emre Hasegeli 2016-10-12 12:08:25 Re: FTS Configuration option
Previous Message Aleksander Alekseev 2016-10-12 11:15:27 [PATCH] pg_filedump is broken