RE: POC: postgres_fdw insert batching

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Tomas Vondra' <tomas(dot)vondra(at)enterprisedb(dot)com>, 'Craig Ringer' <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: RE: POC: postgres_fdw insert batching
Date: 2020-11-26 01:48:08
Message-ID: TYAPR01MB299073839F82E8F162A5C178FEF90@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
> Well, good that we all agree this is a useful feature to have (in
> general). The question is whether postgres_fdw should be doing batching
> on its own (per this thread) or rely on some other feature (libpq
> pipelining). I haven't followed the other thread, so I don't have an
> opinion on that.

Well, as someone said in this thread, I think bulk insert is much more common than updates/deletes. That is why major DBMSs provide INSERT ... VALUES (record1), (record2), ... and INSERT ... SELECT, and Oracle has direct-path INSERT in addition. As for the comparison between one INSERT with multiple records and libpq batching (= multiple INSERTs), I think the former is more efficient because less data is transferred and the parsing and planning of an INSERT for each record is eliminated.
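To make the comparison concrete, here is a minimal Python sketch. It is not code from the postgres_fdw patch: the helper names (build_multirow_insert, build_single_row_inserts, compare) and the table "t" are hypothetical, and it only measures statement text, not real protocol traffic. It builds one INSERT with several VALUES groups and the equivalent series of single-row INSERTs, then counts how many statements would need to be parsed and how many bytes of statement text they contain.

```python
from typing import List, Sequence, Tuple


def build_multirow_insert(table: str, columns: Sequence[str], nrows: int) -> str:
    """Return one INSERT with nrows VALUES groups, using $n placeholders."""
    if nrows < 1:
        raise ValueError("nrows must be at least 1")
    ncols = len(columns)
    groups = []
    for r in range(nrows):
        # Placeholders are numbered consecutively across all rows.
        params = ", ".join(f"${r * ncols + c + 1}" for c in range(ncols))
        groups.append(f"({params})")
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {', '.join(groups)}"


def build_single_row_inserts(table: str, columns: Sequence[str], nrows: int) -> List[str]:
    """Return nrows separate single-row INSERTs (what per-row batching would send)."""
    one_row = build_multirow_insert(table, columns, 1)
    return [one_row] * nrows


def compare(table: str, columns: Sequence[str], nrows: int) -> Tuple[int, int, int, int]:
    """Return (multi_stmt_count, multi_bytes, single_stmt_count, single_bytes)."""
    multi = build_multirow_insert(table, columns, nrows)
    singles = build_single_row_inserts(table, columns, nrows)
    return (
        1,
        len(multi.encode("utf-8")),
        len(singles),
        sum(len(s.encode("utf-8")) for s in singles),
    )


if __name__ == "__main__":
    print(build_multirow_insert("t", ["a", "b"], 2))
    print(compare("t", ["a", "b"], 100))
```

The parameter values themselves are the same size either way, so any saving comes from not repeating the statement text and from handling one statement instead of one per record. Real numbers would also depend on protocol message overhead and on whether prepared statements are reused.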

I don't deny the usefulness of libpq batching/pipelining, but I'm not sure app developers would really use it. If they want to reduce client-server round trips, won't they use traditional stored procedures? Yes, the stored procedure language is very DBMS-specific. So I'd like to know what kinds of well-known applications use a standard batching API like JDBC's batch updates. (Sorry, I think that should be discussed in the libpq batch/pipelining thread, and this thread should not be polluted.)

> Note however we're doing two things here, actually - we're implementing
> custom batching for postgres_fdw, but we're also extending the FDW API
> to allow other implementations to do the same thing. And most of them won't
> be able to rely on the connection library providing that, I believe.

I'm afraid so, too. In that case, postgres_fdw would be the example that other FDW developers look at when they implement INSERT with multiple records.

Regards
Takayuki Tsunakawa
