Re: POC: postgres_fdw insert batching

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, 'Craig Ringer' <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: POC: postgres_fdw insert batching
Date: 2020-11-28 02:10:40
Message-ID: d07fa4a4-f19d-f439-f3ab-23224cf7f6be@enterprisedb.com
Lists: pgsql-hackers

On 11/27/20 7:05 AM, tsunakawa(dot)takay(at)fujitsu(dot)com wrote:
> From: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
>> But in the libpq pipelining patch I demonstrated a 300 times
>> (3000%) performance improvement on a test workload...
>
> Wow, impressive number. I've just seen it in the beginning of the
> libpq pipelining thread (oh, already four years ago..!) Could you
> share the workload and the network latency (ping time)? I'm sorry
> I'm just overlooking it.
>
> Thank you for your (always) concise explanation. I'd like to check
> other DBMSs and your rich references for the FDW interface. (My
> first intuition is that many major DBMSs might not have client C APIs
> that can be used to implement an async pipelining FDW interface.
> Also, I'm afraid it requires major surgery or reform of executor. I
> don't want it to delay the release of reasonably good (10x)
> improvement with the synchronous interface.)
>

I do agree that pipelining is nice, and can bring huge improvements.

However, the FDW interface as implemented today is not designed to
allow that, I believe (we pretty much just invoke the FDW callbacks as
if it were a local AM). It assumes the calls are synchronous, and
redesigning it to work asynchronously would be a much larger and more
complex patch than what's being discussed here.

I do think the FDW extension proposed here (adding the bulk-insert
callback) is useful in general, for two reasons: (a) even if most client
libraries support some sort of pipelining, some don't, and (b) I'd bet
it's still more efficient to send one large insert than pipelining many
individual inserts.
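
To illustrate point (b), here is a minimal sketch of what "one large
insert" means at the SQL level: a single statement carrying several rows
in its VALUES list, instead of one statement per row. This is not code
from the patch; the function name, the table and column names, and the
use of $n placeholders are assumptions made purely for illustration, and
identifiers are not quoted or validated.

```python
def build_batched_insert(table, columns, n_rows):
    """Build one INSERT statement with n_rows groups in its VALUES list.

    Hypothetical helper for illustration only. Placeholders are numbered
    $1, $2, ... across all rows, so one statement carries every row.
    """
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    ncols = len(columns)
    groups = []
    for r in range(n_rows):
        # Row r uses placeholders r*ncols+1 .. r*ncols+ncols.
        params = ", ".join(f"${r * ncols + c + 1}" for c in range(ncols))
        groups.append(f"({params})")
    column_list = ", ".join(columns)
    return f"INSERT INTO {table} ({column_list}) VALUES {', '.join(groups)}"


# One statement for two rows, rather than two single-row statements.
print(build_batched_insert("t", ["a", "b"], 2))
```

With a batch of N rows the remote server parses and executes one
statement rather than N, which is the saving I'd expect to remain even
when the individual inserts are pipelined.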

That being said, I'm against expanding the scope of this patch to also
require a redesign of the whole FDW infrastructure - that would likely
mean no such improvement landing in PG14. If the libpq pipelining patch
seems likely to get committed, we can try using it for the bulk insert
callback (instead of the current multi-value stuff).

> (It'd be kind of you to send emails in text format. I've changed the
> format of this reply from HTML to text.)
>

Craig's client is sending messages in both text/plain and text/html. You
probably need to tell your client to prefer text/plain over text/html,
somehow.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
