postgres_fdw batching vs. (re)creating the tuple slots

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: postgres_fdw batching vs. (re)creating the tuple slots
Date: 2021-05-30 20:22:10
Message-ID: ebbbcc7d-4286-8c28-0272-61b4753af761@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

while looking at the other thread related to postgres_fdw batching [1]
and testing with very large batches, I noticed this disappointing
behavior when inserting 1M rows (just integers, nothing fancy):

no batching: 64782 ms
100 rows: 2118 ms
32767 rows: 41115 ms

Pretty nice improvement when batching 100 rows, but then it all goes
wrong for some reason.

The problem is pretty obvious from a perf profile:

--100.00%--ExecModifyTable
|
--99.70%--ExecInsert
|
|--50.87%--MakeSingleTupleTableSlot
| |
| --50.85%--MakeTupleTableSlot
| |
| --50.70%--IncrTupleDescRefCount
| |
| --50.69%--ResourceOwnerRememberTupleDesc
| |
| --50.69%--ResourceArrayAdd
|
|--48.18%--ExecBatchInsert
| |
| --47.92%--ExecDropSingleTupleTableSlot
| |
| |--47.17%--DecrTupleDescRefCount
| | |
| | --47.15%--ResourceOwnerForgetTupleDesc
| | |
| | --47.14%--ResourceArrayRemove
| |
| --0.53%--ExecClearTuple
|
--0.60%--ExecCopySlot

There are two problems at play, here. Firstly, the way it's coded now
the slots are pretty much re-created for each batch. So with 1M rows and
batches of 32k rows, that's ~30x drop/create. That seems a bit wasteful,
and it shouldn't be too difficult to keep the slots across batches. (We
can't initialize all the slots in advance, because we don't know how
many will be needed, but we don't have to release them between batches.)

The other problem is that ResourceArrayAdd/Remove seem to behave a bit
poorly with very many elements - I'm not sure if it's O(N^2) or worse,
but growing the array and linear searches seem to be a bit expensive.

I'll take a look at fixing the first point, but I'm not entirely sure
how much will that improve the situation.

regards

[1]
https://postgr.es/m/OS0PR01MB571603973C0AC2874AD6BF2594299%40OS0PR01MB5716.jpnprd01.prod.outlook.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-05-30 20:34:39 Re: GISTSTATE is too large
Previous Message Andres Freund 2021-05-30 20:12:44 Re: O_DIRECT on macOS