Re: Skip partition tuple routing with constant partition key

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Subject: Re: Skip partition tuple routing with constant partition key
Date: 2021-06-16 07:27:45
Message-ID: CA+HiwqFHpwrhwdw715Eky7QSPW8kFiR3MiUqeAJkGNRp5GNKbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hou-san,

On Mon, Jun 7, 2021 at 8:38 PM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> Thanks for the patch and explanation, I think you are right that it’s better add
> the rechecking-cached-offset code directly in get_partiiton_for_tuple().
>
> And now, I think maybe it's time to try to optimize the performance.
> Currently, if every row to be inserted in a statement belongs to different
> partition, then the cache check code will bring a slight performance
> degradation(AFAICS: 2% ~ 4%).
>
> So, If we want to solve this, then we may need 1) a reloption to let user control whether use the cache.
> Or, 2) introduce some simple strategy to control whether use cache automatically.
>
> I have not write a patch about 1) reloption, because I think it will be nice if we can
> enable this cache feature by default. So, I attached a WIP patch about approach 2).
>
> The rough idea is to check the average batch number every 1000 rows.
> If the average batch num is greater than 1, then we enable the cache check,
> if not, disable cache check. This is similar to what 0d5f05cde0 did.

Thanks for sharing the idea and writing a patch for it.

I considered a simpler heuristic where we enable/disable caching of a
given offset if it is found by the binary search algorithm at least N
consecutive times. But your idea to check the ratio of the number of
tuples inserted over partition/bound offset changes every N tuples
inserted may be more adaptive.

Please find attached a revised version of your patch, where I tried to
make it a bit easier to follow, hopefully. While doing so, I realized
that caching the bound offset across queries makes little sense now,
so I decided to keep the changes confined to execPartition.c. Do you
have a counter-argument to that?
--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v5-0001-adpative-bound-offset-caching-v5.patch application/octet-stream 8.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-06-16 07:49:43 Re: Error on pgbench logs
Previous Message Michael Paquier 2021-06-16 07:18:23 Re: Different compression methods for FPI