RE: Skip partition tuple routing with constant partition key

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Skip partition tuple routing with constant partition key
Date: 2021-05-24 01:31:44
Message-ID: OS0PR01MB5716D6702C8C8060CC17292694269@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: Amit Langote <amitlangote09(at)gmail(dot)com>
Sent: Thursday, May 20, 2021 8:23 PM
>
> Hou-san,
>
> On Thu, May 20, 2021 at 7:35 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > 2) Test expression in partition key
> >
> > create or replace function partition_func(i int) returns int as $$
> > begin
> > return i;
> > end;
> > $$ language plpgsql immutable parallel restricted; create unlogged
> > table foo (a int) partition by range (partition_func(a));
> >
> > SQL: insert into foo select 1 from generate_series(1, 10000000);
> >
> > HEAD
> > Time: 8595.120 ms (00:08.595)
> >
> > AFTER PATCH(skip constant partition key)
> > Time: 4198.421 ms (00:04.198)
> >
> > AFTER PATCH(cache the last partition)
> > Time: 12829.800 ms (00:12.830)
> >
> > If add a user defined function in the partition key, it seems have
> > performance degradation after the patch.
>
> Oops.
>
> > I did some analysis on it, for the above testcase , ExecPartitionCheck
> > executed three expression 1) key is null 2) key > low 3) key < top In
> > this case, the "key" contains a funcexpr and the funcexpr will be
> > executed three times for each row, so, it bring extra overhead which cause
> the performance degradation.
> >
> > IMO, improving the ExecPartitionCheck seems a better solution to it,
> > we can Calculate the key value in advance and use the value to do the bound
> check.
> > Thoughts ?
>
> This one seems bit tough. ExecPartitionCheck() uses the generic expression
> evaluation machinery like a black box, which means execPartition.c can't really
> tweal/control the time spent evaluating partition constraints. Given that, we
> may have to disable the caching when key->partexprs != NIL, unless we can
> reasonably do what you are suggesting.[]

I did some research on the CHECK expression that ExecPartitionCheck() execute.
Currently for a normal RANGE partition key it will first generate a CHECK expression
like : [Keyexpression IS NOT NULL AND Keyexpression > lowboud AND Keyexpression < lowboud].
In this case, Keyexpression will be re-executed which will bring some overhead.

Instead, I think we can try to do the following step:
1)extract the Keyexpression from the CHECK expression
2)evaluate the key expression in advance
3)pass the result of key expression to do the partition CHECK.
In this way ,we only execute the key expression once which looks more efficient.

Attaching a POC patch about this approach.
I did some performance test with my laptop for this patch:

------------------------------------test cheap partition key expression

create unlogged table test_partitioned_inner (a int) partition by range ((abs(a) + a/50));
create unlogged table test_partitioned_inner_1 partition of test_partitioned_inner for values from (1) to (50);
create unlogged table test_partitioned_inner_2 partition of test_partitioned_inner for values from ( 50 ) to (100);
insert into test_partitioned_inner_1 select (i%48)+1 from generate_series(1,10000000,1) t(i);

BEFORE patch:
Execution Time: 6120.706 ms

AFTER patch:
Execution Time: 5705.967 ms

------------------------------------test expensive partition key expression
create or replace function partfunc(i int) returns int as
$$
begin
return i;
end;
$$ language plpgsql IMMUTABLE;

create unlogged table test_partitioned_inner (a int) partition by range (partfunc (a));
create unlogged table test_partitioned_inner_1 partition of test_partitioned_inner for values from (1) to (50);
create unlogged table test_partitioned_inner_2 partition of test_partitioned_inner for values from ( 50 ) to (100);

I think this can be a independent improvement for partitioncheck.

before patch:
Execution Time: 14048.551 ms

after patch:
Execution Time: 8810.518 ms

I think this patch can solve the performance degradation of key expression
after applying the [Save the last partition] patch.
Besides, this could be a separate patch which can improve some more cases.
Thoughts ?

Best regards,
houzj

Attachment Content-Type Size
0001-improving-ExecPartitionCheck.patch application/octet-stream 25.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-05-24 01:34:29 Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
Previous Message Michael Paquier 2021-05-24 01:29:19 Re: Force disable of SSL renegociation in the server