Re: Multi-Column List Partitioning

From: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
To: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-Column List Partitioning
Date: 2021-05-07 14:05:51
Message-ID: CAOgcT0MspLZrf_03Zfp5ed_aaBR0e-mmOA+23fOTc85Fb3GHRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> While reviewing one of the 'Table partitioning' related patches,
> I found that Postgres does not support multiple column based LIST
> partitioning. Based on this understanding, I have started working on
> this feature. I also feel that 'Multi-Column List Partitioning' can
> be benefited to the Postgres users in future.

+1 for the feature. I also think this can help users deal with some
useful cases.

> CREATE TABLE t2_1 PARTITION OF t2 FOR VALUES IN (1, 2), (1, 5), (2,
> 2),(2, 10);

IMHO, listing every single tuple like this might be a bit cumbersome for
the user. What about something like this:

...FOR VALUES IN (1, 2, 3, 4), (11, 22, 33, 44), where the first set
is the list for values of column A and second list is for column B. We
can treat these lists as A X B possible values or simply (a1, b1), (a2,
b2) internally. However I see other proprietary databases already have
syntax something similar that you are proposing here. So, I leave it
open for the thoughts from experts. Also, though what I propose might be
easy from a user perspective, but might not be that easy for
implementation, given that for a larger number of columns in partition list
e.g. A X B X C X D lists become unmanageable.

I did not review the patch in detail, but a quick look at it leaves me
with following comments:

1.
> + * list. Then this function will continue the serach and return the
index of
Typo:
s/serach/search

2.
A compiler warning:
partprune.c: In function ‘get_matching_list_bounds’:
partprune.c:2731:20: error: passing argument 5 of ‘partition_list_bsearch’
makes pointer from integer without a cast [-Werror=int-conversion]
2731 | nvalues, value, &is_equal);
| ^~~~~
| |
| Datum {aka long unsigned int}
In file included from partprune.c:53:
../../../src/include/partitioning/partbounds.h:120:32: note: expected
‘Datum *’ {aka ‘long unsigned int *’} but argument is of type ‘Datum’ {aka
‘long unsigned int’}
120 | int nvalues, Datum *value, bool *is_equal);
| ~~~~~~~^~~~~

3.
And, a server crash with following case:
postgres=# CREATE TABLE t1 (a int) PARTITION BY LIST (a);
CREATE TABLE
postgres=# CREATE TABLE t1p1 PARTITION OF t1 FOR VALUES IN (1, 2, 3);
CREATE TABLE
postgres=# \d+ t1p1
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?>

Stacktrace:
(gdb) bt
#0 __GI_raise (sig=sig(at)entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f5d273c5859 in __GI_abort () at abort.c:79
#2 0x000055779d2eb69d in ExceptionalCondition
(conditionName=0x55779d4978d8 "ptr == NULL || nodeTag(ptr) == type",
errorType=0x55779d4978c3 "FailedAssertion",
fileName=0x55779d4978a0 "../../../src/include/nodes/nodes.h",
lineNumber=603) at assert.c:69
#3 0x000055779d03a684 in castNodeImpl (type=T_Const, ptr=0x55779e457b18)
at ../../../src/include/nodes/nodes.h:603
#4 0x000055779d04368a in get_qual_for_list (parent=0x7f5d1df829b8,
spec=0x55779e457950) at partbounds.c:4155
#5 0x000055779d03ac60 in get_qual_from_partbound (rel=0x7f5d1df82570,
parent=0x7f5d1df829b8, spec=0x55779e457950) at partbounds.c:272
#6 0x000055779d2cf630 in generate_partition_qual (rel=0x7f5d1df82570) at
partcache.c:379
#7 0x000055779d2cf468 in get_partition_qual_relid (relid=32771) at
partcache.c:308
#8 0x000055779d2592bf in pg_get_partition_constraintdef
(fcinfo=0x55779e44ee50) at ruleutils.c:2019
#9 0x000055779cec7221 in ExecInterpExpr (state=0x55779e44dfb0,
econtext=0x55779e407fe8, isnull=0x7ffddf9b109f) at execExprInterp.c:744
#10 0x000055779cec954f in ExecInterpExprStillValid (state=0x55779e44dfb0,
econtext=0x55779e407fe8, isNull=0x7ffddf9b109f) at execExprInterp.c:1819
#11 0x000055779cf1d58a in ExecEvalExprSwitchContext (state=0x55779e44dfb0,
econtext=0x55779e407fe8, isNull=0x7ffddf9b109f)
at ../../../src/include/executor/executor.h:338
#12 0x000055779cf1d602 in ExecProject (projInfo=0x55779e44dfa8) at
../../../src/include/executor/executor.h:372
#13 0x000055779cf1db2f in ExecNestLoop (pstate=0x55779e407ed0) at
nodeNestloop.c:241
#14 0x000055779cedf136 in ExecProcNodeFirst (node=0x55779e407ed0) at
execProcnode.c:462
#15 0x000055779ced3053 in ExecProcNode (node=0x55779e407ed0) at
../../../src/include/executor/executor.h:257
#16 0x000055779ced5a87 in ExecutePlan (estate=0x55779e407c80,
planstate=0x55779e407ed0, use_parallel_mode=false, operation=CMD_SELECT,
sendTuples=true, numberTuples=0,
direction=ForwardScanDirection, dest=0x55779e425a88, execute_once=true)
at execMain.c:1551
#17 0x000055779ced372d in standard_ExecutorRun (queryDesc=0x55779e453520,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:361
#18 0x000055779ced353c in ExecutorRun (queryDesc=0x55779e453520,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:305
#19 0x000055779d13d287 in PortalRunSelect (portal=0x55779e398800,
forward=true, count=0, dest=0x55779e425a88) at pquery.c:912
#20 0x000055779d13cec0 in PortalRun (portal=0x55779e398800,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x55779e425a88, altdest=0x55779e425a88,
qc=0x7ffddf9b14f0) at pquery.c:756
#21 0x000055779d1361ce in exec_simple_query (
query_string=0x55779e3367a0 "SELECT inhparent::pg_catalog.regclass,\n
pg_catalog.pg_get_expr(c.relpartbound, c.oid),\n inhdetachpending,\n
pg_catalog.pg_get_partition_constraintdef(c.oid)\nFROM pg_catalog.pg_class
c JOIN pg_catalo"...) at postgres.c:1214
#22 0x000055779d13ad8b in PostgresMain (argc=1, argv=0x7ffddf9b1710,
dbname=0x55779e3626f8 "postgres", username=0x55779e3626d8 "hadoop") at
postgres.c:4476
#23 0x000055779d0674d3 in BackendRun (port=0x55779e358380) at
postmaster.c:4488
#24 0x000055779d066d8c in BackendStartup (port=0x55779e358380) at
postmaster.c:4210
#25 0x000055779d062f9b in ServerLoop () at postmaster.c:1742
#26 0x000055779d062734 in PostmasterMain (argc=3, argv=0x55779e3308b0) at
postmaster.c:1414
#27 0x000055779cf5805f in main (argc=3, argv=0x55779e3308b0) at main.c:209

Regards,
Jeevan Ladhe

On Thu, May 6, 2021 at 7:33 PM Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
wrote:

> Hi,
>
> While reviewing one of the 'Table partitioning' related patches, I found
> that Postgres does not support multiple column based LIST partitioning.
> Based on this understanding, I have started working on this feature. I also
> feel that 'Multi-Column List Partitioning' can be benefited to the Postgres
> users in future.
>
> I am attaching the WIP patch for this feature here. It supports
> 'Multi-Column List Partitioning', however some tasks are still pending. I
> would like to know your thoughts about this, So that I can continue the
> work with improvising the current patch.
>
> Following things are handled in the patch.
> 1. Syntax
>
> CREATE TABLE table_name (attrs) PARTITION BY LIST(list_of_columns);
>
> Earlier there was no provision to mention multiple columns as part of the
> 'list_of_columns' clause. Now we can mention the list of columns separated
> by comma.
>
> CREATE TABLE table_name_p1 PARTITION OF table_name FOR VALUES IN
> list_of_values.
>
> Whereas list_of_columns can be
> a. (value [,...])
> b. (value [,...]) [,...]
>
> I would like to list a few examples here for better understanding.
> Ex-1:
> CREATE TABLE t1(a int) PARTITION BY LIST(a);
> CREATE TABLE t1_1 PARTITION OF t1 FOR VALUES IN (1, 2, 10, 5, 7);
>
> Ex-2:
> CREATE TABLE t2(a int, b int) PARTITION BY LIST(a,b);
> CREATE TABLE t2_1 PARTITION OF t2 FOR VALUES IN (1, 2), (1, 5), (2, 2),(2,
> 10);
>
> Please share if any changes are required in the above syntax.
>
> 2. Modified transformation logic to support above syntax.
>
> 3. Modified the data structures to store the information caused by above
> syntax. Also modified the searching logic to route the tuple to the
> appropriate partition.
>
> 4. Done a few basic testing and verified CREATE TABLE, INSERT INTO and
> SELECT are working fine.
>
>
> Following items are pending and I am working on it.
>
> 1. Handling of 'NULL' values.
>
> 2. Support multi column case in partition pruning.
>
> 3. Add test cases to the regression test suite.
>
> Please share your thoughts.
>
>
> Thanks & Regards,
> Nitin Jadhav
>
>
>
>
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-05-07 14:29:58 Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Previous Message Yura Sokolov 2021-05-07 13:18:40 Re: Use simplehash.h instead of dynahash in SMgr