Re: [HACKERS] path toward faster partition pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-03-20 21:29:07
Message-ID: CA+TgmobKJfmBUDYMVHCR=nqq1Qx4BTm+wC2=G5huxqkoO7f7EA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 20, 2018 at 7:07 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2018/03/16 21:55, Amit Langote wrote:
>> Attached updated patches.
>
> Attached is further revised version.
>
> Of note is getting rid of PartitionPruneContext usage in the static
> functions of partprune.c. Most of the code there ought to only run during
> planning, so it can access the necessary information from RelOptInfo
> directly instead of copying it to PartitionPruneContext and then passing
> it around.

+ if (rte->relkind == RELKIND_PARTITIONED_TABLE)
+ {
+ if (rel->baserestrictinfo != NIL)
+ {
+ live_children = prune_append_rel_partitions(root, rel);
+ did_pruning = true;
+ }
+ }

Use &&

+ case COMBINE_OR:
+ {

Won't survive pgindent, which currently produces a *massive* diff for
these patches.

+ /*
+ * XXX- The following ad-hoc method of pruning only works for list
+ * partitioning. It checks for each partition if all of its
+ * accepted values appear in ne_datums[].
+ */

So why are we doing it this way? How about doing something not
ad-hoc? I tried to propose that before.

+ * Set *value to the constant value obtained by evaluating 'expr'
+ *
+ * Note that we may not be able to evaluate the input expression, in which
+ * case, the function returns false to indicate that *value has not been
+ * set. True is returned otherwise.

These comments need updating, since this function (laudibly) no longer
does any evaluating. I wonder how this will work for run-time
pruning, though.

+ if (context->partopcintype[partkeyidx] != exprTyp)
+ {
+ Oid new_supfuncid;
+ int16 procnum;
+
+
+ procnum = (context->strategy == PARTITION_STRATEGY_HASH)
+ ? HASHEXTENDED_PROC
+ : BTORDER_PROC;
+ new_supfuncid = get_opfamily_proc(context->partopfamily[partkeyidx],
+ context->partopcintype[partkeyidx],
+ exprTyp, procnum);
+ fmgr_info(new_supfuncid, &context->partsupfunc[partkeyidx]);
+ }

What's the point of this, exactly? Leftover dead code, maybe?

+ * Input:
+ * See the comments above the definition of PartScanKeyInfo to see what
+ * kind of information is contained in 'keys'.

There's no such thing as PartScanKeyInfo any more and the function has
no argument called 'keys'. None of the functions actual arguments are
explained.

+ /*
+ * If there are multiple pruning steps, we perform them one after another,
+ * passing the result of one step as input to another. Based on the type
+ * of pruning step, perform_pruning_step may add or remove partitions from
+ * the set of partitions it receives as the input.
+ */

The comment sounds great, but the code doesn't work that way; it
always calls bms_int_members to intersect the new result with any
previous result. I'm baffled as to how this manages to DTRT if
COMBINE_OR is used. In general I had hoped that the list of pruning
steps was something over which we were only going to iterate, not
recurse. This definitely recurses for the combine steps, but it's
still (sorta) got the idea of a list of iterable steps. That's a
weird mix.

+ if (nvalues == context->partnatts)
+ {
+ greatest_modulus = get_greatest_modulus(boundinfo);
+ rowHash = compute_hash_value(partnatts, partsupfunc, values,
+ isnull);
+ result_index = partindices[rowHash % greatest_modulus];
+ if (result_index >= 0)
+ return bms_make_singleton(result_index);
+ }
+ else
+ /* Can't do pruning otherwise, so return all partitions. */
+ return bms_add_range(NULL, 0, context->nparts - 1);

Wouldn't we want to (1) arrange things so that this function is never
called if nvalues < context->partnatts && context->strategy ==
PARTITION_STRATEGY_HASH or at least (2) avoid constructing isnull from
nullkeys if we're not going to use it?

Also, shouldn't we be sanity-checking the strategy number here?

I'm out of time for right now but it looks to me like this patch still
needs quite a bit of fine-tuning.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julian Markwort 2018-03-20 21:44:21 Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
Previous Message Merlin Moncure 2018-03-20 21:28:04 Re: INOUT parameters in procedures