Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-04-10 08:56:19
Message-ID: 0ed58efc-3320-909f-faf8-1413762986b0@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/04/10 13:27, Ashutosh Bapat wrote:
> On Mon, Apr 9, 2018 at 8:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Fri, Apr 6, 2018 at 11:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>>>> Sounds like you're saying that if we have too many alternative files
>>>> then there's a chance that one could pass by luck.
>>>
>>> Yeah, exactly: it passed, but did it pass for the right reason?
>>>
>>> If there's just two expected-files, it's likely not a big problem,
>>> but if you have a bunch it's something to worry about.
>>>
>>> I'm also wondering how come we had hash partitioning before and
>>> did not have this sort of problem. Is it just that we added a
>>> new test that's more sensitive to the details of the hashing
>>> (if so, could it be made less so)? Or is there actually more
>>> platform dependence now than before (and if so, why is that)?
>>
>> The existing hash partitioning tests did have some dependencies on the
>> hash function, but they took care not to use the built-in hash
>> functions. Instead they did stuff like this:
>>
>> CREATE OR REPLACE FUNCTION hashint4_noop(int4, int8) RETURNS int8 AS
>> $$SELECT coalesce($1,0)::int8$$ LANGUAGE sql IMMUTABLE;
>> CREATE OPERATOR CLASS test_int4_ops FOR TYPE int4 USING HASH AS
>> OPERATOR 1 = , FUNCTION 2 hashint4_noop(int4, int8);
>> CREATE TABLE mchash (a int, b text, c jsonb)
>> PARTITION BY HASH (a test_int4_ops, b test_text_ops);
>>
>> I think that this approach should also be used for the new tests.
>> Variant expected output files are a pain to maintain, and you
>> basically just have to take whatever output you get as the right
>> answer, because nobody knows what output a certain built-in hash
>> function should produce for a given input except by running the code.
>> If you do the kind of thing shown above, though, then you can easily
>> see by inspection that you're getting the right answer.

Thanks for the idea. I think it makes sense and also agree that alternate
outputs approach is not perfectly reliable and maintainable.

> +1.

Attached find a patch that rewrites hash partition pruning tests that
away. It creates two hash operator classes, one for int4 and another for
text type and uses them to create hash partitioned table to be used in the
tests, like done in the existing tests in hash_part.sql. Since that makes
tests (hopefully) reliably return the same result always, I no longer see
the need to keep them in a separate partition_prune_hash.sql. The
reasoning behind having the separate file was to keep the alternative
output file small as David explained in [1].

However, I noticed that there is a bug in RelationBuildPartitionKey that
causes a crash when using a SQL function as partition support function as
the revised tests do, so please refer to and apply the patches I posted
here before running the revised tests:

https://www.postgresql.org/message-id/3041e853-b1dd-a0c6-ff21-7cc5633bffd0%40lab.ntt.co.jp

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAKJS1f-SON_hAekqoV4_WQwJBtJ_rvvSe68jRNhuYcXqQ8PoQg%40mail.gmail.com

Attachment Content-Type Size
v1-0001-Rewrite-hash-partition-pruning-tests-to-use-custo.patch text/plain 29.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-04-10 09:03:39 Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.
Previous Message Huong Dangminh 2018-04-10 08:30:19 power() function in Windows: "value out of range: underflow"