Re: [Sender Address Forgery]Re: [Sender Address Forgery]Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Sender Address Forgery]Re: [Sender Address Forgery]Re: [HACKERS] path toward faster partition pruning
Date: 2018-01-17 06:14:01
Message-ID: CA+HiwqHOMt-8+gtNJxEmdNVWH8VJieZK9gov21_MqiLRU_DcZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David.

On Wed, Jan 17, 2018 at 12:32 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 16 January 2018 at 21:08, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2018/01/12 12:30, David Rowley wrote:
>>> 8. The code in get_partitions_from_ne_clauses() does perform quite a
>>> few nested loops. I think a more simple way to would be to track the
>>> offsets you've seen in a Bitmapset. This would save you having to
>>> check for duplicates, as an offset can only contain a single datum.
>>> You'd just need to build a couple of arrays after that, one to sum up
>>> the offsets found per partition, and one for the total datums allowed
>>> in the partition. If the numbers match then you can remove the
>>> partition.
>>>
>>> I've written this and attached it to this email. It saves about 50
>>> lines of code and should perform much better for complex cases, for
>>> example, a large NOT IN list. This also implements #6.
>>
>> I liked your patch, so incorporated it, except, I feel slightly
>> uncomfortable about the new name that you chose for the function because
>> it sounds a bit generic.
>
> You're right. I only renamed it because I inverted the meaning of the
> function in the patch. It no longer did
> "get_partitions_from_ne_clauses", it did the opposite and give the
> partitions which can't match. Please feel free to think of a new
> better name. Is "get_partitions_excluded_by_ne_clauses" too long?
>
>>> I think you quite often use "the same" to mean "it". Can you change that?
>>
>> I guess that's just one of my many odd habits when writing English, all of
>> which I'm trying to get rid of, but apparently with limited success. Must
>> try harder. :)
>
> Oops, on re-reading that it sounded as though I was asking you to
> change some habit, but I just meant the comments. I understand there
> will be places that use English where that's normal. It's just I don't
> recall seeing that in PostgreSQL code before.

No worries, I too slightly misread what you'd said.

When I double checked, I too couldn't find "the same" used the way as
I did in the patch. So I actually ended up finding and replacing more
"the same"s with "it" than you had pointed out in your review in the
latest v20 patch.

> American English is
> pretty much the standard for the project, despite that not always
> being strictly applied (e.g we have a command called ANALYSE which is
> an alias for ANALYZE). I always try and do my best to spell words in
> American English (which is not where I'm from), which for me stretches
> about as far as putting 'z' in the place of some of my 's'es.

I see.

>>> 15. "the latter" is normally used when you're referring to the last
>>> thing in a list which was just mentioned. In this case, leftarg_const
>>> and rightarg_const is the list, so "the latter" should mean
>>> rightarg_const, but I think you mean to compare them using the
>>> operator.
>>>
>>> * If the leftarg_const and rightarg_const are both of the type expected
>>> * by op's operator, then compare them using the latter.
>>
>> Rewrote it as:
>>
>> * We can compare leftarg_const and rightarg_const using op's operator
>> * only if both are of the type expected by it.
>
> I'd probably write "expected type." instead of "type expected by it."

OK, will do.

>>> 17. The following example will cause get_partitions_for_keys_hash to misbehave:
>>>
>>> create table hashp (a int, b int) partition by hash (a, b);
>>> create table hashp1 partition of hashp for values with (modulus 4, remainder 0);
>>> create table hashp2 partition of hashp for values with (modulus 4, remainder 1);
>>> create table hashp3 partition of hashp for values with (modulus 4, remainder 3);
>>> create table hashp4 partition of hashp for values with (modulus 4, remainder 2);
>>> explain select * from hashp where a = 1 and a is null;

[ ... ]

>>> The above code will need to be made smarter. It'll likely crash if you
>>> change "b" to a pass-by-ref type.
>>
>> Hmm, not sure why. It seems to work:
>
> Yeah, works now because you've added new code to test for
> contradictions in the quals, e.g a = 1 and a is null is now rejected
> as constfalse.

Oh, I see. I thought you were talking of it as an independent issue.

Thanks,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2018-01-17 06:23:48 Re: pgsql: Centralize json and jsonb handling of datetime types
Previous Message Amit Langote 2018-01-17 06:03:24 Re: TOAST table created for partitioned tables