Re: [idea] table partition + hash join

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [idea] table partition + hash join
Date: 2015-06-10 12:20:34
Message-ID: CA+HiwqGPyMWEicgZf0b1C4SFqPBTsnWN=kGQpndO0Cw_Obd7aw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 10, 2015 at 8:33 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>> On 2015-06-10 PM 01:42, Kouhei Kaigai wrote:
>> >
>> > Let's assume a table that is partitioned into four portions,
>> > where each child relation has a CHECK constraint on the hash
>> > value of its ID field.
>> >
>> > tbl_parent
>> > + tbl_child_0 ... CHECK(hash_func(id) % 4 = 0)
>> > + tbl_child_1 ... CHECK(hash_func(id) % 4 = 1)
>> > + tbl_child_2 ... CHECK(hash_func(id) % 4 = 2)
>> > + tbl_child_3 ... CHECK(hash_func(id) % 4 = 3)
>> >
>> > If someone joins another relation with tbl_parent using an
>> > equality condition such as X = tbl_parent.ID, we know that
>> > inner tuples that do not satisfy the condition
>> > hash_func(X) % 4 = 0
>> > can never be joined to the tuples in tbl_child_0.
>> > So we can skip loading these tuples into the inner hash table
>> > up front, which potentially allows the inner hash table to be
>> > split.
>> >
>>
>> Unless I am missing something (of your idea or how hash join works), it seems
>> that there is no way to apply the filter qual (hash_func(X) % 4 = 0, etc.) at
>> the Hash node. So, that qual would not be able to limit what gets into the
>> inner hash table, right? Perhaps the qual needs to be pushed all the way down
>> to the Hash's underlying scan if that makes sense.
>>
> Really? It seems to me that the spot just after ExecProcNode() in
> MultiExecHash() is where I expected to filter out obviously unmatched
> tuples. As long as we can construct a qualifier based on the CHECK()
> constraint of the other side, ExecQual() can decide whether a fetched
> tuple should be loaded into the inner hash table or not.
>

Ah, that's an idea. I was thinking of an unmodified MultiExecHash().
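
To make the proposal concrete, here is a rough sketch of the filtering step
outside of PostgreSQL. It is only an illustration under stated assumptions:
hash_func(), build_inner_hash(), partition_rows() and partitioned_hash_join()
are hypothetical names invented for this example, rows are plain dicts, and
nothing here reflects the actual executor code. The point is just that an
inner row is skipped at hash-build time when its key cannot satisfy the
child's CHECK constraint.

```python
from collections import defaultdict

N_PARTITIONS = 4  # matches the four children in the example above


def hash_func(value):
    # Hypothetical stand-in for the partitioning hash function;
    # any deterministic integer hash would do for this sketch.
    return value * 2654435761 % 2**32


def partition_rows(rows, key="id"):
    # Distribute rows into children the way the CHECK constraints
    # hash_func(id) % 4 = n would require.
    children = [[] for _ in range(N_PARTITIONS)]
    for row in rows:
        children[hash_func(row[key]) % N_PARTITIONS].append(row)
    return children


def build_inner_hash(inner_rows, key, partition_no):
    # Build the inner hash table for a join against one child.
    # Rows whose key cannot match that child's constraint are
    # filtered out before they are loaded.
    table = defaultdict(list)
    for row in inner_rows:
        if hash_func(row[key]) % N_PARTITIONS != partition_no:
            continue  # can never join with this child
        table[row[key]].append(row)
    return table


def partitioned_hash_join(children, inner_rows, outer_key="id", inner_key="x"):
    # Join each child separately against its own, smaller hash table.
    results = []
    for partition_no, child_rows in enumerate(children):
        table = build_inner_hash(inner_rows, inner_key, partition_no)
        for outer in child_rows:
            for inner in table.get(outer[outer_key], []):
                results.append((outer, inner))
    return results
```

In this sketch each inner row ends up in exactly one of the four per-child
hash tables, so the tables together hold no more rows than a single
unfiltered one would, while each individual table is smaller.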

Thanks,
Amit
