Re: [POC] hash partitioning

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: amul sul <sulamul(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-05-17 13:24:46
Message-ID: CAFjFpRfx--aMMAbWMUs-G5+P=DpxckGVoqEYkegOBczUQf3okg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 17, 2017 at 2:07 PM, amul sul <sulamul(at)gmail(dot)com> wrote:

>
>> In partition_bounds_equal(), please add comments explaining why is it safe to
>> check just the indexes? May be we should add code under assertion to make sure
>> that the datums are equal as well.
>
> Added assert in the attached version.
>
>> The comment could be something
>> like, "If two partitioned tables have different greatest moduli, their
>> partition schemes don't match. If they have same greatest moduli, and
>> all remainders have different indexes, they all have same modulus
>> specified and the partitions are ordered by remainders, thus indexes
>> array will be an identity i.e. index[i] = i. If the partition
>> corresponding to a given remainder exists, it will have same index
>> entry for both partitioned tables or if it's missing it will be -1.
>> Thus if indexes array matches, corresponding datums array matches. If
>> there are multiple remainders corresponding to a given partition,
>> their partitions are ordered by the lowest of the remainders, thus if
>> indexes array matches, both of the tables have same indexes arrays, in
>> both the tables remainders corresponding to multiple partitions all
>> have same indexes and thus same modulus. Thus again if the indexes are
>> same, datums are same.".
>>
>
> Thanks, added with minor modification.

I have reworded this slightly better. See the attached patch as diff of 0002.

>
>> In the same function
>> if (key->strategy == PARTITION_STRATEGY_HASH)
>> {
>> int greatest_modulus;
>>
>> /*
>> * Compare greatest modulus of hash partition bound which
>> * is the last element of datums array.
>> */
>> if (b1->datums[b1->ndatums - 1][0] != b2->datums[b2->ndatums - 1][0])
>> return false;
>>
>> /* Compare indexes */
>> greatest_modulus = DatumGetInt32(b1->datums[b1->ndatums - 1][0]);
>> for (i = 0; i < greatest_modulus; i++)
>> if (b1->indexes[i] != b2->indexes[i])
>> return false;
>> }
>> if we return true from where this block ends, we will save one indenation level
>> for rest of the code and also FWIW extra diffs in this patch because of this
>> indentation change.
>>
>
> I still do believe having this code in the IF - ELSE block will be
> better for longterm, rather having code clutter to avoid diff that
> unpleasant for now.

Ok, I will leave it to the committer to judge.

Comments on the tests
+#ifdef USE_ASSERT_CHECKING
+ {
+ /*
+ * Hash partition bound stores modulus and remainder at
+ * b1->datums[i][0] and b1->datums[i][1] position respectively.
+ */
+ for (i = 0; i < b1->ndatums; i++)
+ Assert((b1->datums[i][0] == b2->datums[i][0] &&
+ b1->datums[i][1] == b2->datums[i][1]));
+ }
+#endif
Why do we need extra {} here?

Comments on testcases
+CREATE TABLE hpart_1 PARTITION OF hash_parted FOR VALUES WITH
(modulus 8, remainder 0);
+CREATE TABLE fail_part (LIKE hpart_1 INCLUDING CONSTRAINTS);
+ALTER TABLE hash_parted ATTACH PARTITION fail_part FOR VALUES WITH
(modulus 4, remainder 0);
Probably you should also test the other-way round case i.e. create modulus 4,
remainder 0 partition and then try to add partitions with modulus 8, remainder
4 and modulus 8, remainder 0. That should fail.

Why to create two tables hash_parted and hash_parted2, you should be able to
test with only a single table.

+INSERT INTO hpart_2 VALUES (3, 'a');
+DELETE FROM hpart_2;
+INSERT INTO hpart_5_a (a, b) VALUES (6, 'a');
This is slightly tricky. On different platforms the row may map to different
partitions depending upon how the values are hashed. So, this test may not be
portable on all the platforms. Probably you should add such testcases with a
custom hash operator class which is identity function as suggested by Robert.
This also applies to the tests in insert.sql and update.sql for partitioned
table without custom opclass.

+-- delete the faulting row and also add a constraint to skip the scan
+ALTER TABLE hpart_5 ADD CONSTRAINT hcheck_a CHECK (a IN (5)), ALTER a
SET NOT NULL;
The constraint is not same as the implicit constraint added for that partition.
I am not sure whether it's really going to avoid the scan. Did you verify it?
If yes, then how?

+ALTER TABLE hash_parted2 ATTACH PARTITION fail_part FOR VALUES WITH
(modulus 3, remainder 2);
+ERROR: every hash partition modulus must be a factor of the next
larger modulus
We should add this test with at least two partitions in there so that we can
check lower and upper modulus. Also, testing with some interesting
bounds discussed earlier
in this mail e.g. adding modulus 15 when 5, 10, 60 exist will be better than
testing with 3, 4 and 8.

+ERROR: cannot use collation for hash partition key column "a"
This seems to indicate that we can not specify collation for hash partition key
column, which isn't true. Column a here can have its collation. What's not
allowed is specifying collation in PARTITION BY clause.
May be reword the error as "cannot use collation for hash partitioning". or
plain "cannot use collation in PARTITION BY clause for hash partitioning".

+ERROR: invalid bound specification for a list partition
+LINE 1: CREATE TABLE fail_part PARTITION OF list_parted FOR VALUES W...
+ ^
Should the location for this error be that of WITH clause like in case of range
and list partitioned table.

+select tableoid::regclass as part, a from hash_parted order by part;
May be add a % 4 to show clearly that the data really goes to the partitioning
with that remainder.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
0002-extras.patch text/x-patch 3.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-05-17 14:08:47 Re: [COMMITTERS] pgsql: Preventive maintenance in advance of pgindent run.
Previous Message Magnus Hagander 2017-05-17 13:14:37 Re: [COMMITTERS] pgsql: Tag refs/tags/REL_10_BETA1 was created