Re: [POC] hash partitioning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: David Steele <david(at)pgmasters(dot)net>, amul sul <sulamul(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-04-26 20:12:21
Message-ID: CA+Tgmobx=P9JLARF75EDM=wEGxyC6oXJSOafMUpLu0f5F9mszw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 20, 2017 at 4:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> So, I think we really need something like the syntax in Amul's patch
> in order for this to work at all. Of course, the details can be
> changed according to what seems best but I think the overall picture
> is about right.

I spent some time today looking at these patches. It seems like there
is some more work still needed here to produce something committable
regardless of which way we go, but I am inclined to think that Amul's
patch is a better basis for work going forward than Nagata-san's
patch. Here are some general comments on the two patches:

- As noted above, the syntax implemented by Amul's patch allows us to
know the final partition constraint right away. Nagata-san's proposed
syntax does not do that. Also, Amul's syntax allows for a way to
split partitions (awkwardly, but we can improve it later);
Nagata-san's doesn't provide any method at all.

- Amul's patch derives the hash function to be used from the relevant
hash opclass, whereas Nagata-san's patch requires the user to specify
it explicitly. I think that there is no real use case for a user
providing a custom hash function, and that using the opclass machinery
to derive the function to be used is better. If a user DOES want to
provide their own, they can always create a custom opclass with the
appropriate support function and specify that it should be used when
creating a hash-partitioned table, but most users will be happy for
the system to supply the appropriate function automatically.

- In Nagata-san's patch, convert_expr_for_hash() looks up a function
called "abs" and an operator called "%" by name, which is not a good
idea. We don't want to just find whatever is in the current search
path; we want to make sure we're using the system-defined operators
that we intend to be using. Amul's patch builds the constraint using
a hard-coded internal function OID, F_SATISFIES_HASH_PARTITION.
That's a lot more robust, and it's also likely to be faster because,
in Amul's patch, we only call one function at the SQL level
(satisfies_hash_partition), whereas in Nagata-san's patch, we'll end
up calling three (abs, %, =). Nagata-san's version of
get_qual_for_hash is implicated in this problem, too: it's looking up
the operator to use based on the operator name (=) rather than the
opclass properties. Note that the existing get_qual_for_list() and
get_qual_for_range() use opclass properties, as does Amul's patch.

- Nagata-san's patch only supports hash partitioning based on a single
column, and that column must be NOT NULL. Amul's patch does not have
these restrictions.

- Neither patch contains any documentation updates, which is bad.
Nagata-san's patch also contains no regression tests. Amul's patch
does, but they need to be rebased, since they no longer apply, and I
think some other improvements are possible as well. It's probably not
necessary to re-test things like whether temp and non-temp tables can
be mixed within a partitioning hierarchy, but there should be tests
that tuple routing actually works. The case where it fails because no
matching partition exists should be tested as well. Also, the tests
should validate not only that FOR VALUES isn't accept when creating a
hash partition (which they do) but also that WITH (...) isn't accepted
for a range or list partition (which they do not).

- When I try to do even something pretty trivial with Nagata-san's
patches, it crashes:

rhaas=# create table foo (a int, b text) partition by hash (a)
partitions 7 using hashint4;
CREATE TABLE
rhaas=# create table foo1 partition of foo;
<server crash>

The ruleutils.c support in Nagata-san's patch is broken. If you
execute the non-crashing statement from the above example and then run
pg_dump, it doesn't dump "partitions 7 using hashint4", which means
that the syntax in the dump is invalid.

- Neither patch does anything about the fact that constraint exclusion
won't work for hash partitioning. I mentioned this issue upthread in
the last paragraph of
http://postgr.es/m/CA+Tgmob7RsN5A=ehgYbLPx--c5CmptrK-dB=Y-v--o+TKyfteA@mail.gmail.com
and I think it's imperative that we fix it in some way before we think
about committing any of this. I think that needs to be done by
extending relation_excluded_by_constraints() to have some specific
smarts about hash partitioning, and maybe other kinds of partitioning
as well (because it could probably be made much faster for list and
range partitioning, too).

- Amul's patch should perhaps update tab completion support: create
table foo1 partition of foo <tab> completes with "for values", but now
"with" will be another option.

- Amul's patch probably needs to validate the WITH () clause more
thoroughly. I bet you get a not-very-great error message if you leave
out "modulus" and no error at all if you leave out "remainder".

This is not yet a detailed review - I may be missing things, and
review and commentary from others is welcome. If there is no major
disagreement with the idea of moving forward using Amul's patch as a
base, then I will do a more detailed review of that patch (or,
hopefully, an updated version that addresses the above comments).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-04-26 21:05:39 Re: Unportable implementation of background worker start
Previous Message Konstantin Knizhnik 2017-04-26 19:35:21 Re: Cached plans and statement generalization