Re: Declarative partitioning

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Declarative partitioning
Date: 2016-02-16 02:07:57
Message-ID: CADkLM=foVFUb6snTLSvAFZWBCGpT3q20fi9hbsFYyVoqFTcijw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Also, you won't see any optimizer and executor changes. Queries will still
> use the same plans as existing inheritance-based partitioned tables,
> although as I mentioned, constraint exclusion won't yet kick in. That will
> be fixed very shortly.
>
> And of course, comments on syntax are welcome as well.
>
> Thanks,
> Amit
>
>
>
Good to know the current limitations/expectations.

Our ETL has a great number of workers that do something like this:
1. grab a file
2. based on some metadata of that file, determine the partition that that
would receive ALL of the rows in that file. It's actually multiple tables,
all of which are partitioned, all of which fully expect the file data to
fit in exactly one partition.
3. \copy into a temp table
4. Transform the data and insert the relevant bits into each of the target
partitions derived in #2.

So while ATR is a *major* feature of true partitioning, it's not something
we'd actually need in our current production environment, but I can
certainly code it that way to benchmark ATR vs "know the destination
partition ahead of time" vs "insane layered range_partitioning trigger +
pg_partman trigger".

Currently we don't do anything like table swapping, but I've done that
enough in the past that I could probably concoct a test of that too, once
it's implemented.

As for the syntax, I'm not quite sure your patch addresses the concerned I
voiced earlier: specifically if the VALUES IN works for RANGE as well as
LIST, but I figured that would become clearer once I tried to actually use
it. Currently we have partitioning on C-collated text ranges (no, they
don't ship with postgres, I had to make a custom type) something like this:

part0: (,BIG_CLIENT)
part1: [BIG_CLIENT,BIG_CLIENT]
part2: (BIG_CLIENT,L)
part3: [L,MONSTROUSLY_BIG_CLIENT)
part4: [MONSTROUSLY_BIG_CLIENT,MONSTROUSLY_BIG_CLIENT]
part5: (MONSTROUSLY_BIG_CLIENT,RANDOM_CLIENT_LATE_IN_ALPHABET]
part6: (RANDOM_CLIENT_LATE_IN_ALPHABET,)

I can't implement that with a simple VALUES LESS THAN clause, unless I
happen to know 'x' in 'BIG_CLIENTx', where 'x' is the exact first character
in the collation sequence, which has to be something unprintable, and that
would make those who later read my code to say something unprintable. So
yeah, I'm hoping there's some way to cleanly represent such ranges.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2016-02-16 02:09:16 Re: Remove or weaken hints about "effective resolution of sleep delays is 10 ms"?
Previous Message Amit Langote 2016-02-16 01:39:27 Re: [PROPOSAL] VACUUM Progress Checker.