Re: pg_dump is broken for partition tablespaces

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_dump is broken for partition tablespaces
Date: 2019-04-17 21:39:01
Message-ID: 20190417213901.GA6555@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2019-Apr-17, David Rowley wrote:

> On Mon, 15 Apr 2019 at 15:26, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> >
> > On 2019-Apr-15, David Rowley wrote:
> >
> > > To be honest, if I'd done a better job of thinking through the
> > > implications of this tablespace inheritance in ca4103025d, then I'd
> > > probably have not bothered submitting a patch for it. We could easily
> > > revert that, but we'd still be left with the same behaviour in
> > > partitioned indexes, which is in PG11.
> >
> > Well, I suppose if we do decide to revert it for tables, we should do it
> > for both tables and indexes. But as I said, I'm not yet convinced that
> > this is the best way forward.
> Ok. Any ideas or suggestions on how we move on from here? It seems
> like a bit of a stalemate.

Well, here's my proposed patch. I'm now fairly happy with how this
looks now, concerning partitioned tables.

This is mostly what was already discussed:

1. pg_dump now uses regular CREATE TABLE followed by ALTER TABLE / ATTACH
PARTITION when creating partitions, rather than CREATE TABLE
PARTITION OF. pg_dump --binary-upgrade was already doing that, so
this part mostly removes some code. In order to make the partitions
reach the correct tablespace, the "default_tablespace" GUC is used.
No TABLESPACE clause is added to the dump. This is David's patch
near the start of the thread.

2. When creating a partition using the CREATE TABLE PARTITION OF syntax,
the TABLESPACE clause has highest precedence; if that is not given,
the partitioned table's tablespace is used; if that is set to 0 (the
default), default_tablespace is used; if that's set to empty or a
nonexistant tablespace, the database's default tablespace is used.
This is (I think) what Andres proposed in

3. Partitioned relations can have the database tablespace in
pg_class.reltablespace, as opposed to storage-bearing relations which
cannot. This is useful to be able to put partitions in the database
tablespace even if the default_tablespace is set to something else.

4. For partitioned tables, ALTER TABLE .. SET TABLESPACE DEFAULT is
available as suggested by David, which makes future partition
creations target default_tablespace or the database's tablespace.

5. Recreating indexes during table-rewriting ALTER TABLE resulted in
broken indexes. We already had some adhesive tape in place to make
that work for regular indexes (commit bd673e8e864a); my approach to
fix it for partitioned indexes is to temporarily reset
default_tablespace to empty.

As for Tom's question in :

> It's possible that Alvaro's patch is also broken, but I haven't had time
> to review it. The immediate question is what happens when somebody makes
> a partitioned table in template1 and then does CREATE DATABASE with a
> tablespace option. Does the partitioned table end up in the same
> tablespace as ordinary tables do?

Note that partitioned don't have any files, so they don't end up
anywhere; when a partition is created, the target tablespace is
determined using four rules instead of three (see #2 above) so yes, they
do end up in the same places as ordinary tables. Note that even if you
do put a partitioned table in some tablespace, you will not later run
afoul of this check:
errmsg("cannot assign new default tablespace \"%s\"",
errdetail("There is a conflict because database \"%s\" already has some tables in this tablespace.",

because that check uses ReadDir() and raise an error if any entry is
found; but partitioned tables don't have files in directory, so nothing
happens. (Of course, it will hit if you have a partition in that
tablespace, but that's also the case for regular tables.)

(I propose to commit both 0002 and 0003 as a single unit that fixes the
whole problem, rather than attacking backend and pg_dump separately.
0001 appears logically separate and I would push on its own.)

Álvaro Herrera
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v3-0001-psql-display-tablespace-for-partitioned-indexes.patch text/x-diff 2.6 KB
v3-0002-Make-pg_dump-emit-ATTACH-PARTITION-instead-of-PAR.patch text/x-diff 9.8 KB
v3-0003-fix-behavior-to-consider-default_tablespace.patch text/x-diff 44.6 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-04-17 21:46:09 Re: pgsql: Fix unportable code in pgbench.
Previous Message Tom Lane 2019-04-17 21:30:36 pgsql: Fix unportable code in pgbench.