Re: Dump/Restore of non-default PKs

From: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Dump/Restore of non-default PKs
Date: 2022-04-19 16:13:52
Message-ID: CANbhV-FWuUEkKoxi+9jnCcQPMLzTP-QTqWixnf=UFxabESK=Tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 18 Apr 2022 at 22:05, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com> wrote:
>
> On Mon, 18 Apr 2022 at 21:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > > On Mon, Apr 18, 2022 at 1:00 PM Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
> > > wrote:
> > >> I propose that we change pg_dump so that when it creates a PK it does
> > >> so in 2 commands:
> > >> 1. CREATE [UNIQUE] INDEX iname ...
> > >> 2. ALTER TABLE .. ADD PRIMARY KEY USING INDEX iname;
> >
> > > Why not just get rid of the limitation that constraint definitions don't
> > > support non-default methods?
> >
> > That approach would be doubling down on the assumption that we can always
> > shoehorn more custom options into SQL-standard constraint clauses, and
> > we'll never fall foul of shift/reduce problems or future spec additions.
> > I think for example that USING INDEX TABLESPACE is a blot on humanity,
> > and I'd be very glad to see pg_dump stop using it in favor of doing
> > things as Simon suggests.
>
> Sigh, agreed. It's more work, but its cleaner in the longer term to
> separate indexes from constraints.
>
> I'll look in more detail and come back here later.
>
> Thanks both.

My original plan was to get pg_dump to generate

--
-- Name: foo foo_a_idx; Type: CONSTRAINT; Schema: public; Owner: postgres
--
CREATE UNIQUE INDEX foo_a_idx ON public.foo USING btree (a);
ALTER TABLE ONLY public.foo
ADD CONSTRAINT foo_a_idx PRIMARY KEY USING INDEX foo_a_idx;

so the index definition is generated as a CONSTRAINT, not an INDEX.

Separating things a bit more generates this output, which is what I
think we want:

--
-- Name: foo foo_a_idx; Type: CONSTRAINT; Schema: public; Owner: postgres
--
ALTER TABLE ONLY public.foo
ADD CONSTRAINT foo_a_idx PRIMARY KEY USING INDEX foo_a_idx;
--
-- Name: foo_a_idx; Type: INDEX; Schema: public; Owner: postgres
--
CREATE UNIQUE INDEX foo_a_idx ON public.foo USING btree (a);

Which is better, but there is still some ugly code for REPLICA
IDENTITY and CLUSTER duplicated in dumpIndex() and dumpConstraint().

The attached patch includes a change to pg_dump_sort.c which changes
the priority of CONSTRAINT, but that doesn't seem to have any effect
on the output. I'm hoping that's a quick fix, but I haven't seen it
yet, even after losing sanity points trying to read the priority code.

Anyway, the main question is how should the code be structured?

--
Simon Riggs http://www.EnterpriseDB.com/

Attachment Content-Type Size
pg_dump_constraint_using_index.v1.patch application/octet-stream 9.6 KB
pg_dump_test_setup.sql application/octet-stream 241 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Wong 2022-04-19 16:28:56 Re: GSoC: Database Load Stress Benchmark (2022)
Previous Message Robert Haas 2022-04-19 16:13:51 Re: Add --{no-,}bypassrls flags to createuser