Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function

From: Morris de Oryx <morrisdeoryx(at)gmail(dot)com>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function
Date: 2023-06-05 11:58:20
Message-ID: CAKqncci1GoprLGtcSLk4ytPBNeQVW9Ab=CyJUdbiUkWy28WN8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks to Imseih and Sami at AWS for reporting this. The original case
comes from an upgrade I've been trying to complete for a couple of months
now, since RDS started supporting 15 with a 15.2 release.

The process has been slow and painful because, originally, there was a bug
on the RDS side that stopped any of the upgrade logs from appearing in RDS
or CloudWatch. Now, minimal log errors are shown, but not much detail.

I think that BEGIN ATOMIC is the sleeper feature of Postgres 14. It is
a *fantastic
*addition to the dependency-tracking system. However, it does not seem to
work.

I just found this thread now looking for the string warning: could not
resolve dependency loop among these items. I got that far by setting up a
new database, and a simple test case that reproduces the problem. I called
the test database ba for Begin Atomic:

------------------------------------
-- New database
------------------------------------
CREATE DATABASE ba;

------------------------------------
-- Connect
------------------------------------
-- Connect to new database...

------------------------------------
-- Setup schemas
------------------------------------
CREATE SCHEMA data; -- I don't have public running, so create a schema.

-- All of my UPSERT view and function plumbing is tucked away here:
CREATE SCHEMA types_plus;

------------------------------------
-- Define table
------------------------------------
DROP TABLE IF EXISTS data.test_event;

CREATE TABLE IF NOT EXISTS data.test_event (
id uuid NOT NULL DEFAULT NULL PRIMARY KEY,

ts_dts timestamp NOT NULL DEFAULT 'epoch',

who text NOT NULL DEFAULT NULL,
what text NOT NULL DEFAULT NULL
);

-- PK is created by default as test_event_pkey, used in ON CONFLICT later.

------------------------------------
-- Create view, get type for free
------------------------------------
CREATE VIEW types_plus.test_event_v1 AS

SELECT
id,
ts_dts,
who,
what

FROM data.test_event;

-- Create a function to accept an array of rows formatted as test_event_v1
for UPSERT into test_event.
DROP FUNCTION IF EXISTS types_plus.insert_test_event_v1
(types_plus.test_event_v1[]);

CREATE OR REPLACE FUNCTION types_plus.insert_test_event_v1 (data_in
types_plus.test_event_v1[])

RETURNS void

LANGUAGE SQL

BEGIN ATOMIC

INSERT INTO data.test_event (
id,
ts_dts,
who,
what)

SELECT
rows_in.id,
rows_in.ts_dts,
rows_in.who,
rows_in.what

FROM unnest(data_in) as rows_in

ON CONFLICT ON CONSTRAINT test_event_pkey DO UPDATE SET
id = EXCLUDED.id,
ts_dts = EXCLUDED.ts_dts,
who = EXCLUDED.who,
what = EXCLUDED.what;

END;

I've tested pg_dump with the plain, custom, directory, and tar options. All
report the same problem:

Note that I'm using Postgres and pg_dump 14.8 locally, RDS is still at 14.7
of Postgres and presumably 14.7 of pg_dump*. *

pg_dump: warning: could not resolve dependency loop among these items:
pg_dump: FUNCTION insert_test_event_v1 (ID 224 OID 1061258)
pg_dump: CONSTRAINT test_event_pkey (ID 3441 OID 1061253)
pg_dump: POST-DATA BOUNDARY (ID 3584)
pg_dump: TABLE DATA test_event (ID 3582 OID 1061246)
pg_dump: PRE-DATA BOUNDARY (ID 3583)

Hunting around earlier, I found a thread here from 2020 that mentioned
that BEGIN
ATOMIC was going to make dependency resolution tougher for pg_dump. Makes
sense, it can become circular or ambiguous in a hurry. However, in my case,
I don't see that the dependencies are any kind of crazy spaghetti. I have
hundreds of tables with the same pattern of dependencies for UPSERT work:

1. CREATE TABLE foo
2. CREATE PK foo_pk and other constraints.
3. CREATE VIEW foo_v1 (I could use CREATE TYPE here, for my purposes, but
prefer CREATE VIEW.)
4. CREATE FUNCTION insert_foo_v1 (foo_v1[])

>
The example I listed earlier is a simplified version of this. I didn't even
check that the new database works, that's not important....I am only trying
to check out pg_dump/pg_restore.

Can anyone suggest a path forward for me with the upgrade to PG 15? I'm
waiting on that as we need to use MERGE and I'd like other PG 15
improvements, like the sort optimizations. As far as I can see it, my best
bet is to

1. Delete all of my routines with BEGIN ATOMIC. That's roughly 250 routines.
2. Upgrade.
3. Add back in the routines in PG 15.

That likely would work for me as my dependencies are shallow and not
circular. They simply require a specific order. I avoid chaining views of
views and functions off functions as a deliberate practice in Postgres.

Down the track, does my sort of dependency problem seem resolvable by
pg_dump? I've got my own build-the-system-from-scratch system that use for
local testing out of the source files, and I had to resort to hinting files
to inject some things in the correct order. So, I'm not assuming that it
*is* possible for pg_dump to resolve all sequences. Then again, all of this
could go away if DDL dependency checking were deferrable. But, I'm just a
Postgres user, not a C-coder.

Thanks for looking at this bug, thanks again for the AWS staff for posting
it, and thanks for any suggestions on my day-to-day problem of upgrading.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Morris de Oryx 2023-06-05 12:13:16 Re: [BUG] pg_dump does not properly deal with BEGIN ATOMIC function
Previous Message 蔡梦娟 (玊于) 2023-06-05 11:44:05 Fix missing initialization of delayChkptEnd