From: | Attila Soki <pgsql(at)attilasoki(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: WIP Patch: pg_dump structured |
Date: | 2023-03-23 15:34:05 |
Message-ID: | 2F0C38DB-A665-4D50-9CEF-1395993C7610@attilasoki.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12 Mar 2023, at 22:56, Attila Soki <pgsql(at)attilasoki(dot)com> wrote:
>> On 12 Mar 2023, at 21:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Won't this fail completely with SQL objects whose names aren't suitable
>> to be pathname components? "A/B" is a perfectly good name so far as
>> SQL is concerned. You could also have problems with collisions on
>> case-insensitive filesystems.
>
> You are right about the case-insensivity, this is not handled and will fail. I forgot
> to handle that. I trying to find a way to handle this.
Hi Tom,
Thank you for your feedback.
This is an updated version of the pg_dump structured wip patch (V2) with the
following changes:
- to avoid path collisions on case insensitive filessystems, all path components
created from user input are suffixed with the hex representation of a 32 bit
hash. “A/B” and “a/b” will get different suffixes.
- all path components are now filesystem safe
All this is a proposal, if you know a better solution please let me know.
This patch is a WIP (V2). The patch is against master and it compiles
successfully on macOS 13.2.1 aarch64 and on Debian 11 arm64.
To test, execute pg_dump --format=structured --file=/path/to/outputdir dbname
>
>>
>>> This format can be restored by feeding its plaintext toc file (restore-dump.sql)
>>> to psql. The output is also suitable for manipulating the files with standard
>>> editing tools.
>>
>> This seems a little contradictory: if you want to edit the individual
>> files, you'd have to also update restore-dump.sql, or else it's pointless.
>> It might make more sense to consider this as a write-only dump format
>> and not worry about whether it can be restored directly.
>
> The main motivation was to track changes with VCS at the file (object) level,
> editing small files was intended as a second possible use case.
> I did not know that a write-only format would go.
Declaring this format as a write-only dump would allow a more flexible directory
structure since we wouldn't have to maintain the restore order.
>
>>
>>> What do you think of this feature, any chance it will be added to pg_dump once
>>> the patch is ready?
>>
>> I'm not clear on how big the use-case is. It's not really obvious to
>> me that this'd have any benefit over the existing plain-text dump
>> capability. You can edit those files too, at least till the schema
>> gets too big for your editor. (But if you've got many many thousand
>> SQL objects, a file-per-SQL-object directory will also be no fun to
>> deal with.)
Here is a sample use case to demonstrate how this format could be used to track
schema changes with git. The main difference from using the existing plain-text
schema dump is, that this format makes it possible to keep a history of the
actual changes made to the individual objects. For example, to determine which
migrations have changed the foo function.
# import the schema into the repository
cd /path/to/my_app_code
pg_dump --format=structured --schema-only --file=foo_schema foodb
git add foo_schema --all
git commit foo_schema -m'initial commit foo_schema'
# make changes in the db
(my_app migrate foodb)
(psql foodb < tweak.sql)
# get a fresh dump
rm -rf foo_schema
pg_dump --format=structured --schema-only --file=foo_schema foodb
# now inspect the changes under foo_schema: there may be changed, new and
# missing files
git status foo_schema
# commit all schema changes
git add foo_schema -u
git commit foo_schema -m'changes from migration foodb'
# later, inspect changes
git log --stat
# show the history of one object
git log -p -- "foo_schema/path/to/FUNCTIONS/foo.sql"
Sure, the user base for this is narrow.
Thanks for any feedback.
—
Best regards
Attila Soki
Attachment | Content-Type | Size |
---|---|---|
v2-wip-pg_dump_structured.patch | application/octet-stream | 31.1 KB |
unknown_filename | text/plain | 3 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2023-03-23 15:41:52 | Re: HOT chain validation in verify_heapam() |
Previous Message | Peter Eisentraut | 2023-03-23 15:33:13 | Re: Schema variables - new implementation for Postgres 15 |