Re: Add --include-table-data-where option to pg_dump, to export only a subset of table data

From: Carter Thaxton <carter(dot)thaxton(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add --include-table-data-where option to pg_dump, to export only a subset of table data
Date: 2018-05-31 01:49:06
Message-ID: CAGiT_HP-mjeEFhYDPtbc0oW9cttk+-G3oOFjA0sVUj_MSbCMkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

After some consideration, I've created a new patch that addresses even more
of the various concerns.

Most notably, the command-line option is shortened to simply --where, which
is much easier to read and understand,
and matches the earlier proposal [1].

> bool
> -simple_oid_list_member(SimpleOidList *list, Oid val)
> +simple_oid_list_member2(SimpleOidList *list, Oid val, void **extra_data)
>
> I feel like that isn't in the spirit of Lisp "member". It's now a
> kind of association list.

My new patch has this function named simple_oid_list_find_data, to indicate
that it's working with some extra data,
and the corresponding append is called simple_oid_list_append_data.

> Another problem is that your spec does not show
> us how you would handle tables like Foo.Bar or "foo:bar" (colon have
> to be escaped)?

This is now addressed in the v4 patch. I've added a new function to
string_utils,
called findUnquotedChar, which is used to find the colon character in the
command-line argument,
which separates the table name from the filter clause.

For example, if you have a table called "foo:bar", then you would use the
--where option as follows:

pg_dump --where '"foo:bar":created_at >= '2018-05-01'" dbname

If you have a table with a double-quote in the name (gah!), as well as
colons, like: has"quote:and:colons
then this still works, because such quotes in the name are escaped by
doubling the quote char,
e.g. "has""quote:and:colons", and also works with this patch:

pg_dump --where '"has""quote:and:colons":created_at >= '2018-05-1'" dbname

[1] https://ci.appveyor.com/project/postgresql-cfbot/
postgresql/build/1.0.311

Attachment Content-Type Size
pgdump-include-table-data-where-v4.patch application/octet-stream 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-05-31 02:26:23 Re: I'd like to discuss scaleout at PGCon
Previous Message David Rowley 2018-05-31 01:34:19 Re: New GUC to sample log queries