Re: Add --include-table-data-where option to pg_dump, to export only a subset of table data

From: Carter Thaxton <carter(dot)thaxton(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Surafel Temesgen <surafel3000(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Stephen Frost <sfrost(at)snowman(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add --include-table-data-where option to pg_dump, to export only a subset of table data
Date: 2018-07-02 22:11:46
Message-ID: CAGiT_HMC-0O1KOOKRZk4bfooxdZJS1djHf8EJFAKxG4n0UL34g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The whole reason for the colon in the --where option is to indicate which
table the WHERE clause should refer to, so that one can dump less than all
of the rows.
The --table option is totally different. It specifies which tables to dump
at all.

If I provide a --where option, and no --table option, I want the WHERE
clause to apply to the given table, and otherwise dump all tables.
If one supplies a --table option, it won't dump all tables - it will only
dump the one specified. I don't want to have to specify all the tables
with --table, just to use the --where option.

Also, there may be some misunderstanding about "foo:bar" above. That's an
example of using a namespaced table, where "bar" is a table in the
namespace "foo". Normally, assuming your table is named "bar" in the
default namespace, you would just say something like:

pg_dump --where "bar:created_at >= 2018-05-01'"

On Mon, Jul 2, 2018 at 11:27 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jun 29, 2018 at 8:09 AM, Surafel Temesgen <surafel3000(at)gmail(dot)com>
> wrote:
> > hey,
> > i am reviewing this patch
> > On Thu, May 31, 2018 at 4:49 AM, Carter Thaxton <
> carter(dot)thaxton(at)gmail(dot)com>
> > wrote:
> >>
> >>
> >> pg_dump --where '"foo:bar":created_at >= '2018-05-01'" dbname
> >
> > it would be more sqlish if it specified like:
> > --table=foo --where ="bar created_at >= 2018-05-01"
> > and i don't like the idea of duplicating the existing --table behavior it
> > may confuse user
> > i rather recommend extending it. And when i test it with --table option
> the
> > content of dump
> > file depend on the option specified first.
>
> But you can specify multiple tables. You wouldn't want the same WHERE
> clause to apply to all of them.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nico Williams 2018-07-02 22:12:55 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Andrew Dunstan 2018-07-02 21:49:06 Re: [HACKERS] Small improvement to compactify_tuples