Add --include-table-data-where option to pg_dump, to export only a subset of table data

From: Carter Thaxton <carter(dot)thaxton(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Add --include-table-data-where option to pg_dump, to export only a subset of table data
Date: 2018-05-20 23:48:44
Message-ID: CAGiT_HNav5B=OfCdfyFoqTa+oe5W1vG=PXkTETCxXg4kcUTktA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Many times I've wanted to export a subset of a database, using some sort of
row filter condition on some of the large tables. E.g. copying a
production database to a staging environment, but with some time series
data only from the past month.

We have the existing options:
--include-table=table (and its -t synonym)
--exclude-table=table
--exclude-table-data=table

I propose a new option:
--include-table-data-where=table:filter_clause

One would use this option as follows:

pg_dump --include-table-data-where=largetable:"created_at >=
'2018-05-01'" database_name

The filter_clause is used as the contents of a WHERE clause when querying
the data to generate the COPY statement produced by pg_dump.

I've prepared a proposed patch for this, which is attached. The code
changes are rather straightforward. I did have to add the ability to carry
around an extra pointer-sized object to the simple_list implementation, in
order to allow the filter clause to be associated to the matching oids of
the table pattern. It seemed the best way to augment the existing
simple_list implementation, but change as little as possible elsewhere in
the codebase. (Note that SimpleOidList is actually only used by pg_dump).

Feel free to review and propose any amendments.

Attachment Content-Type Size
pgdump-include-table-data-where-v1.patch application/octet-stream 12.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-05-21 00:30:05 Allowing printf("%m") only where it actually works
Previous Message Stephen Frost 2018-05-20 22:56:28 Re: Fix for FETCH FIRST syntax problems